INTERSPEECH 2010 Tutorial Program

Speech and Language Technology for Linguists and other Human Scientists

  • Daniel Hirst (Universite de Provence)


Four years ago, the author of this proposal greeted the publication of Coleman 2005 with these words:
It is unfortunate that there is still today an enormous gap between the community of linguists and phoneticians on the one hand and that of engineers and computer scientists on the other. Each community needs the other and, in an ideal world, linguists would provide theoretical frameworks and data which are useful to engineers, while engineers would provide tools which are useful to linguists. The exchange between the two communities, however, is in practice very slow.
Today the gap is still as wide as ever, but more and more researchers on both sides of the fence feel the need for direct interaction between the two communities. For human scientists, learning to communicate with engineers and computer scientists may appear a daunting task. The purpose of this tutorial is to offer human scientists a guided tour of selected areas of Speech and Language Technology in which it is possible to gain a working knowledge without necessarily following all the technical details. The intended audience is linguists and phoneticians who wish to acquire a working knowledge of speech and language technology for their research. Participants will be introduced to a number of freely available tools for the manipulation of both spoken and written utterances, allowing them in particular to test different models of rhythm and melody. Particular emphasis will be placed on reaching a level of competence that allows researchers to use the technology directly, without outside assistance.


Daniel Hirst is a linguist and phonetician who has been working in the field of prosody and phonology for nearly forty years. He is at present Directeur de Recherches at the CNRS laboratory Parole et Langage at the University of Provence, Aix-en-Provence, where he co-directs a research team devoted to linguistic models, annotation and interfaces. He is the author of a study of English intonation with a purely functional representation and edited a major survey of the intonation of the world's languages, "Intonation Systems: A Survey of Twenty Languages" (Cambridge University Press, 1998), to which he contributed the chapter on British English as well as an 80-page introduction in which he proposed a new international transcription system for intonation (INTSINT).
He is the founder and current President of the ISCA Special Interest Group on Speech Prosody (SProSIG), which organises the International Speech Prosody conferences (Aix-en-Provence 2002; Nara 2004; Dresden 2006; Campinas 2008; Chicago 2010).
He has developed software for the automatic analysis of speech prosody, in particular:
  • Momel - an algorithm for the automatic factoring of fundamental frequency contours into two components: a macromelodic component and a micromelodic component.
  • INTSINT - a prosodic equivalent of the International Phonetic Alphabet. Originally designed as a descriptive tool for linguistic annotation, INTSINT has since been implemented as an algorithm that converts the output of the Momel algorithm into a sequence of discrete tonal symbols, which can then be used as input to synthesise a fundamental frequency contour.
  • ProZed - This tool makes it possible to test prosodic models of rhythm and melody using an analysis by synthesis paradigm to derive a synthetic output from an abstract representation of the prosody.
The algorithms have recently been implemented as a plugin to the Praat speech analysis software.
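To give a flavour of the kind of annotation INTSINT produces, the following is a deliberately simplified sketch (not the published INTSINT algorithm, whose symbol inventory and optimisation of the speaker's key and range are more elaborate): it maps a sequence of pitch targets, such as those produced by Momel, onto discrete tonal symbols relative to an assumed pitch range. The parameter names (`key_hz`, `span_octaves`) and the thresholds are illustrative assumptions, not part of the original software.

```python
def encode_targets(targets_hz, key_hz=150.0, span_octaves=1.0):
    """Map pitch targets (Hz) to INTSINT-like tonal symbols.

    Simplified illustration only: 'T'op and 'B'ottom for targets near
    the edges of an assumed pitch range, 'M'id for the first target,
    then 'H'igher / 'L'ower / 'S'ame relative to the preceding target.
    The real INTSINT algorithm estimates key and range from the data.
    """
    # Assumed pitch range: span_octaves centred (geometrically) on key_hz.
    top = key_hz * 2 ** (span_octaves / 2)
    bottom = key_hz / 2 ** (span_octaves / 2)
    symbols = []
    prev = None
    for f0 in targets_hz:
        if f0 >= top * 0.95:            # within 5% of the top of the range
            sym = "T"
        elif f0 <= bottom * 1.05:       # within 5% of the bottom
            sym = "B"
        elif prev is None:              # first target: anchor at Mid
            sym = "M"
        else:
            ratio = f0 / prev           # interval relative to previous target
            if ratio > 1.05:
                sym = "H"
            elif ratio < 0.95:
                sym = "L"
            else:
                sym = "S"
        symbols.append(sym)
        prev = f0
    return symbols


# A rise-fall contour encoded as discrete symbols:
print(encode_targets([150, 180, 170, 120]))  # → ['M', 'H', 'L', 'L']
```

A symbolic string of this kind can then serve as input for resynthesis: given the same key and range, each symbol is decoded back into a target frequency and a smooth contour is interpolated through the targets.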

This page was last updated on 21-June-2010 3:00 UTC.