INTERSPEECH 2010 Tutorial Program

Medical Speech Processing - Pathologies, Treatment Assistance, Clinical Trials

  • Elmar Nöth (Erlangen University)
  • Tobias Bocklet (Erlangen University)


Voice and language are the very foundation of human communication; being able to articulate our intentions clearly is a vital part of being human. However, many diseases and congenital defects can lead to voice, speech and language disorders that impair daily life. These can occur at any age, from early childhood to old age.

It is important to distinguish between voice, speech and language disorders. A voice disorder is linked to primary signal production (e.g. hoarseness, vocal fold paralysis), whereas speech disorders arise in the more complex process of sound production and modulation (e.g. nasality, problems articulating plosives). Finally, language disorders are linked to language development (e.g. vocabulary size, grammaticality) and planning (e.g. stuttering).

In this tutorial, we give an introduction to a selection of the most common voice, speech and language disorders and their medical pathologies. We show how to cooperate with medical doctors and which technical tools can be integrated into the clinical workflow of speech therapists. With the help of data acquisition tools and therapists, means for automatic assessment can be investigated and validated in clinical trials. These automatic measures can support therapy by providing an objective, quantitative measure of treatment success.

This tutorial is intended for speech engineers who are interested in medical speech processing and want to learn about the medical background, how to cooperate with medical doctors, and how to conduct clinical trials, including data acquisition, statistical analysis and privacy issues. We also show results of clinical trials on various pathologies and give a hands-on introduction to the client-server tool that was used for the whole process.


Elmar Nöth obtained his diploma degree in computer science and his doctoral degree at the University of Erlangen-Nuremberg in 1985 and 1990, respectively. From 1985 to 1990 he was a member of the research staff of the Lehrstuhl für Informatik 5 (Mustererkennung), working on the use of prosodic information in automatic speech understanding. Since 1990, he has been an assistant professor at the same institute and head of the speech group.
In 2008, he became a tenured full professor at the Lehrstuhl für Informatik 5, focusing on medical applications of speech technologies and strengthening the cooperation with the medical department of the University of Erlangen-Nuremberg. He is one of the founders of the Sympalog company, which markets conversational dialog systems.

Tobias Bocklet received his diploma degree in computer science in 2007 at the University of Erlangen-Nuremberg. Together with his adviser Elmar Nöth, he is working towards his doctoral degree on medical applications of speech and speaker recognition, focusing on children's voice, speech and language development and pathologies. Since 2008, he has collaborated with the speech group of SRI International, contributing to their NIST speaker ID evaluation system.

This page was last updated on 21-June-2010 3:00 UTC.