INTERSPEECH 2010 Tutorial Program

Kernel Engineering for Fast and Easy Design of Natural Language Applications

  • Alessandro Moschitti (University of Trento)


In recent years, a large part of Information Technology research has addressed the use of machine learning for automatic system design. Such research has shown that, although the choice of the learning algorithm affects system accuracy, feature engineering impacts it even more critically. Feature design is also considered the most difficult step, as it requires expertise, intuition and deep knowledge of the target problem to obtain suitable attribute-value representations. For example, how can syntactic and semantic relationships among words in an utterance be described so as to effectively characterize its concepts?

Kernel Methods (KM) are powerful techniques that can simplify data modeling by defining abstract representations and implicit feature spaces. More specifically, KM allow for: (a) directly using a similarity function between instances in learning algorithms, thus avoiding explicit feature design; and (b) implicitly defining huge feature spaces, e.g. structures can be represented in the space of their substructures. KM effectiveness has been shown in many fields, e.g. bioinformatics, speech processing, image processing, computational linguistics and data mining.
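As an illustration of point (b) above, a minimal sketch (not from the tutorial materials; function names are illustrative): a word n-gram kernel compares two sentences in the implicit space of all possible word n-grams, yet only the n-grams actually present in the inputs are ever touched.

```python
from collections import Counter

def ngrams(sentence, n=2):
    """Multiset of word n-grams occurring in a whitespace-tokenized sentence."""
    words = sentence.split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def ngram_kernel(s1, s2, n=2):
    """Dot product in the implicit space of all word n-grams:
    the full space is never enumerated, only shared n-grams are counted."""
    g1, g2 = ngrams(s1, n), ngrams(s2, n)
    return sum(count * g2[g] for g, count in g1.items())
```

For instance, `ngram_kernel("the dog barks", "the dog sleeps")` counts the single shared bigram ("the", "dog"), without representing either sentence as a vector over the whole bigram vocabulary.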

However, the kernel designer needs practical procedures for interpreting and effectively using the above KM properties. For example, she/he needs to know when a similarity function is a valid kernel and how to modify such a similarity to make it exploitable by Support Vector Machines (SVMs). Regarding point (b), string and tree kernels are well-known approaches to representing structural properties but, without a suitable tune-up as well as an appropriate input structure definition, they may prove ineffective. Moreover, several versions of string and tree kernels exist, thus it is very important to understand the differences among them from theoretical and practical viewpoints. Finally, without a good knowledge of kernel engineering, defining an effective kernel function may prove rather difficult.
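One practical check alluded to above, i.e. whether a similarity function is a valid (Mercer) kernel, amounts to verifying that the Gram matrices it produces are symmetric and positive semidefinite. A minimal sketch, assuming NumPy is available (the function names are illustrative, not part of any toolkit mentioned here):

```python
import numpy as np

def gram_matrix(kernel, xs):
    """Pairwise kernel evaluations: K[i, j] = kernel(xs[i], xs[j])."""
    return np.array([[kernel(a, b) for b in xs] for a in xs])

def looks_like_valid_kernel(kernel, xs, tol=1e-9):
    """Necessary condition for Mercer validity on this sample:
    the Gram matrix must be symmetric with no eigenvalue below -tol."""
    K = gram_matrix(kernel, xs)
    if not np.allclose(K, K.T):
        return False
    return bool(np.linalg.eigvalsh(K).min() >= -tol)
```

Passing this check on a finite sample is necessary but not sufficient for validity on the whole input space; it is nonetheless a useful first diagnostic before plugging a custom similarity into an SVM.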

This tutorial will explain practical recipes for successfully using KM in language applications: first, after an introduction to Support Vector Machines (presented from an application viewpoint), it will explain KM theory with the aim of deriving practical guidelines from it.

Second, it will present basic kernels, such as linear, polynomial, sequence and tree kernels, focusing on the implementation, accuracy and efficiency perspectives. Their application to typical NL tasks, e.g. text categorization and question/answer classification, will be shown. The aim is to provide practical procedures for selecting and exploiting the right kernel for the target task.
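By way of illustration, one of the classic tree kernels in this family, the subset-tree kernel of Collins and Duffy, can be sketched in a few lines over parse trees encoded as nested tuples (the encoding and the decay parameter lam are assumptions of this sketch, not the tutorial's notation):

```python
def production(node):
    """Grammar production at a node; nodes are (label, child, ...), leaves are strings."""
    label, *children = node
    return (label, tuple(c if isinstance(c, str) else c[0] for c in children))

def is_preterminal(node):
    return all(isinstance(c, str) for c in node[1:])

def nodes(tree):
    yield tree
    for c in tree[1:]:
        if not isinstance(c, str):
            yield from nodes(c)

def delta(n1, n2, lam):
    """Number of common tree fragments rooted at n1 and n2 (decayed by lam)."""
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1):
        return lam
    prod = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        prod *= 1.0 + delta(c1, c2, lam)
    return prod

def tree_kernel(t1, t2, lam=0.4):
    """Sum delta over all node pairs: counts shared subset trees implicitly."""
    return sum(delta(n1, n2, lam) for n1 in nodes(t1) for n2 in nodes(t2))
```

With `lam=1.0`, the kernel of the tree `('NP', ('D', 'the'), ('N', 'dog'))` with itself is 6, matching the six grammar-respecting fragments of that tree, computed without ever enumerating the fragment space.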

Third, it will introduce the SVM-Light-TK toolkit (available at http://disi.unitn.it/moschitti/Tree-Kernel.htm), which encodes several kernels in SVMs, along with the associated data structures and its practical use in NL tasks.

Finally, it will illustrate how innovative and effective kernels can be engineered starting from basic kernels and applying systematic data transformations. Such know-how allows for fast and accurate application design even when the underlying language phenomena and properties are not yet well understood. To make the explanation of such techniques clear, practical applications to re-ranking the output of traditional SLU systems, e.g. concept classification, will be presented.
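The engineering principle behind building new kernels from basic ones rests on closure properties: sums and products of valid kernels, as well as positive scalings, are again valid kernels. A hedged sketch (the weight and helper names are illustrative) of mixing a flat feature-vector kernel with a structural one, as is typical when re-ranking structured hypotheses:

```python
def linear_kernel(x, y):
    """Plain dot product over explicit feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(k_flat, k_struct, alpha=0.5):
    """Each instance is a pair (feature_vector, structure). The combination
    alpha * k_flat + (1 - alpha) * k_struct is valid by closure of kernels
    under positive scaling and addition."""
    def k(x, y):
        return alpha * k_flat(x[0], y[0]) + (1 - alpha) * k_struct(x[1], y[1])
    return k
```

Any valid structural kernel (e.g. a string or tree kernel) can play the role of `k_struct`; tuning `alpha` then balances flat and structural evidence.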

This tutorial is potentially appealing to researchers working in any Interspeech field and requires only a basic machine learning background.


Alessandro Moschitti's expertise concerns machine learning approaches to NLP, Information Retrieval and Data Mining, with NLP as the main application domain of his research work. In particular, he has designed applications of supervised and unsupervised learning for Text Categorization, Named Entity Recognition, Co-Reference Resolution, Text Summarization, Textual Entailment Recognition, Question Answering, Semantic Role Labeling, Relation Extraction and Spoken Dialog Systems. Since 2002 he has been modeling NLP applications in the above areas by means of KM and kernel-based machines (e.g. SVMs and perceptrons). His contributions to theory and applications are documented by more than 70 papers published in the major international conferences of Computational Linguistics, Machine Learning, Information Retrieval, Data Mining and Speech Processing, e.g. ACL, ICML, ECML, CIKM, ECIR, ICDM and Interspeech, respectively.

This page was last updated on 21-June-2010 3:00 UTC.