Speech Communication 33 (2001) 1–4
In the last 20 years, there has been a pressing need to develop speech and language corpora as training
and testing material for a wide range of speech technology applications. This has been coupled with a
growing interest in the speech community in developing models of spoken language that are based on corpora
that are increasingly representative of natural, spontaneous speech.
The growth in the use of speech corpora has benefited in the last 10 years from the establishment of data
centres, such as the Linguistic Data Consortium (LDC), the European Language Resources Association
(ELRA), the Japanese Language Resource Consortium (GSK: Gengo Shigen Kyouyuukikou), and multisite
annotation initiatives, such as the ToBI system for prosodic annotation and the DAMSL system of
discourse annotation. Today hundreds of annotated speech corpora exist and are used worldwide, and the
demand for richly annotated corpora is growing.
