Segmenting and labeling continuous speech

xujiajin

Administrator
Staff member
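Speech Segmentation and Labeling of a Continuous Speech Corpus
Chen Xiaoxia (陈肖霞), Institute of Linguistics, Chinese Academy of Social Sciences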

语言文字应用 (Applied Linguistics), 2000, No. 2 (No. 34 overall)

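Abstract (translated from the Chinese): Segmenting and labeling a continuous speech corpus is a new task, and it plays an important role in making full use of the corpus. How to do this work well is a question worth exploring. By segmenting and labeling one corpus, this paper arrives at some preliminary views and observations, which are shared here for discussion with colleagues so that the work can be done better.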
A segmentation and labeling work based on continuous speech database
Chen Xiaoxia
Abstract: Segmentation and labeling of a continuous speech database is important for making better use of the database. The question of how to improve segmentation and labeling calls for further discussion. This paper presents the labeling and segmentation work we have done on Standard Chinese. Based on the database, we have arrived at some labeling rules and segmentation units. We hope to receive suggestions and to take the work further.

http://forum.corpus4u.org/upload/forum/2005072820043991.pdf
 
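Chen Xiaoxia's study bears on a number of phenomena in spoken language, such as the relationship between prosodic units and grammatical units. One fairly direct application is the development of speech synthesis and speech recognition technology. These technologies are used in many areas of daily life, even if we rarely notice them. For example, the read-aloud features in 金山词霸 (Kingsoft PowerWord), Adobe Acrobat, and Encarta Encyclopedia rely on TTS (text-to-speech) synthesis. The automated operator we hear when making a call with a phone card, for instance when checking the balance, also uses speech synthesis.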
To name just a few.
 