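Phonetic Segmentation and Labeling of a Continuous Speech Corpus
Chen Xiaoxia (陈肖霞), Institute of Linguistics, Chinese Academy of Social Sciences
语言文字应用 (Applied Linguistics), 2000, No. 2 (cumulative issue No. 34)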
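Summary: Segmenting and labeling a continuous speech corpus is a new task. It plays an important role in making full use of the corpus, and how to do it well is a question worth exploring. Drawing on the segmentation and labeling of one corpus, this paper arrives at some preliminary views and findings, which are offered here for discussion with colleagues so that the work can be improved.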
A segmentation and labeling work based on continuous speech database
Chen Xiaoxia
Abstract: Segmentation and labeling of a continuous speech database are important for making better use of the database. The question of how to improve segmentation and labeling calls for further discussion. This paper presents the labeling and segmentation work we have done on Standard Chinese. Based on the database, we have arrived at some labeling rules and segmentation units. We hope to receive suggestions and to take the work further.
http://forum.corpus4u.org/upload/forum/2005072820043991.pdf