Evolution and present situation of corpus research in China

laohong · 2006-11-06

Zhiwei Feng

Institute of Applied Linguistics, China

International Journal of Corpus Linguistics 11:2 (2006), 173–207.
ISSN 1384–6655 / E-ISSN 1569–9811 © John Benjamins Publishing Company.

Abstract:

In this paper, the author introduces in detail the development and present situation of corpus linguistics in China: earlier corpora, large-scale authentic text corpora, national corpora, speech corpora, bilingual corpora and corpora of minority languages in China. The various processing techniques for corpora are also introduced: automatic word segmentation of Chinese text, automatic PoS tagging, automatic tagging of phrase structure and automatic alignment of bilingual corpora. This paper gives a bird's-eye view of corpus linguistics in China. Finally, the author discusses several problems in present corpus research: standardization of corpus specifications, sharing of language resources, knowledge properties, etc.

Keywords:

corpus; large-scale & authentic text; speech corpora; bilingual corpora; corpora of minority languages in China; automatic word segmentation; automatic PoS tagging; automatic tagging of phrase structure; automatic alignment of bilingual corpora.

xujiajin · 2006-11-07

Re: Evolution and present situation of corpus research in China

Prof. Feng is not very familiar with the research done in foreign-language circles in China, so his review focuses more on work by computational linguists working with Chinese data.
