Some free online Chinese corpora

xiaoz

Permanent super administrator
Staff member
Academia Sinica Balanced Corpus of Modern Chinese
http://www.sinica.edu.tw/SinicaCorpus/

Peking University Modern Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/xiandaihanyu/

Xiamen University corpora (registration required but free)
http://xmuoec.com/gb/hanyu/hanyu/data/corpus/index.htm

Beijing Language and Culture University corpus
http://202.112.195.8

Lancaster Corpus of Mandarin Chinese
http://bowland-files.lancs.ac.uk/corplang/cgi-bin/conc.pl

Leeds Chinese corpus
http://corpus.leeds.ac.uk/query-zh.html

PFR People's Daily corpus (01/1998)
http://bowland-files.lancs.ac.uk/corplang/pdcorpus/pdcorpus.htm

PH corpus (Xinhua newswire data 1990-1991)
http://bowland-files.lancs.ac.uk/corplang/phcorpus/phcorpus.htm

People's Daily 2000 corpus
http://bowland-files.lancs.ac.uk/corplang/pdc2000/default.htm

Peking University Ancient Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/jsearch/index.jsp?dir=gudai

Sinica corpus of early Chinese
http://www.sinica.edu.tw/Early_Mandarin/

Sheffield Corpus of Chinese for Diachronic Linguistic Study
http://www.shef.ac.uk/scc/
 
This is the most complete list of (freely available) Chinese corpora I have ever seen. Thanks a lot, Richard.
 
Re: Some free online Chinese corpora

Quoting 清风出袖 (2005-6-27 19:42:51):
Great! Yet we are really in need of a powerful concordancer for Chinese corpora, aren't we?


Have you tried Concordance by Rob Watt for the Windows PC platform? Is that powerful enough?
 
That demo concordancer is very user-friendly, but it doesn't support Chinese, and it is just a demo version, right? Any good ideas for getting a fully functional, Chinese-compatible concordancer?
 
Re: Some free online Chinese corpora

Quoting xujiajin (2005-7-1 9:41:43):
That demo concordancer is very user-friendly, but it doesn't support Chinese, and it is just a demo version, right? Any good ideas for getting a fully functional, Chinese-compatible concordancer?

The demo version IS the full version with the only restriction being that it is limited to one month's free use.

It does support Chinese and many other (Asian) languages, since it is Unicode-based. Marjorie Chan has a user guide for working with Chinese texts in Concordance, which I believe was linked to somewhere on this site.
 
That post is pasted here for your reference:

Concordancers and Concordances: Tools for Chinese Language Teaching and Research

Marjorie K.M. Chan

In Journal of the Chinese Language Teachers Association, Volume 37:2. May 2002.

This paper presents an introduction to concordancers, and to the concordancing of Chinese e-texts in particular. Demonstrations are given of searches using spaced and non-spaced source e-texts, with the concordance results presented in Keyword-in-Context (KWIC) display format. There are illustrations to accompany discussions of full-text concordances, and of concordances targeting specific words or phrases. The writer suggests how concordancers might be used in language-teaching and in conducting research on various linguistic phenomena of the Chinese language. An appendix compares several concordancing programs capable of handling Chinese e-texts.

She is Chinese. Her Chinese name is 陈洁雯. Her home page:
http://people.cohums.ohio-state.edu/chan9/
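For anyone who has not seen the Keyword-in-Context (KWIC) display mentioned in the abstract, here is a minimal Python sketch of the idea for non-spaced Chinese text. It is only an illustration, not how Concordance or any program in Chan's appendix actually works; the function name `kwic`, the `width` parameter and the sample sentence are all made up for this example.

```python
def kwic(text, keyword, width=5):
    """Return (left, keyword, right) tuples for every occurrence of keyword.

    The context window is counted in characters, so this works on
    non-spaced Chinese text without any word segmentation.
    """
    if not keyword:
        raise ValueError("keyword must not be empty")
    hits = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        left = text[max(0, start - width):start]   # up to `width` chars before
        right = text[end:end + width]              # up to `width` chars after
        hits.append((left, keyword, right))
        start = text.find(keyword, start + 1)      # continue after this hit
    return hits


# Hypothetical sample sentence, for illustration only.
sample = "我喜欢学习汉语,他也喜欢学习语言学。"
for left, kw, right in kwic(sample, "学习", width=3):
    print(f"{left} [{kw}] {right}")
```

For spaced (word-segmented) text, one would presumably split on whitespace and take the window over tokens rather than characters; the character window above is just the simplest case.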
 
Re: Some free online Chinese corpora

What kind of data are included in the online system?
 
Re: Some free online Chinese corpora

Academia Sinica Balanced Corpus of Modern Chinese
http://www.sinica.edu.tw/SinicaCorpus/

Peking University Modern Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/xiandaihanyu/

Xiamen University corpora (registration required but free)
http://www.luweixmu.cn/home/html/Corpora/

Beijing Language and Culture University corpus
http://202.112.195.8:8089/ccir_login?input=*

Lancaster Corpus of Mandarin Chinese
http://bowland-files.lancs.ac.uk/corplang/cgi-bin/conc.pl

Leeds Chinese corpus
http://corpus.leeds.ac.uk/query-zh.html

PFR People's Daily corpus (01/1998)
http://bowland-files.lancs.ac.uk/corplang/pdcorpus/pdcorpus.htm

PH corpus (Xinhua newswire data 1990-1991)
http://bowland-files.lancs.ac.uk/corplang/phcorpus/phcorpus.htm

People's Daily 2000 corpus
http://bowland-files.lancs.ac.uk/corplang/pdc2000/default.htm

Peking University Ancient Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/jsearch/index.jsp?dir=gudai

Sinica corpus of early Chinese
http://www.sinica.edu.tw/Early_Mandarin/

Sheffield Corpus of Chinese for Diachronic Linguistic Study
http://www.shef.ac.uk/scc/
 
Re: Some free online Chinese corpora

Thank you so much! I urgently need them for my thesis proposal report!
 