Cambridge International Corpus

xujiajin

http://www.cambridge.org/elt/corpus/cic.htm

The Cambridge International Corpus (CIC) is a very large collection of English texts, stored in a computerised database, which can be searched to see how English is used. It has been built up by Cambridge University Press over the last ten years to help in writing books for learners of English. The English in the CIC comes from newspapers, best-selling novels, non-fiction books on a wide range of topics, websites, magazines, junk mail, TV and radio programmes, recordings of people's everyday conversations and many other sources.
 
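It currently stands at 700 million words and grows every year, so it is presumably a monitor corpus.
It also includes learner corpora:
19 million words of learners' written English (the Cambridge Learner Corpus)
8 million words of error-coded learner written English
Unfortunately, it is not publicly available.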
 