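A parallel corpus usually consists of corresponding texts in two or more languages, most often source texts paired with their translations. Example: Babel English-Chinese Parallel Corpus (http://bowland-files.lancs.ac.uk/corplang/babel/babel.htm). Parallel corpora are commonly used in contrastive and translation studies.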
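To make the idea of "corresponding texts" concrete, here is a minimal sketch of one way a sentence-aligned parallel corpus could be held in memory and searched. The `AlignedPair` class, the `search` helper, and the three toy sentence pairs are all hypothetical illustrations; they are not taken from the Babel corpus or from any tool distributed with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One aligned unit: a source sentence and its translation."""
    source: str  # English side
    target: str  # Chinese side


# Hypothetical toy data, invented for illustration only.
corpus = [
    AlignedPair("I like tea.", "我喜欢茶。"),
    AlignedPair("She reads books.", "她读书。"),
    AlignedPair("Tea is popular in China.", "茶在中国很受欢迎。"),
]


def search(pairs, keyword):
    """Return every pair whose source side contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [p for p in pairs if needle in p.source.lower()]


hits = search(corpus, "tea")
for pair in hits:
    # Show each matching sentence next to its translation.
    print(pair.source, "|", pair.target)
```

Keeping each source sentence and its translation in a single record is what lets a query on one language return the matching text in the other, which is the basic operation behind contrastive and translation studies.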
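A balanced corpus is one whose texts are sampled in a balanced, representative way. Such a corpus can be used to draw general conclusions about the properties of a language.
Examples: Lancaster Corpus of Mandarin Chinese
http://bowland-files.lancs.ac.uk/corplang/cgi-bin/conc.pl
and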
Academia Sinica Balanced Corpus of Modern Chinese
http://www.sinica.edu.tw/SinicaCorpus/
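Balanced sampling can be illustrated with a small sketch that draws the same number of texts from each genre. This is only one simplified reading of "balanced": the genre labels, the `balanced_sample` function, and the toy texts are assumptions made for illustration, and the corpora named above follow their own sampling designs, which are not reproduced here.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(texts, per_genre, seed=0):
    """Draw `per_genre` texts from every genre.

    `texts` is a list of (genre, text) tuples. Equal counts per genre
    are an illustrative assumption, not a rule used by any real corpus.
    """
    by_genre = defaultdict(list)
    for genre, text in texts:
        by_genre[genre].append(text)

    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    sample = []
    for genre in sorted(by_genre):
        pool = by_genre[genre]
        if len(pool) < per_genre:
            raise ValueError(f"not enough texts in genre {genre!r}")
        # Sample without replacement within the genre.
        sample.extend((genre, t) for t in rng.sample(pool, per_genre))
    return sample


# Hypothetical pool of texts with uneven genre sizes.
pool = [
    ("news", "n1"), ("news", "n2"), ("news", "n3"), ("news", "n4"),
    ("fiction", "f1"), ("fiction", "f2"),
    ("academic", "a1"), ("academic", "a2"), ("academic", "a3"),
]

sample = balanced_sample(pool, per_genre=2)
print(Counter(genre for genre, _ in sample))
```

Sampling within each genre separately keeps a large category, such as news in this toy pool, from dominating the result, which is the property that supports general conclusions about a language.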
A parallel corpus is composed of texts which are translations of each other in different languages. Balance is associated with corpus representativeness.