Re: How can the original paragraph (or line) breaks of a corpus be preserved when tagging with TreeTagger 3.0?
It's true that you will run into this problem with TreeTagger for Windows 3.0 Lite (English tagging only). If you need to retain the original paragraphs, you can download V2.0 (multilingual) from Baidu Yunpan.
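If switching versions is not an option, a tool-independent workaround is to tag the text one line at a time and put the line breaks back afterwards. The following is only a minimal sketch in Python: `tag_preserving_lines`, `tag_line`, and `dummy_tagger` are hypothetical names made up for illustration, not part of TreeTagger or its Windows interface, and `dummy_tagger` merely stands in for whatever real tagger call you have available.

```python
from typing import Callable, List


def tag_preserving_lines(text: str, tag_line: Callable[[str], str]) -> str:
    """Tag each line separately so the original line and paragraph breaks survive."""
    out: List[str] = []
    for line in text.split("\n"):
        # Blank lines mark paragraph boundaries; pass them through untouched.
        out.append(tag_line(line) if line.strip() else line)
    return "\n".join(out)


def dummy_tagger(line: str) -> str:
    # Hypothetical stand-in for a real tagger: labels every token "X".
    return " ".join(f"{tok}_X" for tok in line.split())


if __name__ == "__main__":
    sample = "Hello world\n\nSecond paragraph here"
    print(tag_preserving_lines(sample, dummy_tagger))
```

Replacing `dummy_tagger` with a function that sends one line to your tagger and returns the tagged line should keep the paragraph layout of the input, assuming the tagger itself does not insert or remove line breaks within a line.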
Re: Where can I find a Business English Corpus?
I happened to find this Business English Corpus, which was compiled by a team in mainland China: http://biz.yulk.org/
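Re: Word, or plain text
Building a corpus involves a text-processing stage. For steps such as input, cleaning out noise, and annotation, use whichever tool is most convenient. For example, when entering text by hand, MS Word is friendlier for display and spell checking, so Word is the better choice; but for part-of-speech or syntactic annotation you will probably need txt format, and sometimes MS Excel is needed for processing as well. The final storage format of the corpus should be decided by factors such as the size of the corpus, whether it is for long-term preservation or one-off use, and whether it is for your own use or is meant to be shared with others later.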
Re: Help: a question about key keywords in WordSmith 6
I don't think "overall frequency" is a technical term in any strict sense. It most probably refers to the total number of hits for each token.
2nd International Conference on
Cognitive Research on Translation and Interpreting
Date: 5-6 November 2015
Venue: University of Macau
The Centre for Studies of Translation, Interpreting and Cognition (CSTIC) is pleased to announce the Call for Papers for the 2nd...