[Help] What are the coding principles of COLSEC?

Some descriptions here:
http://www.corpus4u.com/upload/forum/2005072921580052.rar
 
There are two sets of codes: one for learners' errors and the other for POS. The POS part is based on the TOSCA/LOB tagset and was tagged by this tagger. For the tagset, please visit http://english.htu.edu.cn/lingualsoft/index.htm, where you can also find the COLEN corpus I built about three years ago.

[This post was edited by the author on 2005-11-07 at 13:00:22]
 
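These codes have only a format, not a "code set", because there are so many error types. While processing the original annotation format, I also converted it into an XML format with the same syntax as the other codes.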
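
For anyone who wants to try a similar conversion, here is a minimal sketch in Python. It is not the script used for this corpus. The bracketed legacy format (e.g. `[vp3]`), the element name `error`, and the attribute name `type` are all assumptions made up for illustration, so adjust the pattern and names to whatever your data actually uses.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# ASSUMPTION: legacy error codes look like "[vp3]" (letters plus optional digits
# in square brackets). The real codes in the corpus may be written differently.
LEGACY_TAG = re.compile(r"\[([A-Za-z]+\d*)\]")


def legacy_to_xml(text: str) -> str:
    """Rewrite bracketed error codes as empty XML elements.

    The element name "error" and attribute "type" are hypothetical.
    Text between the codes is XML-escaped so the result stays well-formed.
    """
    parts = []
    pos = 0
    for match in LEGACY_TAG.finditer(text):
        # Escape the plain text that precedes this code.
        parts.append(escape(text[pos:match.start()]))
        # Emit the code as an attribute of an empty element.
        parts.append(f"<error type={quoteattr(match.group(1))}/>")
        pos = match.end()
    # Escape whatever follows the last code.
    parts.append(escape(text[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    print(legacy_to_xml("He go [vp3] to school"))
```

Because the code is carried as an attribute value rather than drawn from a fixed list, this fits a scheme that defines only a format and leaves the set of error types open.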
 