[Help] What are the tagging principles for COLSEC?

xieang_007 · 2005-11-06

I have the corpus but no documentation for its tagging scheme. Does any colleague here have it, and could you share it with me?

xiaoz · 2005-11-06

Some descriptions here:
http://www.corpus4u.com/upload/forum/2005072921580052.rar

ineedgerf · 2005-11-07

There are two sets of codes: one is the coding of learners' errors and the other is the POS tagging. The POS part is based on the TOSCA/LOB tagset and was tagged by this tagger. For the tagset, please visit http://english.htu.edu.cn/lingualsoft/index.htm; the COLEN corpus I built about three years ago is there as well.

[This post was edited by the author on 2005-11-07 at 13:00:22]

xieang_007 · 2005-11-07

Thanks

xieang_007 · 2005-11-07

But some of the codes mark things like repairs and interruptions, and I still have no idea what the tagset for those codes is.

ineedgerf · 2005-11-09

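These codes only have a format, not a "tagset", because there are so many error types. When I processed the original annotation format, I also converted it into an XML format with the same syntax as the other codes.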
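
Since these codes share a format but have no fixed tagset, one way to see which codes a given copy of the corpus actually uses is to list the tag names that occur in the files. Below is a minimal sketch in Python, assuming only that the annotations follow ordinary XML tag syntax as described above. The sample sentence and the tag names `repair`, `err`, and `interrupt` are hypothetical and are not taken from any COLSEC documentation.

```python
import re
from collections import Counter

# Matches an opening or self-closing XML-style tag and captures its name.
# Closing tags such as </repair> are skipped so each element is counted once.
TAG_RE = re.compile(r"<([A-Za-z][\w.-]*)(?:\s[^<>]*)?/?>")


def tag_inventory(text: str) -> Counter:
    """Count how often each XML-style tag name occurs in annotated text."""
    return Counter(match.group(1) for match in TAG_RE.finditer(text))


# Hypothetical sample: these tag names are invented for illustration only.
sample = (
    'I <repair>go went</repair> to <err type="vp">the</err> '
    "school <interrupt/> yesterday"
)

for name, count in tag_inventory(sample).most_common():
    print(name, count)
```

Running `tag_inventory` over the text of each corpus file and summing the counters would give a working list of the codes in use, which can stand in for a missing tagset description.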
