
一/m(d-m) 、/w 旅行/v 背囊/n
列车/n 驶/Vg 抵/v 浜名湖/ns 铁桥/n 的/u(Dg-Ng-u) 时候/n ,/w 曾根二郎/nr 从/d(d-Ng-p-Vg) 靠近/v 车/v(n-Ng-q-v-V) 尾/Ng(Ng-q) 的/u(Dg-Ng-u) 三等/b 车/v(n-Ng-q-v-V) 一个/m 角落/n 里/f(f-Ng-q) 站/v(n-v) 起身/v 来/v(f-m-Ng-u-v-y) ,/w 准备/v 到/v(Ng-p-v) 餐车/n 去/v 。/w
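AntConc 3.2.1w can do this! I have tested it and it works fine

Chinese characters are not garbled, and each tag stays attached directly behind its word.

(2) Start AntConc and click "Global Settings" in the menu. In the dialog that opens, choose "Language Encodings" on the left, click the "Edit" button, and under "Unicode Encodings" select "Unicode(utf8)";
(3) Still in the same dialog, choose "Token [Word] Definitions" in the list on the left, tick the checkbox to the left of "Punctuation", then click "Apply", which closes the dialog;
(4) From the File menu choose "Open File(s)" and select the files you want to process; they then appear in the file list on the left, where the file names are displayed garbled;
(5) Switch to the Word List tab in the main window and click the "Start" button.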
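
For anyone who wants to cross-check the AntConc word list outside the program, here is a minimal Python sketch of the same idea. It assumes the corpus file is saved as UTF-8 and that tokens are separated by spaces, as in the sample sentence at the top of the thread. Splitting on whitespace only keeps each tag glued to its word, which is roughly the effect of ticking "Punctuation" in step (3). The names `word_list`, `word_list_from_file` and `SAMPLE` are made up for this illustration and are not part of AntConc.

```python
from collections import Counter

# Shortened excerpt of the tagged sample sentence quoted in this thread.
SAMPLE = (
    "列车/n 驶/Vg 抵/v 浜名湖/ns 铁桥/n 的/u(Dg-Ng-u) 时候/n "
    "靠近/v 车/v(n-Ng-q-v-V) 尾/Ng(Ng-q) 的/u(Dg-Ng-u) 三等/b 车/v(n-Ng-q-v-V)"
)


def word_list(text):
    """Count whitespace-separated tokens; the tag stays attached to each word."""
    return Counter(text.split())


def word_list_from_file(path):
    """Read a UTF-8 file (hypothetical path) and build the same frequency list."""
    with open(path, encoding="utf-8") as handle:
        return word_list(handle.read())


if __name__ == "__main__":
    # Print the three most frequent word/tag tokens in the excerpt.
    for token, freq in word_list(SAMPLE).most_common(3):
        print(freq, token)
```

Opening the file with an explicit `encoding="utf-8"` plays the same role as step (2): if the encoding does not match the file, the characters come out garbled.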


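AntConc is a free corpus-query program, much like "360" among Chinese antivirus products, which is known for being free; its features may one day catch up with WordSmith Tools. WordSmith Tools, the best-known and most powerful corpus-processing tool, is paid commercial software: a single-user licence costs 50 euros, a group purchase for 10 users costs 250 euros, and a group purchase for 50 users costs 500 euros. There is no cracked version of WordSmith Tools, in China or abroad.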




VBB The present tense forms of the verb BE, except for is, 's: i.e. am, are, 'm, 're and be [subjunctive or imperative]
VBD The past tense forms of the verb BE: was and were
VBG The -ing form of the verb BE: being
VBI The infinitive form of the verb BE: be
VBN The past participle form of the verb BE: been
VBZ The -s form of the verb BE: is, 's
VDB The finite base form of the verb DO: do
VDD The past tense form of the verb DO: did
VDG The -ing form of the verb DO: doing
VDI The infinitive form of the verb DO: do
VDN The past participle form of the verb DO: done
VDZ The -s form of the verb DO: does, 's
VHB The finite base form of the verb HAVE: have, 've
VHD The past tense form of the verb HAVE: had, 'd
VHG The -ing form of the verb HAVE: having
VHI The infinitive form of the verb HAVE: have
VHN The past participle form of the verb HAVE: had
VHZ The -s form of the verb HAVE: has, 's
VM0 Modal auxiliary verb (e.g. will, would, can, could, 'll, 'd)
VVB The finite base form of lexical verbs (e.g. forget, send, live, return) [Including the imperative and present subjunctive]
VVD The past tense form of lexical verbs (e.g. forgot, sent, lived, returned)
VVG The -ing form of lexical verbs (e.g. forgetting, sending, living, returning)
VVI The infinitive form of lexical verbs (e.g. forget, send, live, return)
VVN The past participle form of lexical verbs (e.g. forgotten, sent, lived, returned)
VVZ The -s form of lexical verbs (e.g. forgets, sends, lives, returns)

NN0 Common noun, neutral for number (e.g. aircraft, data, committee) [N.B. Singular collective nouns such as committee and team are tagged NN0, on the grounds that they are capable of taking singular or plural agreement with the following verb: e.g. 'The committee disagrees/disagree'.]
NN1 Singular common noun (e.g. pencil, goose, time, revelation)
NN2 Plural common noun (e.g. pencils, geese, times, revelations)
NP0 Proper noun (e.g. London, Michael, Mars, IBM) [N.B. the distinction between singular and plural proper nouns is not indicated in the tagset, plural proper nouns being a comparative rarity.]

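If the tag set you are dealing with is regular in the same way, with verb tags always beginning with V and noun tags always beginning with N, you can simply search for /v and /n and read off the number of hits. In AntConc, go to the Concordance tab and untick the "Words" checkbox to the right of "Search Term" near the bottom of the window. Your input "/v" or "/n" is then searched and counted as a character string rather than as a word. Searching "as a word" means a space is required before and after the word (or string) you type, which is why these tags cannot be searched as words. So enter "/v" and then "/n" in turn, clicking "Start" each time.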
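
To make the string-versus-word distinction concrete, here is a small Python sketch, not AntConc itself. It counts "/v" and "/n" as plain substrings, the way the unticked "Words" box is described above, and compares that with counting each token's exact primary tag. One thing it shows: a plain substring search for "/n" also hits longer tags such as /ns and /nr, so the two counts can differ. `SAMPLE`, `substring_count` and `primary_tag_counts` are names invented for this example, and the regular expression assumes tags consist of ASCII letters directly after the slash.

```python
import re
from collections import Counter

# First part of the tagged sample sentence quoted in this thread.
SAMPLE = (
    "列车/n 驶/Vg 抵/v 浜名湖/ns 铁桥/n 的/u(Dg-Ng-u) 时候/n ,/w "
    "曾根二郎/nr 从/d(d-Ng-p-Vg) 靠近/v 车/v(n-Ng-q-v-V)"
)


def substring_count(text, needle):
    """Count raw occurrences of a string such as '/v' (case-sensitive)."""
    return text.count(needle)


def primary_tag_counts(text):
    """Count the tag that directly follows each slash, ignoring the
    parenthesised list that some tokens carry after the tag."""
    return Counter(re.findall(r"/([A-Za-z]+)", text))


if __name__ == "__main__":
    exact = primary_tag_counts(SAMPLE)
    for tag in ("v", "n"):
        print(tag, "substring:", substring_count(SAMPLE, "/" + tag),
              "exact:", exact[tag])
```

If exact counts matter, summing the exact tags you care about (for example n, ns and nr for nouns) is safer than relying on the substring total.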
Re: If the verb and noun tags follow a regular pattern like the BNC's, just search for /v and /n and count the hits




b 日常/b(b-d) 用品/n
c 但/c(c-d-Ng) 无论/c
d 刚/d(Ag-d-Ng) 要
e 啊/e(e-y)
f 通道/n 上/f(f-Ng-v)
h 眉头/n 微/h(Ag-h) 皱/n(n-v) 酩/x 酊/Ng 之/h(h-r-u-Vg) 路/n(n-Ng-q)
i 悠然自得/i
j 赴/v(Ng-v) 京/j(j-Mg-Ng) 自己/r 日/j(j-Ng-q) 后生/n 存/v
k 之/r(h-r-u-Vg) 感/k(k-Ng-Vg)
l 没关系/l 。
m 一个/m 角落/n
o 咕嘟/o 吧/o(o-V-y)
p 在/p(d-p-v) 火车/n 上
q 个/q 一/m(d-m) 声/q(n-q-Vg)
r 您/r 下车/v 么 我/r 不/d 在
s 身上/s 乡下/s
t 明天/t
u 的/u(Dg-Ng-u) 了/u(Dg-u-v-Vg-y)
w ,。/w
x 唔/x 噢/e 。/w ”/w ?/x 吐/v 着/ 平/v(a-Ng-v) 氏/x
y 下车/v 么/y(Ng-y)
z 零零碎碎/z 冷清清/z
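
The tokens in this list all follow the shape word/tag, sometimes followed by a parenthesised, hyphen-separated list. Below is a rough Python sketch for splitting such tokens apart. Reading the parenthesised part as a set of alternative tags is my assumption; the thread does not say what it means. `TaggedToken`, `parse_token` and `words_by_tag` are hypothetical names, and tokens without a usable tag (such as a bare word, or "着/" with nothing after the slash) are simply skipped.

```python
import re
from collections import defaultdict
from typing import NamedTuple, Optional, Tuple


class TaggedToken(NamedTuple):
    word: str
    tag: str
    alternatives: Tuple[str, ...]  # assumed meaning of the "(a-b-c)" part


# word, slash, letters-only tag, optional "(x-y-z)" list
TOKEN = re.compile(
    r"(?P<word>[^/\s]+)/(?P<tag>[A-Za-z]+)(?:\((?P<alts>[^()]+)\))?"
)


def parse_token(token: str) -> Optional[TaggedToken]:
    """Split one token into word, tag and alternatives; None if it does not fit."""
    match = TOKEN.fullmatch(token)
    if match is None:
        return None
    alts = match.group("alts")
    return TaggedToken(
        match.group("word"),
        match.group("tag"),
        tuple(alts.split("-")) if alts else (),
    )


def words_by_tag(line: str):
    """Group the words of a space-separated line under their primary tag."""
    grouped = defaultdict(list)
    for raw in line.split():
        token = parse_token(raw)
        if token is not None:
            grouped[token.tag].append(token.word)
    return dict(grouped)


if __name__ == "__main__":
    print(parse_token("的/u(Dg-Ng-u)"))
    print(words_by_tag("眉头/n 微/h(Ag-h) 皱/n(n-v)"))
```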
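Re: How do I use WordSmith 4 to process a Chinese corpus that is already POS-tagged? Urgent

OP, I have the same need. My corpus is also POS-tagged Chinese, but WordSmith 4.0 simply will not produce any results. I have used AntConc before, but WordSmith somehow feels more accurate. If anyone here has experience processing Chinese corpora with WordSmith, please share.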


Re: Google Translate shows that Italian puts spaces between words, so...

sono(n1) ∧ nsubj(n1, n2) ∧ #(n2) ∧ cop(n1, n3) ∧ ragazza(n3) ∧ det(n3, n4) ∧ una(n4) ∧ punct(n1, n5) ∧ .(n5)
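One more question: can AntConc or WordSmith recognise this kind of annotation? Is the usual practice to type the annotation together with the word when searching, for example sono(n1) ∧ nsubj(n1, n2) ∧ #(n2) ∧ cop(n1, n3) ∧?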
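
This does not answer whether AntConc or WordSmith can interpret the annotation, but as a sketch of how the string could be handled outside those tools, the Python below splits it on "∧" and separates one-argument items (which look like word labels for a node) from two-argument items (which look like relations between nodes). That reading of the format is an assumption based only on the example line above, and `parse_atoms` and `words_and_relations` are names invented here.

```python
import re

# The annotation line quoted in this thread.
ANNOTATION = (
    "sono(n1) ∧ nsubj(n1, n2) ∧ #(n2) ∧ cop(n1, n3) ∧ ragazza(n3) ∧ "
    "det(n3, n4) ∧ una(n4) ∧ punct(n1, n5) ∧ .(n5)"
)

# name(arg) or name(arg1, arg2), with optional surrounding spaces
ATOM = re.compile(r"\s*(?P<pred>[^()\s]+)\((?P<args>[^()]*)\)\s*")


def parse_atoms(annotation):
    """Return a list of (name, args) pairs, one per item joined by '∧'."""
    atoms = []
    for part in annotation.split("∧"):
        match = ATOM.fullmatch(part)
        if match is None:
            raise ValueError(f"cannot parse {part!r}")
        args = tuple(arg.strip() for arg in match.group("args").split(","))
        atoms.append((match.group("pred"), args))
    return atoms


def words_and_relations(annotation):
    """Separate node labels (one argument) from relations (several arguments)."""
    words, relations = {}, []
    for pred, args in parse_atoms(annotation):
        if len(args) == 1:
            words[args[0]] = pred
        else:
            relations.append((pred, *args))
    return words, relations


if __name__ == "__main__":
    words, relations = words_and_relations(ANNOTATION)
    print(words)
    print(relations)
```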

Thanks. I would also like to calculate std and TTR. How is that done in WordSmith, and can the unregistered version of the software do it? Can AntConc?
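
For reference, TTR itself is easy to compute by hand once the text is split into tokens. The Python sketch below assumes that "std" in the question means a standardised TTR, that is, the mean of the TTRs of consecutive equal-sized chunks of the text; that reading is a guess on my part. The chunk size of 1000 tokens is also just a default chosen for this example, and `ttr` and `sttr` are hypothetical names, not functions of WordSmith or AntConc.

```python
def ttr(tokens):
    """Type/token ratio: distinct tokens divided by total tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def sttr(tokens, chunk_size=1000):
    """Mean TTR over consecutive full chunks of `chunk_size` tokens.

    A trailing partial chunk is ignored. Returns None when the text is
    shorter than one chunk.
    """
    chunks = [
        tokens[start:start + chunk_size]
        for start in range(0, len(tokens) - chunk_size + 1, chunk_size)
    ]
    if not chunks:
        return None
    return sum(ttr(chunk) for chunk in chunks) / len(chunks)


if __name__ == "__main__":
    # Tiny made-up token list, only to show the calls.
    demo = "a b a a c d e".split()
    print(ttr(demo), sttr(demo, chunk_size=2))
```

For a tagged corpus, decide first whether a "type" is the bare word or the word together with its tag, since the two choices give different ratios.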