词性赋码语料库的检索与正则表达式的编写 POS tagged corpus search + regex

Re: POS tagged corpus search + regex

Many thanks. That's the very stuff I'm searching for.
 
Re: POS tagged corpus search + regex

I have no words to express my thanks. So kind of you!
 
Re: POS tagged corpus search + regex

Thanks for sharing!
 
Re: POS tagged corpus search + regex

Thank you for sharing, Dr. Xu!
 
Re: POS tagged corpus search + regex

Thanks for sharing...
 
Re: POS tagged corpus search + regex

I'm just a beginner, and I'm rather concerned about which automatic POS tagging tools can be used before we use PatternBuilder. The paper 词性赋码语料库的检索与正则表达式的编写 mentions two tagging tools: CLAWS4, which is not free, and TreeTagger, which does not share the same tagset and cannot work together with PatternBuilder. If no automatic POS tagging tool is available, PatternBuilder is of little use.
 
Re: POS tagged corpus search + regex

PatternBuilder works with both the CLAWS C7 tagset and the TreeTagger tagset, depending on which ini or template file is loaded.

The CLAWS online trial service allows tagging of up to 10,000 words from an educational site.
http://ucrel.lancs.ac.uk/claws/trial.html
 
Re: POS tagged corpus search + regex

Thank you for your response, Prof. Xu!
However, I still wonder what you mean by 'different ini'. Please forgive my ignorance. tagset.ini was created automatically when I first double-clicked PatternBuilder.exe, so how can I get a 'different ini'?
To be more exact, I think Treetagger2 cannot work with PatternBuilder because the regular expressions I build with PatternBuilder simply cannot be applied to a text tagged by Treetagger2. For example, '\S+_PPH1\s', the regular expression PatternBuilder gives for the third person singular pronoun 'it', finds nothing in a text tagged by Treetagger2, which uses a different tagset that does not include '_PPH1'.
I look forward to your reply! Thank you!
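
To make the mismatch concrete, here is a minimal Python sketch. The two sample sentences are made up, and the word_TAG layout for the TreeTagger text is an assumption (TreeTagger itself normally prints one token per line, tab-separated). It also assumes the English TreeTagger tagset, where personal pronouns are tagged PP rather than C7's PPH1.

```python
import re

# Hypothetical samples; both are assumed to use the word_TAG layout.
c7_text = "It_PPH1 is_VBZ raining_VVG ._. "     # CLAWS C7 tags
tt_text = "It_PP is_VBZ raining_VVG ._SENT "    # TreeTagger (English) tags

# The pattern quoted above, built for the C7 tagset.
c7_pattern = re.compile(r"\S+_PPH1\s")

# A pattern written against the TreeTagger tag instead (assumed: PP).
tt_pattern = re.compile(r"\S+_PP\s")

c7_hits = c7_pattern.findall(c7_text)             # finds "It_PPH1 "
c7_on_tt_hits = c7_pattern.findall(tt_text)       # finds nothing
tt_hits = tt_pattern.findall(tt_text)             # finds "It_PP "

print(c7_hits, c7_on_tt_hits, tt_hits)
```

Note that PP in this sketch covers every personal pronoun, not only 'it', so a TreeTagger pattern for 'it' alone would have to pin the word form as well, for example `[Ii]t_PP\s`.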
 
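Searching POS-tagged corpora and writing regular expressions

I have used ICTCLAS2011 to segment and POS-tag a Chinese corpus, in which modal particles are tagged /y.
When concordancing with WordSmith 5.0, how can I retrieve all modal particles except “了”? Thank you!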
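
One possible approach, sketched in Python rather than WordSmith: a negative lookahead can exclude “了” while still matching every other token tagged /y. Whether WordSmith 5.0's search box accepts lookaheads is not confirmed here, and the sample sentence, the space-separated word/tag layout, and the /w punctuation tag are assumptions for illustration only.

```python
import re

# Hypothetical ICTCLAS-style output: tokens separated by spaces, each word/tag.
tagged = "他/r 走/v 了/y 吗/y ?/w 我们/r 去/v 吧/y 。/w"

# (?<!\S)   the match must start at the beginning of a token
# (?!了/y)  reject the token if it is exactly 了 tagged /y
# \S+/y     any word followed by the /y tag
# (?!\S)    the tag must end the token (so /yg or similar is not matched)
particle_pattern = re.compile(r"(?<!\S)(?!了/y)\S+/y(?!\S)")

particles = particle_pattern.findall(tagged)
print(particles)
```

If the concordancer does not support lookaheads, an alternative is to search for all /y tokens and then filter out 了/y from the results in a second step.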
 