Download: Twitter tokenizer and POS tagger (version 0.3)

xujiajin

Administrator
Staff member
http://code.google.com/p/ark-tweet-nlp/downloads/detail?name=ark-tweet-nlp-0.3.tgz

We're pleased to announce a new release of the CMU ARK Twitter Part-of-Speech
Tagger, version 0.3.

* The new version is much faster (40x) and more accurate (tagging accuracy up
from 89.2% to 92.8%) than the previous release.

* We have also released new POS-annotated data, including a dataset of one
tweet for each of 547 days.

* We have made available large-scale word clusters from unlabeled Twitter data
(217k words, 56m tweets, 847m tokens).

Tools, data, and a new technical report describing the release are available at:
http://www.ark.cs.cmu.edu/TweetNLP/

http://www.ark.cs.cmu.edu/TweetNLP/paths/0100100.html
http://www.ark.cs.cmu.edu/TweetNLP/paths/1111100101110.html
http://www.ark.cs.cmu.edu/TweetNLP/paths/111100000011.html

Brendan O'Connor

--
PhD Student, Machine Learning Department
School of Computer Science, Carnegie Mellon University
http://brenocon.com
 
Re: Download: Twitter tokenizer and POS tagger (version 0.3)

It seems to be tailored to new social media text. Quite interesting, thanks.
 