Are coreference relations annotated in the ACE corpus?

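As the title says: the MUC corpus explicitly annotates coreference relations, but I could not find any in the ACE corpus. Is the annotation format different? Any advice would be appreciated. Thanks!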
 
Re: Are coreference relations annotated in the ACE corpus?

ACE is not annotated anaphorically.

There are a number of corpora that are annotated for coreference, e.g. the Lancaster/IBM anaphoric treebank (containing 100,000 words), a 65,000-word corpus resulting from the MUC (Message Understanding Conference) coreference task (Hirschman 1997), a 60,000-word corpus produced at the University of Wolverhampton (Mitkov et al. 2000), and 93,931 words of the Penn Treebank (Ge 1998). A much larger corpus annotated for coreference (one million words) is under construction in a project undertaken by Stendhal University and Xerox Research Centre Europe (Tutin et al. 2000).
 
Re: Are coreference relations annotated in the ACE corpus?

One of the components of ACE covers pronoun coreference. Here is the list of ACE corpora available from the LDC:

> LDC2006E54 ACE 2007 Training
> LDC2006E47 REFLEX-MTE DevTest
> LDC2003T11 ACE 2 English Training Corpus
> LDC2004T09 ACE 2003 Multilingual Training Corpus
> LDC2004E38 ACE 2003 Evaluation Corpus
> LDC2005T09 ACE 2004 Multilingual Training Corpus
> LDC2004E51 ACE 2004 Evaluation Corpus
> LDC2004E39 ACE 2004 English Consistency Analysis - Training Data
> LDC2004E40 ACE 2004 English Consistency Analysis - Eval Data
> LDC2005T07 ACE 2004 Time Normalization (TERN) English Training Data
> LDC2006T06 ACE 2005 Multilingual Training Corpus
> LDC2005E23 ACE 2005 Evaluation Corpus
> LDC2005E22 ACE 2005 Arabic Unsupervised Training Data
> LDC2005E21 ACE 2005 Chinese Unsupervised Training Data
> LDC2005E20 ACE 2005 English Unsupervised Training Data
> LDC2005E25 ACE 2005 Consistency Analysis - Training Data
> LDC2002E50 Name-Annotated TDT Corpus Supplement for ACE
> LDC2005T33 BBN Pronoun Coreference and Entity Type Corpus

For more about LDC2005T33, see:
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2005T33
 
Re: Are coreference relations annotated in the ACE corpus?

You may also want to have a look at the MEDCo project of the GENIA corpus, another coreference corpus, which I designed years ago. The coreference types are slightly different from those of MUC-6 and MUC-7, as we worked on MEDLINE biomedical articles. BTW, MUC texts are newswire articles from the WSJ (Wall Street Journal). See the coreference annotation examples from the MEDCo corpus at:

http://nlp.i2r.a-star.edu.sg/medco/showmedco.html?id=91079577

http://nlp.i2r.a-star.edu.sg/medco/showmedco.html?id=91094881

http://nlp.i2r.a-star.edu.sg/medco/showmedco.html?id=91101115

http://nlp.i2r.a-star.edu.sg/medco/showmedco.html?id=91110562
 
Re: Are coreference relations annotated in the ACE corpus?

I was thinking of another ACE corpus - the Australian Corpus of English, also known as the Macquarie corpus.
 