Haiyang Ai
Administrator
This document provides some information about the creation of the corpus, along with results of the annotation effort. If you use the corpus in your research, we would appreciate your citing one or both of the following papers, which give some details of our work on paraphrase and our data annotation efforts. (A paper describing in detail how this corpus was created is currently in progress.) We are continuing to tag data, and hope to release a larger version of this corpus to the research community in the future.
Quirk, C., C. Brockett, and W. B. Dolan. 2004. Monolingual Machine Translation for Paraphrase Generation, In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona Spain.
Dolan W. B., C. Quirk, and C. Brockett. 2004. Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources. COLING 2004, Geneva, Switzerland.
Quirk, C., C. Brockett, and W. B. Dolan. 2004. Monolingual Machine Translation for Paraphrase Generation, In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona Spain.
Dolan W. B., C. Quirk, and C. Brockett. 2004. Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources. COLING 2004, Geneva, Switzerland.