Annotating Discourse Connectives in the Chinese Treebank
Nianwen Xue
Department of Computer and Information Science
University of Pennsylvania
xueniwen@linc.cis.upenn.edu
http://www.cis.upenn.edu/~xueniwen/publications/cdtb.pdf
Abstract
In this paper we examine the issues that arise from the annotation of discourse connectives for the Chinese Discourse Treebank Project. This project is based on the same principles as the Penn Discourse Treebank (PDTB), a project that annotates the English discourse connectives in the Penn Treebank. The paper
begins by outlining the range of discourse connectives under consideration in this project and examining the distribution of the explicit discourse connectives. We then examine the types of syntactic units that can be arguments to the discourse connectives. We show that one of the
most challenging issues in this type of discourse annotation is determining the textual spans of the arguments, and that this is partly due to the hierarchical nature of discourse relations. Finally, we discuss two further issues. The first is sense discrimination of the discourse connectives, which involves separating discourse
connective senses from non-discourse connective senses and teasing apart the different discourse connective senses. The second is discourse
connective variation, the use of different connectives to represent the same discourse relation.