Representational issues in annotation


Representational issues in annotation: Using the Australian map task corpus to relate prosody and discourse structure

Speech Communication 33 (2001) 113-134

Lesley Stirling a,*, Janet Fletcher a, Ilana Mushin a, Roger Wales b
a Department of Linguistics and Applied Linguistics, University of Melbourne, Parkville 3010, Australia
b LaTrobe University, Bundoora 3083, Australia

This paper reports part of an ongoing investigation of the interaction of prosody and discourse structure. A digital
speech corpus (4 dialogues from the ANDOSL Australian map task corpus) was coded for prosodic structure (ToBI).
Independently, two different coding systems for dialogue micro-structure were applied to the same corpus: the HCRC
map task coding scheme (Carletta et al., 1996, 1997b) and the 'Switchboard' version of the DRI/DAMSL scheme
(Jurafsky et al., 1997). We investigated whether silent pause location and duration, intonational boundaries associated
with Break Indices 3 and 4, as well as pitch range reset, were significantly correlated with dialogue act boundaries as has
been found for other varieties of English (e.g., Lehiste, 1975; Hirschberg and Nakatani, 1996; Silverman, 1987) and
Dutch (Swerts, 1997). The dialogue coding systems were systematically evaluated both against one another and in terms
of their correlation with the prosodic structure. The paper explores a number of methodological issues which arise in
effectively comparing and relating structures from different domains of analysis across a large speech corpus. It also
exemplifies the way in which annotated corpora can be used to evaluate theories and systems. © 2001 Elsevier Science
B.V. All rights reserved.
Keywords: Dialogue; Prosody; Map task; ToBI; Pitch; Pause; DAMSL; Dialogue act