Representational issues in annotation: Using the Australian map task corpus to relate prosody and discourse structure
Speech Communication 33 (2001) 113-134
Lesley Stirling a,*, Janet Fletcher a, Ilana Mushin a, Roger Wales b
a Department of Linguistics and Applied Linguistics, University of Melbourne, Parkville 3010, Australia
b LaTrobe University, Bundoora 3083, Australia
Abstract
This paper reports part of an ongoing investigation of the interaction of prosody and discourse structure. A digital
speech corpus (4 dialogues from the ANDOSL Australian map task corpus) was coded for prosodic structure (ToBI).
Independently, two different coding systems for dialogue micro-structure were applied to the same corpus: the HCRC
map task coding scheme (Carletta et al., 1996, 1997b) and the 'Switchboard' version of the DRI/DAMSL scheme
(Jurafsky et al., 1997). We investigated whether silent pause location and duration, intonational boundaries associated
with Break Indices 3 and 4, as well as pitch range reset were significantly correlated with dialogue act boundaries as has
been found for other varieties of English (e.g., Lehiste, 1975; Hirschberg and Nakatani, 1996; Silverman, 1987) and
Dutch (Swerts, 1997). The dialogue coding systems were systematically evaluated both against one another and in terms
of their correlation with the prosodic structure. The paper explores a number of methodological issues which arise in
effectively comparing and relating structures from different domains of analysis across a large speech corpus. It also
exemplifies the way in which annotated corpora can be used to evaluate theories and systems. © 2001 Elsevier Science
B.V. All rights reserved.
Keywords: Dialogue; Prosody; Map task; ToBI; Pitch; Pause; DAMSL; Dialogue act