Today's corpus linguistics: some open questions
The paper is concerned with problems of methodology. Against this
background, the situation of today’s corpora is discussed and some fields are
identified as being in far from satisfactory shape. The place of corpora in
linguistics is briefly examined, with the suggestion that the structuralist
tradition is the only one to use them extensively. Problems of annotation, and
approaches to it that are less successful (statistical) or more successful
(rule-based), are raised and discussed. Here,
some of the most serious shortcomings that computational linguists should deal
with, such as multi-word units or the status of language units in general, are
listed. More generally, the implications and status of paradigmatics and
syntagmatics are also discussed, with considerable and critical attention
paid to ontologies.
http://bowland-files.lancs.ac.uk/corplang/data/openquestions.pdf