Coming very soon from the Arts and Humanities Data Service, Oxford…
Developing Linguistic Corpora: A Guide to Good Practice
Edited by Martin Wynne, Oxford Text Archive
Leading experts in key areas of corpus construction offer advice in a readable and largely non-technical style, to help readers ensure that their corpus is well designed and fit for its intended purpose.
As John Sinclair writes in the first chapter:
"A corpus is a remarkable thing, not so much because it is a collection of language text, but because of the properties that it acquires if it is well-designed and carefully-constructed."
This Guide is aimed at those who are at some stage of building a linguistic corpus. It assumes little or no knowledge of corpus linguistics or computational procedures, although it is hoped that more advanced users will also find the guidelines useful. It is also relevant to those who are not building a corpus but who need to know something about the issues involved in corpus design, in order to choose between available resources and to help draw conclusions from their analysis.
Contents
Chapter 1: Corpus and Text: Basic Principles (John Sinclair)
Chapter 2: Adding Linguistic Annotation (Geoffrey Leech)
Chapter 3: Metadata for Corpus Work (Lou Burnard)
Chapter 4: Character Encoding in Corpus Construction (Tony McEnery & Richard Xiao)
Chapter 5: Spoken Language Corpora (Paul Thompson)
Chapter 6: Archiving, Distribution and Preservation (Martin Wynne)
Available soon in print and online from the AHDS:
http://www.ahds.ac.uk/litlangling/