==================================
Wacky! Working papers on the Web as Corpus
==================================
This book collects articles deriving from presentations at two Web as Corpus workshops (held in Forlì and Birmingham in 2005) and articles that grew out of discussions and collaborative experimentation among members of the WaCky community. WaCky (for "Web as Corpus kool ynitiative") brings together linguists who think the World Wide Web is a great resource for their research, and that it would be even greater if it could be annotated and interrogated in a more linguist-friendly way.
Topics covered in this book include practical experiences with the construction and evaluation of Web corpora, methods to classify and represent Web corpora, and applications to terminology. The introduction provides an accessible account of the various steps and issues involved in building very large Web corpora and making them available to the linguistic community. English, Chinese and Japanese are among the languages studied.
Web corpora are undoubtedly a timely and important topic for the corpus/computational linguistics community. This book is unique in that it provides detailed technical discussion of the issues related to constructing Web corpora, as well as examples of concrete applications to terminology practice and teaching. As such, it should be of interest to a wide audience of linguists, language technologists, language/translation teachers and language professionals.
============================
How to quote this book:
============================
Baroni, Marco and Bernardini, Silvia (eds.) 2006. Wacky! Working papers on the Web as Corpus. Bologna: GEDIT. [ISBN 88-6027-004-9]
============================
Contents
============================
Front Matter
(Includes author contact information)
A WaCky Introduction
Silvia Bernardini, Marco Baroni and Stefan Evert
Experience Building a Large Corpus for Chinese Lexicon Construction
Thomas Emerson and John O'Neil
Creating General-Purpose Corpora Using Automated Search Engine Queries
Serge Sharoff
Evaluation of Japanese Web-Based Reference Corpora: Effects of Seed Selection and Time Interval
Motoko Ueyama
Measuring Web Corpus Randomness: A Progress Report
Massimiliano Ciaramita and Marco Baroni
Using the Web as a Source of LSP Corpora in the Terminology Classroom
Sara Castagnoli
Specialized Corpora from the Web and Term Extraction for Simultaneous Interpreters
Claudio Fantinuoli
The Net for the Graphs: Towards Webgenre Representation for Corpus Linguistic Studies
Alexander Mehler and Rüdiger Gleim
=======================================
Download the whole book from:
http://wackybook.sslmit.unibo.it/pdfs/wackybook.zip
=======================================