[Repost] new corpora modelled on LOB and FLOB

tiger

Senior Member
We are nearing completion of a corpus of printed texts produced in 1931 (+/- 3 years), and have begun compiling a similar corpus of texts produced in 1901 (+/- 3 years).
Both corpora are modelled on the LOB and FLOB corpora of British English, sampling 1961 and 1991 respectively.
We expect to release the 1931 corpus next year, after clearing copyright permissions.
http://www.comp.lancs.ac.uk/ucrel/projects.html#prelob

Geoff Leech, Nick Smith, Paul Rayson
Lancaster University.
 
Re: [Repost] new corpora modelled on LOB and FLOB

Yes. Earlier this year Paul Rayson told me about the prelob corpus when we met at CASS. He said they had finished collecting the data. I think this is an interesting design, especially for diachronic studies, when we've got data across 90 years.

I prefer to call this corpus BOB or BLOB, standing for Before LOB.

*******

Leverhulme Corpus Project
Investigator: Geoffrey Leech
Research Associate: Nick Smith (half-time)
This project is supported by the Leverhulme Trust under its Emeritus Fellowship Scheme. The research project runs for 15 months from October 2003. The plan is to build a corpus which matches as closely as possible the LOB and FLOB corpora of written British English, except that the year of data collection is 1931, or near to that date (+/- 3 years). The immediate purpose of building this corpus is to make it possible to compare these three temporally equidistant corpora (1931, 1961, 1991): "Pre-LOB", LOB, and FLOB. This will enable us to track grammatical change through a period of 60 years of the 20th century. In previous projects on recent grammatical change in English funded by the AHRB and the British Academy, we have been able to observe some notable trends through the differences between corpora of the 1960s and the 1990s, such as a declining frequency of the modal auxiliaries (especially shall, must, ought to and may) and a growing frequency of semi-modals such as have to, need to, and want to. By projecting this comparison back to the beginning of the 1930s, we will be able to confirm that these trends are a continuation of earlier changes. The early decades of the 20th century are virtually unrepresented in corpora of English, and so the planned new corpus will fill an important empirical gap in our historical knowledge of the language. The new corpus under construction is as yet unnamed.
 