“Systematic” means that the structure and contents of the corpus follow certain extralinguistic principles (“sampling principles”, i.e. the principles on the basis of which the included texts were chosen). For example, a corpus is often restricted to certain text types, to one or several varieties of English, and to a certain time span. If several subcategories (e.g. several text types or varieties) are represented in a corpus, they are often represented by the same amount of text. “Systematic” also means that information on the exact composition of the corpus is available to the researcher (including the number of words in each category and in the whole corpus, how the texts included in the corpus were sampled, etc.).
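As a rough illustration of what such composition information might look like, the following Python sketch counts the words in each category and in the whole corpus. The toy texts, the category labels and the function name composition are invented for illustration, and splitting on whitespace is only a simplistic stand-in for real tokenization.

```python
from collections import Counter

# Hypothetical toy corpus: (category, text) pairs invented for illustration.
texts = [
    ("press", "the minister announced a new policy today"),
    ("press", "shares fell sharply in early trading"),
    ("fiction", "she closed the door and listened to the rain"),
    ("fiction", "nobody answered when he called"),
]


def composition(samples):
    """Return the number of whitespace-separated words per category."""
    counts = Counter()
    for category, text in samples:
        counts[category] += len(text.split())
    return counts


per_category = composition(texts)
total = sum(per_category.values())

# Report each category's size and its share of the whole corpus.
for category, n in per_category.items():
    print(f"{category}: {n} words ({n / total:.1%} of corpus)")
print(f"whole corpus: {total} words")
```

A table of this kind lets a researcher see at a glance whether the subcategories are represented by roughly the same amount of text.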
Although “corpus” can refer to any systematic text collection, today it is commonly used in a narrower sense, often referring only to systematic text collections that have been computerized.