"Brown" Corpus of Bulgarian (a Brown-family corpus)
http://dcl.bas.bg/Corpus/copyright_en.html
Features of the Bulgarian corpus.
Each corpus sample (corpus unit, text sample) is an excerpt (or excerpts) from a text (or texts). Its length is nominally fixed at 2 000 words, but the precise number of words varies because the adopted methodology keeps sentence boundaries intact. The term 'corpus sample' and its synonyms refer to that part of any textual matter that is included in the corpus. The "Brown" Corpus of Bulgarian consists of 500 corpus samples totalling 1 001 286 words. Despite the intention to make each sample at least 2 000 words long, 136 samples contain fewer than 2 000 words.
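The sentence-boundary rule and the corpus totals can be illustrated with a minimal sketch. It is not the corpus authors' tool: the function name take_sample, the whitespace-based word count, and the toy input are assumptions made only for illustration. The sketch collects whole sentences until the running word count reaches the 2 000-word target, so a sample normally ends slightly above the target. The source does not say how the 136 shorter samples arose; in this sketch a shorter sample results only when the input text runs out first.

```python
def take_sample(sentences, target=2000):
    """Collect whole sentences until the running word count reaches `target`.

    Hypothetical helper: words are counted by whitespace splitting, which is
    an assumption, not the corpus's documented tokenization.
    """
    sample, count = [], 0
    for sentence in sentences:
        if count >= target:
            break  # target reached at a sentence boundary; stop here
        sample.append(sentence)
        count += len(sentence.split())
    return sample, count


# Toy input (made-up data): 300 sentences of 7 words each.
toy_sentences = ["w1 w2 w3 w4 w5 w6 w7"] * 300
sample, n_words = take_sample(toy_sentences)

# Mean sample size implied by the published totals: 1 001 286 words / 500 samples.
mean_sample_size = 1_001_286 / 500

print(len(sample), n_words)   # 286 sentences, 2002 words
print(mean_sample_size)       # about 2002.57 words per sample
```

On the toy input the sample stops after 286 sentences at 2 002 words, since cutting at 285 sentences would give only 1 995. The published totals imply a mean of about 2 002.57 words per sample, which is consistent with most samples slightly overshooting the target.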