to 动态语法: I use the BNC as the control corpus for my comparative study. Since it's monstrously large, I randomly selected only 500 instances out of around 17,700 occurrences of the word under discussion. But all of these are processed by SARA, and the typicality of the randomly selected concordances is questionable. So far, I haven't found a way to copy the concordance lines from SARA into a Word file, so I'm sorry, I can't provide the raw data. The software I use to calculate the Z-score requires 5 pieces of information, viz. C1, the number of times the node word and the collocate co-occur; C2, the frequency of the collocate; S, which defaults to 10; Cs, the total number of word tokens in the corpus; and n, the frequency of the node word. It's a simple piece of software, actually, meant to save laborious manual effort.
to xiaoz and 动态语法: I'm using the BNC World Edition released in 2000. Let's take 'everyone' as an example: the collocate 'else' co-occurs 1153 times with 'everyone'; 'everyone' and 'else' occur 12786 and 19931 times respectively; S is the window span, which we set to 10 (5 left, 5 right); and the total number of words in the BNC is 100,000,000. The Z-score given by the BNC is 237.1, whereas putting all these data into the Z-score software CalcZ gives 150.3. That discrepancy is my question.
Besides, if I have the BNC select 500 concordances at random, 'else' co-occurs 36 times with 'everyone'; when I then calculate the Z-score within the downloaded lines only, the number I get is 37.3. Why is it such a far cry from the figure for the whole corpus, 237.1?
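For anyone who wants to check the arithmetic by hand, here is a minimal sketch of one common collocation Z-score, the Berry-Rogghe style formula. This is an assumption: the thread does not say which formula CalcZ or the BNC software actually implements, so the sketch need not reproduce either 150.3 or 237.1. The function name `z_score` is hypothetical, and its parameter names simply mirror the five inputs listed above. The second call is also an assumption: it reuses the whole-corpus figures for 'else' and the corpus size, because the corresponding counts within the 500 downloaded lines are not given.

```python
import math

def z_score(c1, c2, n, cs, s=10):
    """Berry-Rogghe style z-score (assumed formula, not confirmed for CalcZ or the BNC).

    c1: co-occurrences of node and collocate
    c2: frequency of the collocate
    n:  frequency of the node word
    cs: total number of tokens in the corpus
    s:  window span (tokens around the node that are counted)
    """
    p = c2 / (cs - n)        # chance of seeing the collocate at any non-node position
    expected = p * n * s     # expected co-occurrences inside the span
    return (c1 - expected) / math.sqrt(expected * (1 - p))

# Whole-corpus figures quoted above for 'everyone' + 'else'.
full = z_score(c1=1153, c2=19931, n=12786, cs=100_000_000)

# 500 sampled concordance lines; c2 and cs are borrowed from the whole
# corpus here, which is an assumption made only for illustration.
sample = z_score(c1=36, c2=19931, n=500, cs=100_000_000)

print(f"whole corpus:    {full:.1f}")
print(f"500-line sample: {sample:.1f}")
```

Under these assumptions the two results come out at roughly 223 and 35. With this formula the score grows with the number of node occurrences even when the rate of co-occurrence stays about the same, so a score from a 500-line sample and a score from the whole corpus are not directly comparable. That may account for part of the gap, though it does not explain the difference between the BNC figure and the CalcZ figure.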
Quoting ibid's post of 2005-10-9 14:19:56:
Another question: does anyone know how to save concordance lines in a Word file? At the moment I can only save the lines in XML format and then copy them into a Word file, but there are a lot of tags I need to get rid of.
It depends on what kind of concordancer you are using.