Questions about log-likelihood

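Apart from indicating a word's keyness, as the chi-square value does, what else is the log-likelihood value used for? When I ran an MI analysis in WS, the statistics reported were MI, Z, MI3 (I don't understand why there is an MI3 at all, or how it actually differs from MI), Log L, and T. What data is Log L computed from here? And since Log L is reported but chi-square is not, what is the difference between the two?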
 
Re: Questions about log-likelihood

In computing collocations, the MI score, like the z-score, gives too much weight to rare words. There is a way of rebalancing the MI score to address this problem by giving more weight to frequent words and less to infrequent words. The MI3 score was developed for just this purpose. MI3 achieves this effect by ‘cubing’ observed frequencies (cf. Oakes 1998: 171-172). The cubing of the frequencies gives a much bigger boost to high frequencies than low frequencies, thus achieving the desired effect.
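To make the effect of cubing concrete, here is a minimal sketch in Python. It assumes the usual definitions MI = log2(O/E) and MI3 = log2(O³/E), with the expected frequency E estimated as f(node) × f(collocate) × span / N. The function names, the span of 1, and all counts below are made up for illustration; they are not taken from WS or from any real corpus, and a given tool may compute E differently.

```python
import math

def expected(f_node, f_colloc, n_tokens, span=1):
    """Expected co-occurrence frequency under independence (assumed formula)."""
    return f_node * f_colloc * span / n_tokens

def mi(observed, exp):
    """Mutual information: log2 of observed over expected frequency."""
    return math.log2(observed / exp)

def mi3(observed, exp):
    """MI3: same as MI, but the observed frequency is cubed first."""
    return math.log2(observed ** 3 / exp)

# Hypothetical counts in a 1,000,000-token corpus.
N = 1_000_000
# A rare collocate: occurs only twice in the corpus, both times next to the node.
rare_obs, rare_exp = 2, expected(1000, 2, N)
# A frequent collocate: occurs 2000 times, 200 of them next to the node.
freq_obs, freq_exp = 200, expected(1000, 2000, N)

mi_rare, mi_freq = mi(rare_obs, rare_exp), mi(freq_obs, freq_exp)
mi3_rare, mi3_freq = mi3(rare_obs, rare_exp), mi3(freq_obs, freq_exp)

print(f"MI : rare={mi_rare:.2f}  frequent={mi_freq:.2f}")
print(f"MI3: rare={mi3_rare:.2f}  frequent={mi3_freq:.2f}")
```

With these invented counts, plain MI ranks the rare pair above the frequent one, while MI3 reverses the order, because cubing 200 adds far more to the score than cubing 2.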
 
Re: Questions about log-likelihood

More related resources
Evert, Stefan (2008). Corpora and collocations. In A. Lüdeling and M. Kytö (eds.), Corpus Linguistics. An International Handbook, article 58. Mouton de Gruyter, Berlin.
http://purl.org/stefan.evert/PUB/Evert2007HSK_extended_manuscript.pdf

Baroni, Marco and Evert, Stefan (2008). Statistical methods for corpus exploitation. In A. Lüdeling and M. Kytö (eds.), Corpus Linguistics. An International Handbook, article 36. Mouton de Gruyter, Berlin.
http://purl.org/stefan.evert/PUB/BaroniEvertHSK38_manuscript.pdf

Collocation reading lists:
http://www.cambridge.org/assets/elt/nation/categorizedbibliography5.doc
http://juppiter.fltr.ucl.ac.be/fltr/germ/etan/bibs/corpling/COLLOC.TXT
 
Re: Questions about log-likelihood

That's helpful to me as well. By the way, which book is the unit you posted taken from? Would you be kind enough to share the other units of that book, if possible? Many thanks!
 