Re: MI, MI3, T-score, and Z-score calculation tool made in Excel
Nice tools. Thanks for sharing.
Some things to consider:
1. Using the formulas for T-score and Z-score based on 《导论》 (the "Introduction"
textbook) may not be the best choice. For example, the way 《导论》 calculates the
probability of the collocate, as indicated below,
Probability of the collocate, p(c)
BNCweb formula: collocate probability = collocate frequency / (corpus size - node frequency)
《导论》 formula: collocate probability = collocate frequency / corpus size
seems to be flawed. The probability should measure how likely the collocate is to
appear with words other than the node, not simply with everything in the corpus.
That's why BNCweb subtracts the node frequency from the corpus size in the
denominator. IMHO, treating p(c) as simply collocate frequency / corpus size, as the
《导论》 formula does, doesn't really have sound logic behind it.
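To see how much the two denominators differ in practice, here is a minimal Python sketch of both versions of p(c). The function names and the counts are made up for illustration; only the two formulas themselves come from the comparison above.

```python
def p_collocate_daolun(collocate_freq, corpus_size):
    """p(c) as in the 《导论》 formula: collocate frequency / corpus size."""
    return collocate_freq / corpus_size


def p_collocate_bncweb(collocate_freq, corpus_size, node_freq):
    """p(c) as in the BNCweb formula: the node's own tokens are
    removed from the denominator."""
    return collocate_freq / (corpus_size - node_freq)


# Hypothetical counts, chosen only for illustration.
corpus_size = 1_000_000
node_freq = 5_000
collocate_freq = 1_000

p_daolun = p_collocate_daolun(collocate_freq, corpus_size)
p_bncweb = p_collocate_bncweb(collocate_freq, corpus_size, node_freq)

print(f"《导论》 p(c): {p_daolun:.8f}")
print(f"BNCweb p(c):  {p_bncweb:.8f}")
print(f"relative difference: {(p_bncweb - p_daolun) / p_daolun:.4%}")
```

The gap is small for a rare node but grows with the node frequency, so the two formulas diverge most for high-frequency node words.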
2. A similar statement could be made about 《导论》's choice of the window span. A
number such as 4 doesn't really constitute a 'window'; it's only half of it, since
a span of 4 words on each side of the node gives a window of 8 positions.
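A rough Python sketch of why the span value matters. It assumes the expected co-occurrence frequency is computed as p(c) * node frequency * window size and that the T-score is (observed - expected) / sqrt(observed); these are common formulations, not necessarily the ones used in the Excel tool. All names and counts are hypothetical.

```python
import math


def expected_cooccurrences(p_collocate, node_freq, window_size):
    """Expected co-occurrences, assuming E = p(c) * node frequency * window size."""
    return p_collocate * node_freq * window_size


def t_score(observed, expected):
    """T-score, assuming the common form (O - E) / sqrt(O)."""
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts, chosen only for illustration.
p_collocate = 0.001
node_freq = 5_000
observed = 60

half_window = 4      # span on one side of the node only
full_window = 4 + 4  # 4 words to the left plus 4 words to the right

e_half = expected_cooccurrences(p_collocate, node_freq, half_window)
e_full = expected_cooccurrences(p_collocate, node_freq, full_window)

print(f"window = {half_window}: E = {e_half:.1f}, T = {t_score(observed, e_half):.3f}")
print(f"window = {full_window}: E = {e_full:.1f}, T = {t_score(observed, e_full):.3f}")
```

Under these assumptions, using 4 instead of 8 halves the expected frequency and so inflates the score, which is one reason results from different tools may not match.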
3. You didn't specify which formulas you used, so my comments may be off the mark.
But this information can be useful for users when comparing results and clearing
up any confusion.
But none of this should in any way decrease the value of this nice Excel
implementation of the formulas.