Re: Is Biber's multidimensional analysis = data mining & text clustering analysis?
MD analysis examines a large number of observable linguistic features in search of a small number of unobservable, underlying constructs, or factors. The factors are called dimensions because each represents a continuous scale. Just as different scales in a physical check-up measure different aspects of the patient's physical condition (blood pressure, pulse, weight, body temperature, etc.), the dimensions here measure different aspects of language use in texts. Each dimension usually has two complementary sets of linguistic features that co-occur in texts. The presence (in varying degrees) of the positive-loading features means the absence (in varying degrees) of the negative-loading features, or the other way around.
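To make the "continuous scale" idea concrete, here is a minimal sketch of how a text's score on one dimension is usually computed, as I understand the procedure: standardize each feature's frequency across the corpus, then add up the z-scores of the positive-loading features and subtract those of the negative-loading features. The function name, the column assignments, and the toy frequencies are all hypothetical and are not taken from any actual study.

```python
import numpy as np

def dimension_scores(counts, positive, negative):
    """Score each text on one dimension.

    counts:   (n_texts, n_features) array of normalized feature frequencies
    positive: column indices of the positive-loading features
    negative: column indices of the negative-loading features
    """
    counts = np.asarray(counts, dtype=float)
    std = counts.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid dividing by zero for constant features
    z = (counts - counts.mean(axis=0)) / std  # z-score per feature
    return z[:, positive].sum(axis=1) - z[:, negative].sum(axis=1)

# Hypothetical rates per 1,000 words for four texts and four features.
# Columns 0-1 are assumed to load positively, columns 2-3 negatively.
toy_counts = np.array([
    [60.0, 20.0, 150.0,  5.0],
    [45.0, 15.0, 180.0,  9.0],
    [20.0,  5.0, 230.0, 14.0],
    [10.0,  2.0, 260.0, 18.0],
])
scores = dimension_scores(toy_counts, positive=[0, 1], negative=[2, 3])
print(scores)
```

A text rich in the positive features and poor in the negative ones lands at the high end of the scale, and the reverse pattern lands at the low end, which is the complementary relationship described above.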
Central to MD analysis is the statistical procedure of factor analysis. Ideally, the dimensions extracted should explain a large portion of the total shared variance. However, neither Biber himself nor his followers have been successful in this regard. In fact, few MD studies have yielded dimensions that can account for more than 50% of the variance. For example, the seven dimensions in Kanoksilapatham (2007) account for only 33.5% of the total variance.
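For anyone who wants to see where a "percentage of variance" figure comes from, here is a rough sketch. Note the assumption: it uses the eigenvalues of the feature correlation matrix (principal-component style extraction) as a stand-in, so it reports a share of the total variance of the standardized features, not the shared-variance figure that a proper factor analysis with rotation would give. The function name and the simulated data are hypothetical.

```python
import numpy as np

def variance_explained(data, n_factors):
    """Share of total standardized variance captured by the first n_factors
    components of the feature correlation matrix (a PCA-style approximation,
    not a full factor analysis)."""
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)     # features are columns
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # eigenvalues, largest first
    return eigvals[:n_factors].sum() / eigvals.sum()

# Simulated corpus: 200 texts, 10 features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 10))
toy = latent @ loadings + rng.normal(scale=2.0, size=(200, 10))

share = variance_explained(toy, n_factors=2)
print(f"{share:.1%} of the variance captured by 2 components")
```

The noisier the features are relative to the latent factors, the smaller this share becomes, which is one plausible reading of why figures well under 50% keep turning up.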
I have a problem with MD analysis chiefly because I don't see how the results can be put to practical use. I'm not satisfied with implications that are discussed only in general terms. It seems to me that many MD studies were carried out just for fun.