hancunxin asked in the QQ group how Factor Analysis differs from Cluster Analysis.
I found the following passages, which together roughly mark out the difference between the two.
Good luck!
0. http://www.siu.edu/~epse1/pohlmann/factglos/
factor analysis; a statistical technique used to (1) estimate factors or latent variables, or (2) reduce the dimensionality of a large number of variables to a fewer number of factors.
cluster analysis; a collection of statistical techniques for creating homogeneous groups of cases or variables. Clusters are formed using distance functions. The elements in a cluster have relatively small distances from each other and relatively larger distances from elements outside of a cluster.
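To make the distance idea in the cluster analysis definition concrete, here is a minimal Python sketch. The six points and their cluster labels are invented for illustration, and Euclidean distance is only one of many possible distance functions; nothing here comes from the glossary itself.

```python
import math
from itertools import combinations

# Hypothetical 2-D cases with made-up cluster labels (illustration only).
points = {
    "a": (1.0, 1.0), "b": (1.5, 1.2), "c": (0.8, 1.4),  # cluster 0
    "d": (6.0, 6.5), "e": (6.4, 5.9), "f": (5.8, 6.1),  # cluster 1
}
labels = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def mean_distances(points, labels):
    """Return (mean within-cluster distance, mean between-cluster distance)."""
    within, between = [], []
    for i, j in combinations(points, 2):
        d = euclidean(points[i], points[j])
        if labels[i] == labels[j]:
            within.append(d)
        else:
            between.append(d)
    return sum(within) / len(within), sum(between) / len(between)


within_mean, between_mean = mean_distances(points, labels)
print("mean within-cluster distance: ", within_mean)
print("mean between-cluster distance:", between_mean)
```

For a sensible grouping the first number should come out clearly smaller than the second, which is what "relatively small distances from each other" means in practice.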
1. http://www.answers.com/topic/cluster-analysis
Cluster analysis is the obverse of factor analysis. Whereas factor analysis reduces the number of variables by grouping them into a smaller set of factors, cluster analysis reduces the number of observations or cases by grouping them into a smaller set of clusters.
2. http://comp9.psych.cornell.edu/Darlington/factor.htm
Factor Analysis Versus Clustering and Multidimensional Scaling
While factor analysis is typically applied to a correlation matrix, those other methods can be applied to any sort of matrix of similarity measures, such as ratings of the similarity of faces. But unlike factor analysis, those methods cannot cope with certain unique properties of correlation matrices, such as reflections of variables. For instance, if you reflect or reverse the scoring direction of a measure of "introversion", so that high scores indicate "extroversion" instead of introversion, then you reverse the signs of all that variable's correlations: -.36 becomes +.36, +.42 becomes -.42, and so on. Such reflections would completely change the output of a cluster analysis or multidimensional scaling, while factor analysis would recognize the reflections for what they are; the reflections would change the signs of the "factor loadings" of any reflected variables, but would not change anything else in the factor analysis output.
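The reflection point is easy to check numerically. The sketch below uses invented scores on an assumed 0 to 10 scale (the variable names and data are hypothetical, not from the quoted page). It shows that reversing one variable flips the sign of its correlations while leaving their magnitudes alone, and that a generic dissimilarity such as 1 - r, one common choice when clustering variables, changes as a result.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical scores on an assumed 0-10 scale (illustration only).
introversion = [2, 4, 5, 7, 9, 10]
shyness = [1, 3, 6, 6, 8, 9]
party_going = [9, 8, 6, 5, 2, 1]

# Reflect the scale: high scores now indicate extroversion.
extroversion = [10 - s for s in introversion]

r_shy_before = pearson(introversion, shyness)
r_shy_after = pearson(extroversion, shyness)
r_party_before = pearson(introversion, party_going)
r_party_after = pearson(extroversion, party_going)

print("r with shyness:     ", r_shy_before, "->", r_shy_after)
print("r with party_going: ", r_party_before, "->", r_party_after)

# A method that treats r as a generic similarity sees a different picture
# after reflection, because the dissimilarity 1 - r is not sign-invariant.
print("1 - r with shyness: ", 1 - r_shy_before, "->", 1 - r_shy_after)
```

Only the sign changes, which is why the quoted passage says factor analysis merely flips the sign of the reflected variable's loadings.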
Another advantage of factor analysis over these other methods is that factor analysis can recognize certain properties of correlations. For instance, if variables A and B each correlate .7 with variable C, and correlate .49 with each other, factor analysis can recognize that A and B correlate zero when C is held constant because .7² = .49. Multidimensional scaling and cluster analysis have no ability to recognize such relationships, since the correlations are treated merely as generic "similarity measures" rather than as correlations.
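The "correlate zero when C is held constant" claim can be verified with the standard first-order partial correlation formula, r_AB.C = (r_AB - r_AC * r_BC) / sqrt((1 - r_AC^2)(1 - r_BC^2)). The function name below is my own; the three input values are the ones from the quoted example, where .7 times .7 equals .49.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """First-order partial correlation of A and B, holding C constant."""
    numerator = r_ab - r_ac * r_bc
    denominator = math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))
    return numerator / denominator


# Values from the quoted example: A and B each correlate .7 with C
# and .49 with each other.
r_ab_given_c = partial_correlation(0.49, 0.7, 0.7)
print(r_ab_given_c)  # zero up to floating-point rounding
```

The numerator is .49 - .7 * .7, so the whole of the A-B correlation is accounted for by C. A method that only sees .49 as a "similarity" has no way to notice this.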
We are not saying these other methods should never be applied to correlation matrices; sometimes they yield insights not available through factor analysis. But they have definitely not made factor analysis obsolete. The next section touches on this point.
...
I don't mean to imply that you should always try to make every variable load highly on only one factor. For instance, a test of ability to deal with arithmetic word problems might well load highly on both verbal and mathematical factors. This is actually one of the advantages of factor analysis over cluster analysis, since you cannot put the same variable in two different clusters.
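A small sketch of that last point, with a made-up loading matrix. The test names, the loading values, and the 0.40 cutoff are all assumptions for illustration, and picking the largest loading is only a crude stand-in for a hard cluster assignment, not a real clustering algorithm.

```python
# Hypothetical factor loadings, invented for illustration.
loadings = {
    "vocabulary":    {"verbal": 0.80, "math": 0.10},
    "algebra":       {"verbal": 0.05, "math": 0.85},
    "word_problems": {"verbal": 0.60, "math": 0.65},
}

THRESHOLD = 0.40  # assumed cutoff for calling a loading "high"


def salient_factors(row, threshold=THRESHOLD):
    """All factors on which a variable loads highly (may be more than one)."""
    return sorted(f for f, value in row.items() if abs(value) >= threshold)


def hard_cluster(row):
    """Force the variable into exactly one group: the largest absolute loading."""
    return max(row, key=lambda f: abs(row[f]))


for name, row in loadings.items():
    print(name, "| factors:", salient_factors(row), "| single cluster:", hard_cluster(row))
```

The factor view keeps word_problems on both the verbal and the math factor, while the one-group-only view has to drop one of them.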
3. http://www.qmethod.org/Issues/cluster_vs_Q.htm
On a more practical level, “Factor analysis has an underlying theoretical model, while cluster analysis is more ad hoc” (SPSS Manual, 1999, p.293).