The following discussion from the corpora list makes the point very well. Reposted below:
On 2/5/2011 5:14 AM, Andrea Nini wrote:
> 1) is it possible to calculate the factor scores of a new text using
> the factors that Biber used for his study?
Absolutely, but see below.
> 2) would it affect the results the fact that my texts have to be
> normalised to 100 words whereas Biber's texts were normalised to 1000
> words?
In principle yes, but see below.
> 3) when calculating the factor scores for my texts, what means should
> I consider? The ones taken from my dataset or the ones taken from
> Biber's study?
The ones from your dataset, absolutely, but...
Multidimensional analysis is exciting, but there are significant
problems. The main one that I found is that Biber did not use
per-choice frequencies, so the co-occurrences he identified could have
been due to grammar. In fact, you could interpret his Dimension 1 as
simply "nouns vs. verbs" and Dimension 2 as "past vs. present." I tried
to use the envelope of variation to counteract this, but I was not
successful. I discussed this more in a paper at the 2005 AACL:
http://www.grieve-smith.com/Academic/AAACL-grvsmth.060225.pdf
--
-Angus B. Grieve-Smith
Saint John's University
grvsmth@panix.com
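
To make questions 2 and 3 concrete, here is a minimal sketch of a dimension score as it is commonly described for multidimensional analysis: normalise each feature count to a rate, standardise the rates against the mean and standard deviation of your own dataset, then sum the z-scores of the positively loading features and subtract those of the negatively loading ones. The feature names, counts, and the choice of which feature loads positively or negatively are hypothetical, and the function names are illustrative only; none of this is taken from Biber's study or from the message above.

```python
from statistics import mean, stdev


def normalise(count, n_words, per=1000):
    """Rate of a feature per `per` words of text."""
    return count * per / n_words


def z_scores(values):
    """Standardise against the mean and sample SD of this dataset."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s if s else 0.0 for v in values]


def dimension_scores(rates, positive, negative):
    """Sum z-scores of positive features, subtract those of negative ones.

    `rates` maps a feature name to one normalised rate per text.
    """
    z = {feature: z_scores(values) for feature, values in rates.items()}
    n_texts = len(next(iter(rates.values())))
    return [
        sum(z[f][i] for f in positive) - sum(z[f][i] for f in negative)
        for i in range(n_texts)
    ]


# Hypothetical toy data: (raw count, text length in words) for three texts.
counts = {
    "nouns": [(30, 120), (18, 100), (45, 150)],
    "past_tense_verbs": [(6, 120), (9, 100), (3, 150)],
}


def rates_per(per):
    """Normalised rates for every feature at the given base."""
    return {
        feature: [normalise(c, n, per) for c, n in pairs]
        for feature, pairs in counts.items()
    }


# Assumed loadings, for illustration only.
scores_100 = dimension_scores(rates_per(100), ["nouns"], ["past_tense_verbs"])
scores_1000 = dimension_scores(rates_per(1000), ["nouns"], ["past_tense_verbs"])
```

In this sketch the two score lists agree up to floating-point error, because a constant rescaling of the rates cancels out once they are standardised against the dataset's own mean and standard deviation. The normalisation base would matter if the z-scores were instead computed against means and standard deviations published for rates per 1000 words.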