Re: [Help] lexical density tools needed
You used ACWT properly but overestimated its capabilities. It does not
automatically count the numbers of function words and content words.
(I am not aware of any lexical tool that does this automatically, and what
counts as a function word or a content word has to be decided by the researcher.)
That said, it should not be terribly hard to work out the number of what you
consider content/function words in your corpus. Here are some suggestions
(a short Python sketch follows the list):
1) Use an English/Chinese POS tagger to tag your corpus first;
2) Use a program to search for and total the frequencies of the tags (not the
words) that mark function words under your definition;
3) Use the Ure/Stubbs method in ACWT to calculate the LD value.
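As promised above, here is a minimal Python sketch of steps 1) and 2), assuming an English corpus and NLTK's default Penn Treebank tagger (for a Chinese corpus you would substitute a Chinese tagger). The file name corpus.txt and the FUNCTION_TAGS set are illustrative only; which tags count as function-word tags is, as noted, the researcher's decision.

```python
# Steps 1) and 2): POS-tag the corpus, then count function-word tags.
# Requires NLTK plus its tokenizer and tagger models:
#   nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
from collections import Counter
import nltk

# Illustrative Penn Treebank tags treated as "function words" here --
# adjust this set to match your own definition.
FUNCTION_TAGS = {"DT", "IN", "CC", "TO", "MD", "EX", "PDT", "POS",
                 "PRP", "PRP$", "WDT", "WP", "WP$", "WRB", "RP"}

def tag_frequencies(text):
    """Step 1: tag the text; step 2: count the tags (not the words)."""
    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)            # [(word, tag), ...]
    return Counter(tag for _, tag in tagged), len(tokens)

with open("corpus.txt", encoding="utf-8") as f:   # hypothetical file name
    tag_counts, corpus_size = tag_frequencies(f.read())

function_count = sum(n for tag, n in tag_counts.items()
                     if tag in FUNCTION_TAGS)
print(function_count, corpus_size)
```

Note that len(tokens) here includes punctuation tokens; whether those belong in your corpus size is again your call.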
The reason for searching function-word tags in step 2) is that function words
tend to form a much more limited set than the content-word classes. But you
could count either class and use the total corpus size to work out the size
of the other.
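To make that last point concrete, here is a small self-contained sketch of the arithmetic: content words = corpus size minus function words, and Ure's LD is simply the content-word proportion expressed as a percentage. These are the two numbers (content-word count and corpus size) that ACWT's dialog box asks for in step 3).

```python
def lexical_density(function_count, corpus_size):
    """Ure-style lexical density: content words as a percentage of all
    tokens, where content = total - function (as described above)."""
    content_count = corpus_size - function_count
    return 100.0 * content_count / corpus_size

# e.g. 42,000 function-word tokens in a 100,000-token corpus:
print(lexical_density(42000, 100000))   # -> 58.0
```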
Quoting valeriazuo's post of 2005-9-19 21:17:10:
Mr. Xiao, thanks for your link. What a pity! I couldn't operate it well. I opened a text file and applied a tool, Calculate LD (a la Ure/Stubbs), to it, but there was no expected outcome. A dialog box popped up asking me to fill in the number of content words and the corpus size. In fact, I want it to count the number of content words and the size by itself, but how can I order the tool to do this job? Thanks a lot for your kind advice.