Lexical coverage of spoken discourse

Thanks for the article, Richard.

Maybe it would be a good idea to do similar research on the lexical coverage of spoken discourse in Chinese.

What do you think?
 
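What counts as spoken vocabulary has always been an open question. I once used the London Lund Corpus and the Spoken English Corpus to generate two wordlists, and found that the top-ranked words came in much the same order as they do in written language.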
 
Agree with Ocean that this is a great topic.
I'm afraid wordlists are not a solution in this case. In all registers and genres, function words such as "the" and "of" will sit at the top of the frequency lists. The key lies in keywords and key keywords (WordSmith).

[This post was edited by the author on 2005-06-21 at 17:43:12]
 
Re: Lexical coverage of spoken discourse

"Afraid wordlists are not a solution in this case. In all registers and genres, function words such as the and of will sit on top of the frequency lists. The key lies in keywords and key keywords (WordSmith)."

How can such (key) keywords be worked out in addition to using a stop list of function words?
 
A stop list might help in this respect. In theory, a combination of a stop list and (key) keywords is possible.

Load the stop list (Adjust settings - Lemma, match and stop lists) before making a wordlist for the research corpus and the reference corpus. Then extract (key) keywords as usual.
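
For anyone who wants to see the same idea outside WordSmith, here is a rough Python sketch. It is not WordSmith's implementation: the function names, the tiny stop list and the toy sentences are all made up for illustration. It assumes keyness is scored with log-likelihood (one statistic commonly used for this) and that corpus sizes are counted after the stop list has been applied.

```python
import math
from collections import Counter

def wordlist(tokens, stop_words):
    """Frequency list with stop-listed words left out (hypothetical helper)."""
    return Counter(t.lower() for t in tokens if t.lower() not in stop_words)

def log_likelihood(a, b, n1, n2):
    """Log-likelihood for a word seen a times in n1 tokens and b times in n2 tokens."""
    e1 = n1 * (a + b) / (n1 + n2)  # expected count in the research corpus
    e2 = n2 * (a + b) / (n1 + n2)  # expected count in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(research, reference, min_ll=3.84):
    """Positive keywords of the research wordlist, highest keyness first.

    min_ll=3.84 is the chi-square critical value for p < 0.05 (1 d.f.).
    Assumption: corpus sizes n1 and n2 are totals after stop-listing.
    """
    n1, n2 = sum(research.values()), sum(reference.values())
    scored = []
    for word, a in research.items():
        b = reference.get(word, 0)
        ll = log_likelihood(a, b, n1, n2)
        # keep only words that are relatively more frequent in the research corpus
        if ll >= min_ll and a / n1 > b / n2:
            scored.append((word, ll))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data, invented for illustration only
stop = {"the", "of", "and", "a", "to"}
spoken = wordlist("well you know the thing is you know I mean the point of it".split(), stop)
written = wordlist("the analysis of the data and the results of the study show a point".split(), stop)

# min_ll=0.0 because the toy corpora are far too small to reach significance
for word, score in keywords(spoken, written, min_ll=0.0):
    print(f"{word}\t{score:.2f}")
```

Key keywords could then be approximated by running the same comparison text by text and counting in how many texts each word turns up as key, though that step is not shown here.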
 