


We have just released an alternative interface for the COCA data -- the first completely new website for the corpus in four years. Unlike the standard COCA interface, everything at the new website is frequency-based. Users can browse through the entire frequency listing (words 1-60,000). Then, for any word that they select, they can see the definition, collocates, concordance lines, synonyms, and WordNet entries -- all on one screen, with extensive links from one word to another.

In addition, in the next month or two we'll be releasing two more related resources. The first will allow you to input a text (e.g. a newspaper article or a paper that you've written); it will then analyze the text by frequency and suggest alternatives for words and phrases (based on COCA data). The second resource is a special version of www.wordandphrase.info -- oriented to English for Academic Purposes (EAP) and based on the 85 million words of academic texts in COCA.

Re: Prof. Davies launches a word-frequency lookup website based on the COCA corpus


We have just released an important new interface for the Corpus of Contemporary American English (COCA):


Even more so than the standard COCA interface (which will continue to be available), the new website is designed to provide information on nearly everything that you might want to know about a word and its usage -- all on one screen. Users can look for specific words or browse through the entire frequency listing (words 1-60,000). And then for any matching words, they can see:

-- the definition(s) of the word
-- the overall frequency in the 425 million word corpus, and its rank (1-60,000)
-- the frequency in each of the five main genres -- spoken, fiction, magazines, newspapers, and academic
-- 20-30 collocates (nearby words), which provide useful insight into meaning and usage
-- 200 concordance lines (re-sortable), which provide insight into the patterns in which the word occurs
-- synonyms (grouped by meaning and sorted by frequency); you can click to see the entries for related words
-- WordNet entries, showing related words with a more specific or a more general meaning

As noted, all of this information is displayed together on one screen, with extensive links from one word to another (which allow you to compare words in many useful ways). If you are interested in English words, their frequency, their meaning, their relationship to related words, and the patterns in which a word occurs, we believe that this new resource will be invaluable for you in your teaching, learning, and research. And as always, it is available for free.

Finally, we might note that in the next month or two we'll be releasing two more related resources. The first will allow you to input a text (e.g. a newspaper article or a paper that you've written); it will then analyze the text by frequency and suggest alternatives for highlighted words and phrases (based on COCA data). The second resource is a special version of www.wordandphrase.info -- oriented to English for Academic Purposes (EAP) and based on the 85 million words of academic texts in COCA. We'll let you know about these as they become available.


Mark Davies
Brigham Young University
Re: Prof. Davies launches a word-frequency lookup website based on the COCA corpus

Or submit the problematic word directly on the website (click BAD ENTRY).
Re: Prof. Davies launches a word-frequency lookup website based on the COCA corpus


We've added a new feature at www.wordandphrase.info -- the alternative interface for COCA. You can now input an entire text -- maybe a newspaper article that you've copied from a website, or something you've written -- and it will then give you detailed information about the words and phrases in the text. There's now no need to copy and paste individual words and phrases into the regular COCA interface -- just work seamlessly from your original text.

First, it will highlight all of the medium and lower-frequency words in your text (based on frequency data from COCA), and create lists of these words that you can use offline. This frequency data can help language learners focus on new words, and it can allow you to see "what the text is about" (i.e. text-specific words). You can also have it show you the "academic" words in your text (again, based on COCA data).
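The highlighting step described above can be pictured as a lookup of each word's frequency rank followed by a band assignment. The following is only a minimal sketch, not the site's actual implementation: the rank table `TOY_RANKS` holds made-up placeholder values (not real COCA ranks), and the cutoffs `HIGH_MAX` and `MID_MAX` as well as the function names are assumptions chosen for illustration.

```python
import re

# Hypothetical rank table (word -> frequency rank). These values are
# placeholders for illustration and are NOT real COCA ranks.
TOY_RANKS = {"the": 1, "of": 2, "argument": 900, "potent": 7000}

# Assumed band boundaries; the announcement does not state the real cutoffs.
HIGH_MAX = 500
MID_MAX = 3000


def band_for(word, ranks=TOY_RANKS):
    """Return 'high', 'medium', 'low', or 'unlisted' for a single word."""
    rank = ranks.get(word.lower())
    if rank is None:
        return "unlisted"
    if rank <= HIGH_MAX:
        return "high"
    if rank <= MID_MAX:
        return "medium"
    return "low"


def words_to_review(text, ranks=TOY_RANKS):
    """Collect the medium- and low-frequency words of a text, without repeats."""
    result = {"medium": [], "low": []}
    for token in re.findall(r"[A-Za-z']+", text):
        word = token.lower()
        band = band_for(word, ranks)
        if band in result and word not in result[band]:
            result[band].append(word)
    return result
```

Under these toy values, calling `words_to_review("The potent argument of the argument")` would put "argument" in the medium list and "potent" in the low list, which is the kind of word list a learner could then study offline.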

Second, you can click on any word in your text to get detailed information about the word (all on one screen) -- its overall frequency in COCA, its frequency in each genre (spoken, fiction, magazine, newspaper, and academic), the 20-30 most frequent collocates (nearby words), up to 200 sample concordance lines, synonyms, and related words from WordNet. There's no need to go consult other dictionaries or thesauruses or online-resources -- it's all right there, with just one click for each and every word in your text.

Finally, you can also see detailed information about phrases in your text. Just click on a phrase in the text, and it will show you related phrases from COCA. For example, if you're writing a paper and have used the phrase "potent argument", you could click on that phrase and then have it suggest related phrases based on COCA data -- in this case, phrases where a synonym of "potent" is followed by "argument". It would list "strong / persuasive / convincing argument" (all of which are more common in COCA). It will show you the frequency of each phrase in COCA, and you can click on any of these to see them in context in the corpus. In this way, it serves as a sort of "grammatical thesaurus" to find just the right phrase in English.
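One way to think about this phrase lookup is as synonym substitution followed by a sort on phrase frequency. The sketch below is an assumption about the general idea, not a description of how the site works internally: `TOY_SYNONYMS` and `TOY_PHRASE_COUNTS` contain invented placeholder counts (not COCA figures), and `related_phrases` is a hypothetical helper name.

```python
# Hypothetical synonym list and phrase counts. The numbers are placeholders
# for illustration and are NOT taken from COCA.
TOY_SYNONYMS = {"potent": ["strong", "persuasive", "convincing"]}
TOY_PHRASE_COUNTS = {
    ("potent", "argument"): 1,
    ("strong", "argument"): 30,
    ("persuasive", "argument"): 20,
    ("convincing", "argument"): 10,
}


def related_phrases(first, second,
                    synonyms=TOY_SYNONYMS, counts=TOY_PHRASE_COUNTS):
    """Swap the first word for each of its synonyms, keep the phrases that
    occur at least once, and return them sorted by descending count."""
    candidates = [first] + synonyms.get(first, [])
    scored = [(f"{word} {second}", counts.get((word, second), 0))
              for word in candidates]
    attested = [pair for pair in scored if pair[1] > 0]
    return sorted(attested, key=lambda pair: pair[1], reverse=True)
```

With the placeholder counts above, `related_phrases("potent", "argument")` ranks "strong argument" first and "potent argument" last, mirroring the announcement's point that the alternatives are more common than the original phrase.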

All of this is now available at http://www.wordandphrase.info/, along with the features that were there before, including the ability to browse through and search a huge frequency dictionary of English and see detailed information about any word. If you are interested in English words and phrases, their meaning, their frequency, and their distribution in different genres, we believe that this will be an exciting new resource. And as with all of our corpora, it is available for free.


Mark Davies
Brigham Young University


But I think the most impressive sentence is "There's no need to go consult other dictionaries or thesauruses or online-resources -- it's all right there, with just one click for each and every word in your text."

It also defines the site as a "grammatical thesaurus".
