http://193.133.140.102/JustTheWord/help.html
JustTheWord is a completely new kind of aid to help you with writing English.
If English is your first language, JustTheWord can help you express that elusive idea with le mot juste.
If you're learning English, JustTheWord can justify your choice of words or suggest improvements - and JustTheWord knows about the common errors made by speakers of your mother tongue (or will do, in the future).
When we write, we search our knowledge of words in two ways. We choose between words that mean similar things. A thesaurus gives us access to this sort of knowledge. But our choice constrains and is constrained by the other words in the sentence. We know, or need to know, which word combinations sound natural. A dictionary gives us access to some of this sort of knowledge.
Based on the latest advances in statistical linguistics, and exploiting Sharp's patented contextive technology, JustTheWord combines the advantages of thesaurus and dictionary, and enhances the usefulness of both.
By analysing a huge amount of English text, we've built up a highly detailed knowledge base of the word combinations whose mastery is at the heart of fluent English.
Type a word into the box and hit return or Show Combinations. JustTheWord will give you a detailed description of the company which that word keeps in modern-day English. To help you find your way to the information you need, in the right-hand frame you'll find the part(s)-of-speech and the types of relation that the word is found in. If you're looking for the right adjective to modify a noun you've chosen, click on the 'ADJ mod N*' link. If you want a verb with the noun as its object, follow the 'V obj N*' link. The star * marks your input, so you can tell the difference between, for instance, 'N* and N' and 'N and N*'. Within many types of relation you'll find the uses of the word clustered into groups with a similar meaning. The words that are not assigned to a cluster are grouped together at the end of the relation.
After each combination you'll find its frequency in our corpus, about 80,000,000 words of the British National Corpus (BNC). The green bar by each combination gives a measure of how strong it is. Technically, the bar indicates the t-score. A larger t-score means that the combination occurs more often than you'd expect given the frequency of the parts. You see only those combinations whose frequency and t-score exceed certain thresholds. These thresholds vary dynamically, as we try to show you something for every word but not too much for the commoner words.
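For the curious, here is a minimal sketch of how a t-score for a two-word combination is commonly computed in corpus linguistics: the observed frequency minus the frequency expected by chance, divided by the square root of the observed frequency. This page does not give the exact formula JustTheWord uses, so treat the formula, the function names and the fixed thresholds below as illustrative assumptions (the real thresholds vary dynamically).

```python
import math

def t_score(pair_count, count_a, count_b, corpus_size):
    """Common collocation t-score: (observed - expected) / sqrt(observed).

    Assumption: this is the textbook formulation, not necessarily
    the one JustTheWord implements.
    """
    # Frequency expected if the two words occurred independently.
    expected = count_a * count_b / corpus_size
    return (pair_count - expected) / math.sqrt(pair_count)

def visible_combinations(rows, corpus_size, min_freq=5, min_t=2.0):
    """Keep combinations whose frequency and t-score exceed thresholds.

    rows: iterable of (combination, pair_count, count_a, count_b).
    min_freq and min_t are hypothetical fixed values for illustration.
    """
    kept = []
    for combination, pair_count, count_a, count_b in rows:
        if pair_count < min_freq:
            continue
        t = t_score(pair_count, count_a, count_b, corpus_size)
        if t >= min_t:
            kept.append((combination, pair_count, t))
    # Strongest combinations first, like the longest green bars.
    return sorted(kept, key=lambda row: row[2], reverse=True)

# Made-up counts, for illustration only.
rows = [
    ("heavy rain", 100, 1000, 2000),
    ("green rain", 1, 5000, 2000),
]
result = visible_combinations(rows, corpus_size=80_000_000)
```

With these invented counts, 'heavy rain' is kept and 'green rain' is dropped because it falls below the frequency threshold.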
If you see an inflected form of a word in a combination (or a form preceded by '.'), this means that the combination includes this precise form. But if a word is in its dictionary form, it may need to be inflected. Nouns in their dictionary form might also need an article (and conversely, if an article is present, then the form of the noun will be as given). You can check the variability of a form, or any other properties of a combination, by clicking on the combination itself. This will show you some sentences from our corpus that have been analysed as containing examples of this combination.
If you want to find longer combinations involving two words, type the two words into the box and choose 'Show Combinations'.
If you type several words into the box and hit return or 'Suggest Alternatives', JustTheWord will give you an idea of how well these words go together - a red bar indicates that the combination is unlikely (the longer, the unlikelier) - and some suggestions for improvements. For each word in the input, JustTheWord will try replacing it with a related word, and show you the strength of the combination in the usual way (green bars). The blue bars represent the similarity between the original word and its replacement. This feature is still under development, and you may find it a bit slow. Don't type too many words, but do include articles ('a', 'the', etc.) as this will help the system to get the right answer.
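The replace-one-word-at-a-time idea can be sketched roughly as follows. This is not JustTheWord's actual code: the function name, the `related` table of replacements with similarity scores, and the `strength` scoring function are all hypothetical stand-ins for the thesaurus or learner-error data and the combination knowledge base.

```python
def suggest_alternatives(words, related, strength):
    """Try replacing each input word with a related word and score the result.

    words:    list of input words, e.g. ["strong", "rain"]
    related:  dict mapping a word to a list of (replacement, similarity)
              pairs (hypothetical stand-in for the confusable sets)
    strength: function from a tuple of words to a combination score
              (hypothetical stand-in for the t-score lookup)
    Returns (candidate, strength, similarity) triples, strongest first.
    """
    suggestions = []
    for i, word in enumerate(words):
        for replacement, similarity in related.get(word, []):
            # Swap in one replacement, leaving the other words untouched.
            candidate = tuple(words[:i] + [replacement] + words[i + 1:])
            suggestions.append((candidate, strength(candidate), similarity))
    suggestions.sort(key=lambda s: s[1], reverse=True)
    return suggestions

# Toy data, invented for illustration only.
related = {"strong": [("heavy", 0.8), ("powerful", 0.9)]}
scores = {("heavy", "rain"): 12.0, ("powerful", "rain"): -1.0}

def toy_strength(candidate):
    # Unknown combinations get a neutral score of 0.0.
    return scores.get(candidate, 0.0)

suggestions = suggest_alternatives(["strong", "rain"], related, toy_strength)
```

In this toy run, the strength score plays the part of the green (or red) bar and the similarity score plays the part of the blue bar.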
In the demo we've used two types of confusable. Choosing the 'Thesaurus' button will use only semantic proximity as a measure of confusability. Choosing 'Learner Errors' will use a corpus of actual learner errors of all types by speakers of various first languages. In the future, the learner errors category could be split up into different types of confusability relation, such as real-word spelling errors (phonological, orthographic and keyboarding confusables) and alternative translations in some context (bilingual confusables). Such sets could be further augmented with unattested but predictable confusables, and sub-classified into those typified by learners at certain levels or with certain L1s.
Please feel free to contact me, Pete Whitelock (pete@sharp.co.uk), if you want to ask a question, offer a suggestion, request a change, or just hurl gratuitous abuse.