How do I use ConcGram in WordSmith 5.0?

Warning: I wrote this for my students about 8 months ago, based on the then-current version of WST5. I haven't used it for a while and have not tried it on any updated versions of WST5. If anything in it is out of date, please let me know.

-------

WordSmith Tools 5: WSConcGram User Guide – Hongyin Tao, UCLA 1/30/08

“For years it has been easy to search for or identify consecutive clusters (n-grams) such as AT THE END OF, MERRY CHRISTMAS or TERM TIME. It has also been possible to find non-consecutive linkages such as STRONG within the horizons of TEA by adapting searches to find context words. The concgram procedure takes a whole corpus of text and finds all sorts of combinations like the ones above, whether consecutive or not.” - WS Tools
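To make the idea of a non-consecutive linkage concrete, here is a minimal Python sketch that counts word pairs occurring within a few words of each other, whether adjacent or not. It is only an illustration of the concept in the quote: the function name, the `span` parameter, and the sample sentence are my own assumptions, and this is not how WSConcGram itself computes its results.

```python
from collections import Counter


def cooccurring_pairs(tokens, span=4):
    """Count unordered word pairs found within `span` tokens of each other.

    Illustrative only: a simple window-based count, not WordSmith's procedure.
    """
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look at the next `span` tokens to the right of the current one.
        for right in tokens[i + 1 : i + 1 + span]:
            if left != right:
                # Sort the pair so (tea, strong) and (strong, tea) are counted together.
                counts[tuple(sorted((left, right)))] += 1
    return counts


# Hypothetical sample text, already split into words.
tokens = "she likes strong tea and he likes tea that is very strong".split()
pairs = cooccurring_pairs(tokens, span=4)
print(pairs[("strong", "tea")])  # 2: once adjacent, once with three words in between
```

An n-gram search would only catch the adjacent "strong tea"; the window-based count also picks up "tea that is very strong".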

Computing ConcGram with Chinese Texts

1) Convert your text to Unicode (UTF-16), with word boundaries marked, i.e. the text segmented into words (skip this step if your texts are in English).

2) Run WS WordList and select Index (at the bottom): Make/Add to Index. Change the location and name of the index file if needed, save it, and make a note of both.

3) From the WS main menu, choose Utilities, then select WSConcGram.

4) In WSConcGram, choose File, New, Getting Started, and confirm the index file (usually the one just generated with WordList). Run Build (step 1); when it finishes, return to the same window and run Build (step 2).

5) Again in WSConcGram, choose File, Open, confirm the index file, then choose Show.

6) To look for ConcGrams, browse the list or type in "Word …" to search for specific items. Check or uncheck "As Tree" at the top right of the display window to change the display format.
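If you have no converter at hand for step 1, the conversion can be sketched in a few lines of Python. This is a rough sketch under stated assumptions: the source file is assumed to be UTF-8 (pass another codec name such as "gb18030" if yours differs), the text is assumed to be word-segmented already, and the function name is my own.

```python
from pathlib import Path


def convert_to_utf16(src, dst, src_encoding="utf-8"):
    """Re-save a text file as UTF-16.

    Assumes `src` is encoded in `src_encoding` and is already segmented
    into words; this function only changes the encoding.
    """
    text = Path(src).read_text(encoding=src_encoding)
    # Python's "utf-16" codec writes a byte order mark at the start of the file.
    Path(dst).write_text(text, encoding="utf-16")
```

For example, convert_to_utf16("corpus_utf8.txt", "corpus_utf16.txt") would write a UTF-16 copy that can then be indexed in step 2; both file names here are hypothetical.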

Some concrete examples ...
 
Thanks a lot, Dr. Tao.
 