[Download] kfNgram

xiaoz

Permanent super administrator
Staff member
Another powerful tool from Fletcher for extracting n-grams:

kfNgram is a free stand-alone Windows program for linguistic research which generates lists of n-grams in text and HTML files. Here n-gram is understood as a sequence of either n words, where n can be any positive integer, also known as lexical bundles, chains, wordgrams, and, in WordSmith, clusters, or else of n characters, also known as chargrams. When not further specified here, n-gram refers to wordgrams. kfNgram also produces and displays lists of "phrase-frames", i.e. groups of wordgrams identical but for a single word.

http://miniappolis.com/KWiCFinder/kfNgramHelp.html#Download
 
An n-gram is a string of n words. "according to" is a 2-gram; "you can read" is a 3-gram...
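To make the definition concrete, here is a minimal Python sketch of counting wordgrams and chargrams. It is not how kfNgram itself works internally; the function names `word_ngrams` and `char_ngrams`, the lowercasing, the whitespace tokenization, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter

def word_ngrams(text, n):
    """Count sequences of n consecutive words (wordgrams).

    Assumption: tokens are lowercased and split on whitespace only;
    a real tool would handle punctuation and markup more carefully.
    """
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def char_ngrams(text, n):
    """Count sequences of n consecutive characters (chargrams)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# Hypothetical sample text, not taken from any corpus.
sample = "you can read the help file and you can read the manual"
bigrams = word_ngrams(sample, 2)
trigrams = word_ngrams(sample, 3)

print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

In this sample, "you can" is counted as a 2-gram and "you can read" as a 3-gram, each occurring twice.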
 
Re: [Download] kfNgram

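Seen from today's perspective, kfNgram is nothing special: almost every concordancer can produce n-grams.
One small feature of kfNgram is still well worth attention, though: the phrase-frame.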
Phrase-Frames

To help the user discover additional linguistic patterns, kfNgram can produce lists of "phrase-frames", i.e. wordgrams which are identical except for a single word, as in the following example from the BNC written texts:

as * as the 4566 5
as well as the 2674
as far as the 874
as soon as the 652
as long as the 316
as much as the 50

The first line in each group shows the phrase-frame, with wildcard * standing for the word that differs in the variants. The second column in this line gives the total frequency of all variants, and the third column indicates the number of variants the phrase-frame has. Sets of phrase-frames and their variants are separated by a double set of carriage-return / line-feed pairs. While phrase-frame files are initially shown in the n-gram viewer window for quick verification, they are best studied with the phrase-frame browser (Tools menu / Ctrl-B). The phrase-frame feature was added at the suggestion of Prof. Michael Stubbs of the University of Trier and is based on a concept first developed by his graduate student Isabel Barth.
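The grouping described above can be sketched in Python as follows. This is only an illustration of the idea, not kfNgram's actual algorithm: the function name `phrase_frames`, the rule that a frame needs at least two variants, and the sort order are assumptions. The input counts are the five variants quoted in the example above.

```python
from collections import defaultdict

def phrase_frames(ngram_counts):
    """Group wordgrams that are identical except for a single word.

    ngram_counts maps each wordgram (a space-separated string) to its
    frequency. Returns a list of (frame, total_frequency, variants),
    where variants is a list of (wordgram, frequency) pairs.
    """
    frames = defaultdict(dict)
    for ngram, freq in ngram_counts.items():
        words = ngram.split()
        # Replace each position in turn with the wildcard "*".
        for i in range(len(words)):
            frame = " ".join(words[:i] + ["*"] + words[i + 1:])
            frames[frame][ngram] = freq

    result = []
    for frame, variants in frames.items():
        # Assumption: a frame is only of interest if it has 2+ variants.
        if len(variants) > 1:
            ranked = sorted(variants.items(), key=lambda kv: (-kv[1], kv[0]))
            result.append((frame, sum(variants.values()), ranked))
    # Most frequent frames first.
    result.sort(key=lambda r: (-r[1], r[0]))
    return result

# The variant frequencies quoted in the example above.
counts = {
    "as well as the": 2674,
    "as far as the": 874,
    "as soon as the": 652,
    "as long as the": 316,
    "as much as the": 50,
}

frames = phrase_frames(counts)
for frame, total, variants in frames:
    print(frame, total, len(variants))
    for ngram, freq in variants:
        print(ngram, freq)
    print()  # blank line between sets
```

With these five 4-grams as input, the only frame with more than one variant is "as * as the", with a total frequency of 4566 across 5 variants, which matches the first line of the quoted listing.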
 