Re: Using AntConc to process Chinese: concordance, wordlist, N-gram
Quoting laohong's post of 2006-3-30 11:55:18:
Quoting 动态语法's post of 2006-3-29 15:28:58:
... I have had numerous discussions with him about code names; apparently this is the best that can be done at this point...
Basically, my test showed that this tiny program works very well with Chinese texts, though it is a pity that the KWIC concordances are not nicely presented. Can you also ask him to add an option for saving the concordance results? Something similar to Wconcord's "Save with delimiters":
With the delimiters saved, the concordance result looks as follows:
Then we can use regular expressions to replace every "|" with a Tab, and every "[" with a Tab followed by "[". The result can then be opened in Excel as three columns. Re-sorting in Excel is of course quite easy.
[/quote]
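The replacement described in the quoted post could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function name `delimiters_to_tabs` is made up for illustration, and the sample line is invented, since the exact layout of Wconcord's delimited output is not shown here.

```python
import re

def delimiters_to_tabs(line: str) -> str:
    """Turn "|" into a tab and put a tab in front of every "["."""
    # Step 1: every "|" becomes a tab character.
    line = re.sub(r"\|", "\t", line)
    # Step 2: every "[" becomes a tab followed by "[".
    return re.sub(r"\[", "\t[", line)

# Invented sample line; the real Wconcord layout may differ.
sample = "left context|KEYWORD right context[source.txt]"
converted = delimiters_to_tabs(sample)

# Each tab-separated field becomes one spreadsheet column.
columns = converted.split("\t")
```

Saving the converted lines to a text file gives something a spreadsheet program can open as tab-delimited data.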
So my understanding is that you want some delimiter characters in the result file so that you
can work on it with a GREP program and eventually export the result to Excel. I asked him to make it
possible to center the search term in the line, which he said could be done easily. If this
happens, I think it would meet your need. That is, if the search term is centered,
there is usually a tab character before and after the search term, so you don't need the
| -> TAB replacement step. You could still use a GREP program to replace the sequence
'TAB SEARCH_TERM TAB' with whatever you want and export
the data to whatever program you like. As far as I can tell, once the result is in
a fixed format (e.g. TAB SEARCH_TERM TAB), a lot of things become possible.
(As for the [ ] characters, those are even easier to replace with any 'search and replace'
mechanism.)
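To illustrate what a fixed 'TAB SEARCH_TERM TAB' format makes possible, here is a minimal Python sketch that splits each line into left context, search term, and right context, then writes tab-separated rows. It assumes the centered output really does put exactly one tab on each side of the search term, which has not been confirmed for AntConc; the function name `split_kwic` and the sample lines are hypothetical.

```python
import csv
import io

def split_kwic(line, term):
    """Split a line on 'TAB term TAB'; return None if the pattern is absent."""
    marker = "\t" + term + "\t"
    left, found, right = line.rstrip("\n").partition(marker)
    if not found:
        return None
    return (left, term, right)

# Hypothetical concordance lines; the second one has no centered hit.
lines = ["我 喜欢\t语料库\t语言学\n", "this line has no hit\n"]
rows = [r for r in (split_kwic(l, "语料库") for l in lines) if r is not None]

# Write the three columns as tab-separated text (an in-memory buffer here;
# a real file opened with encoding="utf-8" would work the same way).
buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
tsv_text = buffer.getvalue()
```

Lines that do not contain the pattern are skipped rather than guessed at, so a malformed line never ends up in the wrong column.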
A little bit of history: the multilingual/Unicode capability was added in v. 3.0. Version 3.1 is
now vastly better than 3.0, but the encoding names are still a bit confusing.