Why do my search results contain letters and numbers such as L07 0930?
They are not nuisances. If you don't want them there, simply use Find & Replace to strip them out before running concordances in WordSmith. Here is how to do it:
1. You need EditPlus for this. You can get an evaluation version at http://www.editplus.com.
2. Open the files (e.g. all 15 LOB files) in EditPlus (back up your files first).
3. In the menu, click Search, then Replace, type ^[a-z0-9]+[ ]+[0-9]+[ ] in "Find what", and leave "Replace with" empty.
4. Check the options "Regular expression" and "All open files", then click "Replace All" to get all the files ready for your "nuisance-free" concordances.
The same applies to the Brown corpus.
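For anyone who prefers a script to an editor, here is a minimal Python sketch of the same replacement. It is not part of the EditPlus procedure above. The function names `strip_line_codes` and `clean_folder`, the `.txt` extension, and the `latin-1` encoding are assumptions made for illustration, and `re.IGNORECASE` is added on the assumption that codes such as "L07" start with a capital letter.

```python
import re
from pathlib import Path

# Same pattern as in step 3. re.MULTILINE makes ^ match at the start of every line;
# re.IGNORECASE is an assumption so that codes such as "L07" also match.
LINE_CODE = re.compile(r"^[a-z0-9]+[ ]+[0-9]+[ ]", re.MULTILINE | re.IGNORECASE)

def strip_line_codes(text: str) -> str:
    """Remove a leading 'L07 0930 ' style reference from every line."""
    return LINE_CODE.sub("", text)

def clean_folder(src: str, dst: str, encoding: str = "latin-1") -> int:
    """Write cleaned copies of every .txt file in src to dst; originals stay untouched."""
    out_dir = Path(dst)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src).glob("*.txt")):
        cleaned = strip_line_codes(path.read_text(encoding=encoding))
        (out_dir / path.name).write_text(cleaned, encoding=encoding)
        count += 1
    return count

if __name__ == "__main__":
    # Made-up sample lines in the style quoted in the question.
    sample = "L07 0930 the first line\nL07 0940 the second line\n"
    print(strip_line_codes(sample))
```

Writing to a separate output folder plays the role of the backup in step 2. Note that the pattern will also remove a line-initial word followed by a number (for example "In 1990 "), so it is worth checking a few cleaned files by eye.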
Good luck!
^[a-z0-9]+[ ]+[0-9]+[ ] seems to have a problem; the results are not right.
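A question: why doesn't EditPlus support [*] and <*>?
Doesn't * stand for any character? And * is part of regular expressions too, isn't it?
What I mean is that I want to remove every [*] and <*>. How should the expression be written?
I tried [*] and <*> and got no results.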
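A short Python sketch may help show why those two patterns find nothing useful. In regular expressions, * is not a wildcard as it is in file names: it means "zero or more of the preceding item", and square brackets define a character class. The sample sentence and the name `remove_bracketed` are made up for illustration, and EditPlus's own regex dialect may differ in details such as escaping, so its help file remains the reference.

```python
import re

# Made-up sample text; the codes are illustrative, not taken from any corpus.
SAMPLE = "He said [laughs] that <pause> it cost 5*3 dollars."

# "[*]" is a character class that holds one literal asterisk.
print(re.findall(r"[*]", SAMPLE))   # ['*']

# "<*>" is zero or more "<" followed by one ">".
print(re.findall(r"<*>", SAMPLE))   # ['>']

# A bracketed code: the opener, any run of characters that are not the closer, the closer.
BRACKETED = re.compile(r"\[[^\]]*\]|<[^>]*>")

def remove_bracketed(text: str) -> str:
    """Delete every [...] and <...> span, wherever it occurs."""
    return BRACKETED.sub("", text)

print(BRACKETED.findall(SAMPLE))    # ['[laughs]', '<pause>']
print(remove_bracketed(SAMPLE))
```

The key idea is the negated class: `[^>]*` reads "any characters except >", which is what the * in <*> was meant to express.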
Please read the EditPlus help file.
I think that if your corpus contains capital letters, the first part should be written like this, right?
^[a-zA-Z0-9]...........
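The effect of the character class can be checked with a few lines of Python. This is only a sketch: the sample line is made up, and it assumes the dots above stand for the rest of the pattern from the original reply. Python matches case-sensitively by default; whether EditPlus does depends on its search options.

```python
import re

# Made-up line in the style quoted in the thread.
line = "L07 0930 the first line"

# Lowercase-only class, as in the original reply.
lower_only = re.compile(r"^[a-z0-9]+[ ]+[0-9]+[ ]")
# Class extended with A-Z, as suggested above (rest of the pattern is assumed).
both_cases = re.compile(r"^[a-zA-Z0-9]+[ ]+[0-9]+[ ]")

print(lower_only.sub("", line))   # unchanged, because "L" is not in [a-z0-9]
print(both_cases.sub("", line))   # the first line
```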
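laohong, you're here. I'll go and read it right away. But could you show me how to write the expression to remove everything like [] and <>? I spent ages on it yesterday without success. Thanks.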
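I also found that Brown is easier to handle, because its codes all appear at the start of a paragraph. But what about a corpus like CLEC, where codes occur in the middle of a paragraph?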
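One way to handle both cases is to drop the ^ anchor for the bracketed codes, so they match anywhere in a line, and then tidy the leftover spaces. The Python sketch below assumes the mid-paragraph codes are wrapped in [] or <>; the sample line and the name `strip_all_codes` are made up for illustration.

```python
import re

# Line-initial reference such as "A01 0010 " (anchored with ^).
LINE_CODE = re.compile(r"^[a-zA-Z0-9]+[ ]+[0-9]+[ ]", re.MULTILINE)
# Bracketed code anywhere in a line (no anchor); never spans a line break.
INLINE_CODE = re.compile(r"\[[^\]\n]*\]|<[^>\n]*>")
# Runs of spaces left behind after a code is removed.
EXTRA_SPACES = re.compile(r"[ ]{2,}")

def strip_all_codes(text: str) -> str:
    """Remove line-initial references and inline bracketed codes, then collapse spaces."""
    text = LINE_CODE.sub("", text)
    text = INLINE_CODE.sub("", text)
    return EXTRA_SPACES.sub(" ", text)

# Made-up sample line with one code at the start and two in the middle.
sample = "A01 0010 He go [vp3,1-] to school <p> every day .\n"
print(strip_all_codes(sample))
```

Any genuine text that happens to sit inside square or angle brackets would be removed as well, so this only suits corpora where brackets are reserved for codes.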
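Actually, many concordancers have a built-in filter, and the filtered results contain no codes. Can those results be saved again as a txt file?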