[Help] How to ignore all the tags in CLEC?

hancunxin

Moderator
When you view CLEC texts, you will find two kinds of tags: error tags (error annotation?), which are enclosed in [*], and headers and other markup, which are enclosed in <*>. How can I ignore both kinds of tags when viewing the texts?
Can I do it as shown below, or is there another way to solve my problem?
[Screenshot: 2005090309335529.jpg]
 
Re: [Help] How to ignore all the tags in CLEC?

Quoting xiaoz (2005-9-3 9:56:17):
Try
<*>, [*]
(with a comma to separate them)



Your advice didn't work.
 
OK, that's what WordSmith says; I haven't tried it myself. What I normally do is suppress one tag by using "Ignore tag", as in your screenshot. This method can also be used to suppress LOB-style POS tags (by typing in _* instead of <*>). To suppress a second tag (this one must have an opening and a closing tag), use "Only part of the file". To suppress tags in both <> and []:

1) activate Ignore tag and enter <*>
2) click on Part of the file and type [ as the open tag and ] as the end tag.
Bingo.
 
Re: [Help] How to ignore all the tags in CLEC?

Back up the text files and delete all the tags using Word or PowerGREP, and you will get a clean text copy.
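
For anyone who prefers a script to Word or PowerGREP, the same idea (back up first, then delete everything in <> and []) could be sketched in Python roughly as follows. This is only an illustration: the function names, the `.bak` suffix, and the default `utf-8` encoding are assumptions, and your CLEC files may use a different encoding.

```python
import re
import shutil
from pathlib import Path

# One tag per match; a tag is assumed never to span a line break.
ANGLE_TAG = re.compile(r"<[^<>\n]*>")
SQUARE_TAG = re.compile(r"\[[^\[\]\n]*\]")


def strip_tags(text: str) -> str:
    """Remove <...> and [...] tags, then tidy the spaces left behind."""
    text = ANGLE_TAG.sub("", text)
    text = SQUARE_TAG.sub("", text)
    # Collapse runs of spaces or tabs created by the removals.
    return re.sub(r"[ \t]{2,}", " ", text)


def clean_file(path: str, encoding: str = "utf-8") -> None:
    """Copy `path` to `path.bak` (hypothetical naming), then overwrite it with tag-free text."""
    src = Path(path)
    shutil.copy2(src, src.with_name(src.name + ".bak"))
    src.write_text(strip_tags(src.read_text(encoding=encoding)), encoding=encoding)
```

Calling `clean_file("some_text.txt")` on a copy of one file first is a sensible way to check that nothing other than tags is being removed.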
 
Re: [Help] How to ignore all the tags in CLEC?

Quoting wzli (2005-9-4 20:58:56):
Back up the text files and delete all the tags using Word or PowerGREP, and you will get a clean text copy.

Thank you very much! But how do I delete all the tags using Word? Could you please be more specific?
 
Re: [Help] How to ignore all the tags in CLEC?

Quoting wzli (2005-9-4 20:58:56):
Back up the text files and delete all the tags using Word or PowerGREP, and you will get a clean text copy.

Do you really mean that no corpus tool available now can ignore the tags in CLEC? If so, do you think it is a problem with the tagging system used in CLEC?
 
WordSmith 3 and MonoConc can of course be used to search tags in CLEC.
 
Re: [Help] How to ignore all the tags in CLEC?

Quoting xiaoz (2005-9-4 22:31:29):
WordSmith 3 and MonoConc can of course be used to search tags in CLEC.


xiaoz, I guess xusuan575 is concerned with how to ignore the tags rather than how to search for them. That is only my humble view.
 
In reply to No. 10 -
Right. Have you succeeded in suppressing <*> and [*] in CLEC using the combination of "Ignore tag" and "Part of file" in WordSmith?
 
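Re: [Help] How to ignore all the tags in CLEC?

Edit -> Replace -> (check 'Use wildcards') -> in 'Find what', type:
\<*\>
Then click 'Replace All', and then type:
\[*\]
Click 'Replace All'.

Whether to use the codes embedded in CLEC depends on your research needs. But a clean text without any inserted codes is always useful: researchers can add their own annotation or run other analyses.

In addition, if you need to split the texts so that each one becomes a separate file, you can use WordSmith's text splitter; just mark the beginning and end of each file. The CLEC files do not explicitly mark the end of a text, but the beginning of the next text is the end of the previous one, so you only need to find the right position and insert the appropriate symbol. CLEC does not claim to be encoded in XML; it uses the COCOA encoding system, so the lack of closing tags is not really a defect. If needed, converting to HTML or XML is also easy.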
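
The splitting idea above (the start of the next text doubles as the end of the previous one) can also be sketched outside WordSmith. The Python snippet below is only an illustration under stated assumptions: `split_texts` is a made-up helper, and the marker `"<ID"` used in the example is hypothetical, so check how the headers in your own copy of CLEC actually begin.

```python
import re
from typing import List


def split_texts(corpus: str, header_start: str) -> List[str]:
    """Split `corpus` into texts, each beginning at a line that starts with `header_start`.

    Anything before the first header line is dropped.
    """
    pattern = re.compile("^" + re.escape(header_start), flags=re.MULTILINE)
    starts = [m.start() for m in pattern.finditer(corpus)]
    if not starts:
        # No header found: treat the whole input as one text (or nothing if blank).
        return [corpus.strip()] if corpus.strip() else []
    # Each text runs from its own header to the next header (or to the end).
    bounds = starts + [len(corpus)]
    return [corpus[a:b].strip() for a, b in zip(bounds, bounds[1:])]


# Hypothetical two-text sample; real CLEC headers may look different.
sample = "<ID 1>\nfirst text\n<ID 2>\nsecond text\n"
pieces = split_texts(sample, "<ID")
```

Each element of `pieces` can then be written to its own file, which avoids inserting end-of-text symbols by hand.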
 
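Re: [Help] How to ignore all the tags in CLEC?

Quoting wzli (2005-9-5 5:26:39):
Edit -> Replace -> (check 'Use wildcards') -> in 'Find what', type:
\<*\>
Then click 'Replace All', and then type:
\[*\]
Click 'Replace All'.

Whether to use the codes embedded in CLEC depends on your research needs. But a clean text without any inserted codes is always useful: researchers can add their own annotation or run other analyses.

In addition, if you need to split the texts so that each one becomes a separate file, you can use WordSmith's text splitter; just mark the beginning and end of each file. The CLEC files do not explicitly mark the end of a text, but the beginning of the next text is the end of the previous one, so you only need to find the right position and insert the appropriate symbol. CLEC does not claim to be encoded in XML; it uses the COCOA encoding system, so the lack of closing tags is not really a defect. If needed, converting to HTML or XML is also easy.


Thank you very much, Dr. Li. It worked.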
 
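Re: [Help] How to ignore all the tags in CLEC?

Quoting wzli (2005-9-5 5:26:39):
Edit -> Replace -> (check 'Use wildcards') -> in 'Find what', type:
\<*\>
Then click 'Replace All', and then type:
\[*\]
Click 'Replace All'.

Whether to use the codes embedded in CLEC depends on your research needs. But a clean text without any inserted codes is always useful: researchers can add their own annotation or run other analyses.

In addition, if you need to split the texts so that each one becomes a separate file, you can use WordSmith's text splitter; just mark the beginning and end of each file. The CLEC files do not explicitly mark the end of a text, but the beginning of the next text is the end of the previous one, so you only need to find the right position and insert the appropriate symbol. CLEC does not claim to be encoded in XML; it uses the COCOA encoding system, so the lack of closing tags is not really a defect. If needed, converting to HTML or XML is also easy.


Dr. Li, I have another question. Could you please tell me how to remove some tags while keeping the ones I want? For example, I want to keep all the [vp6] tags and remove the others.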
 
Re: [Help] How to ignore all the tags in CLEC?

In that case, you'll have to use WordSmith and make a tag file for yourself. Please consult the help file for detailed instructions. There are other ways as well:
1. Work with Word: replace the tags you want to keep with a new, unique markup, delete the unwanted ones, and retain what you want.
2. Work with PowerGREP: pick out the wanted tags, replace them with your own tags, and delete all the rest.
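
The same "keep the wanted tags, delete the rest" idea can be done in one pass with a script instead of the protect-then-delete steps above. The Python sketch below is a variant, not what Word or PowerGREP do internally: `keep_only` is a made-up name, and it assumes a wanted tag is written either as `[vp6]` or as `[vp6,...]` with extra detail after a comma, which you should verify against your own data.

```python
import re

# Matches any single <...> or [...] tag that stays on one line.
TAG = re.compile(r"<[^<>\n]*>|\[[^\[\]\n]*\]")


def keep_only(text: str, wanted: str = "vp6") -> str:
    """Delete every tag except square-bracket tags whose code is `wanted`."""

    def decide(match: "re.Match[str]") -> str:
        tag = match.group(0)
        inner = tag[1:-1]
        # Assumed tag shapes: "[vp6]" or "[vp6,<extra detail>]".
        if tag.startswith("[") and (inner == wanted or inner.startswith(wanted + ",")):
            return tag  # keep this tag unchanged
        return ""  # drop every other tag

    return TAG.sub(decide, text)


result = keep_only("a[vp6]b[wd3]c<x>d")
```

Matching on the whole code (rather than on any tag that merely begins with the letters `vp6`) keeps a tag such as `[vp60]` from being retained by accident.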
 
Re: [Help] How to ignore all the tags in CLEC?

Quoting wzli (2005-9-5 19:43:09):
In that case, you'll have to use WordSmith and make a tag file for yourself. Please consult the help file for detailed instructions. There are other ways as well:
1. Work with Word: replace the tags you want to keep with a new, unique markup, delete the unwanted ones, and retain what you want.
2. Work with PowerGREP: pick out the wanted tags, replace them with your own tags, and delete all the rest.


Got it! Thank you very much!
 
Re: [Help] How to ignore all the tags in CLEC?

Quoting xiaoz (2005-9-4 22:55:53):
In reply to No. 10 -
Right. Have you succeeded in suppressing <*> and [*] in CLEC using the combination of "Ignore tag" and "Part of file" in WordSmith?

I failed to find "Part of file" in WordSmith 3.0; I only found "Only part of file". When I click that button, I don't know where to type those characters.
Have a look:
[Screenshot: 2005090609151037.jpg]

[Screenshot: 2005090609153063.jpg]

[This post was edited by the author on 2005-09-06 at 09:16:48]
 
In your second screenshot,

below "Sections to cut",

for "starting with", replace "start of file" with [

for "ending with", insert ]

check the box for Activated in the first instance

Leave the right column (or) and "Sections to use" blank.
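
To make clear what "ignoring" tags at viewing time means (as opposed to deleting them from the files), here is a small Python sketch that builds concordance lines from a tag-free view of the text while leaving the original untouched. It is not how WordSmith works internally; `concordance` and its `width` parameter are made up for illustration.

```python
import re
from typing import List, Tuple

# Matches any single <...> or [...] tag that stays on one line.
TAG = re.compile(r"<[^<>\n]*>|\[[^\[\]\n]*\]")


def concordance(text: str, keyword: str, width: int = 20) -> List[Tuple[str, str, str]]:
    """Return (left context, hit, right context) triples, ignoring all tags."""
    # Build a temporary view: tags become spaces, then whitespace is normalised.
    view = re.sub(r"\s+", " ", TAG.sub(" ", text)).strip()
    hits = []
    for m in re.finditer(r"\b" + re.escape(keyword) + r"\b", view, flags=re.IGNORECASE):
        left = view[max(0, m.start() - width):m.start()]
        right = view[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


# Hypothetical tagged sentence; the original string is not modified.
lines = concordance("<ID 1> He go[vp6,s-] to school", "go", width=5)
```

Because each tag is replaced by a space, a tag sitting in the middle of a word would split that word in the view; replace with an empty string instead if your data has such cases.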
 