What's wrong with my ParaConc? Chinese characters display as garbled text

Patric emailed me with the following questions:

The build of my ParaConc is 269; I am not sure if it is the latest edition. I paid close attention to the fonts: Chinese GB2312, 宋体 (SimSun). I switched the encoding to UTF-8, but it still fails.

Build 269 is the latest stable release published, though a demo version of 270 (with a hit limit of 150) is available out there.

The problem with displaying Chinese is caused by loading the corpus incorrectly. Suppose your Chinese data has been saved as, or converted into, UTF-8, which is supported in build 269.
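
If your Chinese text is still in GB2312, one way to convert it to UTF-8 before loading is a short script like the one below. This is only a minimal sketch and not part of ParaConc: the function name and the file names in the comment are made up for illustration, and it assumes the source file really is GB2312/GBK (it is read with Python's gb18030 codec, which is a superset of both).

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="gb18030"):
    """Re-encode a text file as UTF-8 and return the decoded text.

    gb18030 is a superset of GB2312 and GBK, so it reads files
    saved in any of those encodings.
    """
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
    return text


# Hypothetical usage (file names are placeholders):
# convert_to_utf8("chinese_gb.txt", "chinese_utf8.txt")
```

If the script raises a UnicodeDecodeError, the file is probably not in the encoding you assumed, which is worth knowing before you load it.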

If you load your corpus this way:
2006030522394749.jpg


you will get results like this:
2006030522402147.jpg


and if you load your corpus this way:
2006030522405743.jpg


you will get results like this:
2006030522413588.jpg
 
I don't understand why the language column on the right is set to English instead of Chinese when we are in fact loading a Chinese text file. Is that the problem?
 
No, the language in the right column should be Chinese. The important thing is that the UTF-8 files must be highlighted so that the check box for UTF-8 can be selected.
 
The UTF-8 check box only becomes available once corpus files have been added and selected. If you are sure that the files are text files, add them and select them with your mouse. Then check the box for UTF-8 and see what happens.
 
I am sure I have already followed what you said. The problem is that when I add the text in UTF-8, the window that I uploaded before pops up.
 
Re: What's wrong with my ParaConc? Chinese characters display as garbled text

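How can I make the content of ParaConc's upper text window and lower text window line up as one-to-one corresponding sentences, with the extra context filtered out?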
2006030621242818.jpg
 
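I guess this needs annotation. I also don't know exactly what ParaConc requires of the texts. Besides, there are a lot of repetitions in the search results, and I don't know how to get rid of them.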
2006030621371578.jpg



 
There are repetitions because the search word ("of" in this case) is repeated in the English text.
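
To illustrate why this happens, here is a minimal sketch of a concordance search that records one hit per occurrence of the search word, so a sentence containing "of" twice is listed twice. It is not ParaConc's actual implementation; the function name and the sample sentences are made up for illustration.

```python
import re


def concordance_hits(lines, word):
    """Return one (line_number, character_offset) pair per occurrence of word."""
    # Match the word as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in pattern.finditer(line):
            hits.append((number, match.start()))
    return hits


# Made-up sample corpus, one sentence per line.
corpus = ["The king of the city of gold.", "A cup of tea."]
hits = concordance_hits(corpus, "of")
# The first sentence contributes two hits, the second one hit.
```

Each hit gets its own row in the results, which is why the same sentence (and its aligned translation) shows up more than once.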
 
Oscar3: Try Display - Context and select line (this works as long as your corpus has one sentence per line) to see the effect.
 
Re: What's wrong with my ParaConc? Chinese characters display as garbled text

Dr. Xiao, I am using your Babel. How can I remove these tags from the display?

Quoting xiaoz (2005-8-25 0:22:07):
This version of ParaConc has no way to remove POS tags in the underscore format, but the new release can.

Quoting patricx (2005-8-25 0:08:46):
 
Re: What's wrong with my ParaConc? Chinese characters display as garbled text

It seems that I cannot find the button you mentioned in my ParaConc.
2006030622173020.jpg
 
In File - Tag settings - Normal tags, define tag start as < and tag end as >
In File - Tag settings - Special tags, check the box for Embedded in word and define the character in word as _
After the search, go to Display - Suppress and select both Normal tags and Special tags.
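
To show what these settings amount to, here is a rough sketch of the same idea in Python: normal tags are anything between < and >, and special tags are attached to a word with an underscore. This only mimics the effect of suppressing both kinds of tags and is not how ParaConc does it internally; the function name, the patterns, and the sample tags are assumptions for illustration.

```python
import re

# Normal tags: anything delimited by < and >.
NORMAL_TAG = re.compile(r"<[^<>]*>")
# Special tags embedded in a word: an underscore followed by the tag.
EMBEDDED_TAG = re.compile(r"_[^\s_<>]+")


def suppress_tags(text):
    """Remove <...> tags first, then _TAG suffixes attached to words."""
    text = NORMAL_TAG.sub("", text)
    return EMBEDDED_TAG.sub("", text)


# Made-up sample line with hypothetical POS tags.
sample = "<s n=1>The_AT0 cat_NN1 sat_VVD</s>"
clean = suppress_tags(sample)
```

Note that the underscore pattern would also strip any genuine underscores in your text, so it only makes sense for corpora tagged in the word_TAG format.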
 
Re: What's wrong with my ParaConc? Chinese characters display as garbled text

Quoting xiaoz (2006-3-6 22:00:29):
Oscar3: Try Display - Context and select line (this works as long as your corpus has one sentence per line) to see the effect.

Xiaoz: I chose Display - Context - sentence or segment and got the following display result:
2006030623020253.jpg




This is the display result when you choose lines:
2006030623043648.jpg



 
It worked. Many thanks to Dr. Xiao!!!
2006030623270559.jpg



 