Xaira FAQs

xiaoz · 2005-08-06

1. Q: The Xaira client fails to open a corpus with the message "Incorrect parameter". What should I do?
A: If the message "Incorrect parameter" appears when you try to open a corpus, the most probable explanation is that Xaira cannot access its working directory. This is a folder in which Xaira stores temporary files while the program is running, so it must be in a place you can write to. By default this directory is called C:\my corpora, and Xaira will try to create it if needed; if it fails to do so, this message may appear. To work around the problem, proceed as follows:

. Create a directory for Xaira to use (for example, you might call it c:\My Documents\My Corpora).

. Start Xaira from the program menu rather than by clicking an xcorpus file. Only three menus will be available and none of the query tool buttons will work.

. Select Preferences from the View menu. In the box at the bottom labelled System corpus root, type the path to the directory you created (e.g. c:\My Documents\My Corpora).

. Press OK
. Exit Xaira.
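
If you want to check beforehand that a candidate folder can serve as the working directory, something along the following lines may help. This is only a sketch in Python, which is not part of Xaira; the function name `ensure_writable_dir` is made up for illustration, and the test it performs (creating the folder and writing a scratch file in it) is an assumption about what "a place you can write to" requires.

```python
import os
import tempfile


def ensure_writable_dir(path):
    """Create `path` if needed and report whether a file can be written there."""
    try:
        # Create the directory (and any missing parents); no error if it exists.
        os.makedirs(path, exist_ok=True)
        # Creating a scratch file is the most direct writability test.
        # The file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        # Covers permission problems, invalid paths, read-only media, etc.
        return False
    return True
```

Calling `ensure_writable_dir(r"c:\My Documents\My Corpora")` returns `True` if the folder exists (or could be created) and accepts new files, and `False` otherwise.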

2. Q: Can Xaira run on older Windows systems such as Windows 98 and ME?
A: Xaira comes in two parts: the Client program and the Indexer tool. The Client can work on older systems with corpora that have already been indexed, but the Indexer tool cannot; it requires Windows NT, 2000 or XP.

When you try to start the Xaira client under Windows 98, a message appears that says "The Xaira.exe file is linked to missing export SHELL32.DLL:SHGetFolderPath". This is due to an incompatibility in the installed libraries, which will be fixed in the next release. Note, however, that the Xaira Tools utility depends on system routines that are not available on Windows 98. As even Microsoft no longer supports Windows 98, we would advise you to change your operating system.

On older systems, you will also need to download the Arial Unicode MS font from the Internet to display concordances properly.

3. Q: When I make a query in the Client, the system reports the following errors: "server rejected query syntax" and "parser error: an exception occurred! Type: Runtime exception! Message: could not open DTD file: C:\DOCUMEN~1\...". What should I do?
A: When Xaira starts up, it writes a copy of the DTD for CQL queries into a temporary directory. It uses this DTD to validate each query as it is entered. If the DTD cannot be found, a parser error results.

To solve the problem:

1) Check your temporary directory settings.
Look in your environment variables: right-click My Computer to open the System Properties dialog, click the Advanced tab and then press the button marked Environment Variables. Make sure that TEMP, TMP or USERPROFILE is included.

2) Make sure your Windows logon name does not include Chinese characters.
Xaira doesn't use your user name directly, but the Client is a Win-98 compatible program. Unlike indextools, which requires some flavour of NT, it is *not* fully Unicode enabled. By using the ICU components it can display Unicode in hits, but things like file names must be ASCII. We have deliberately kept Xaira to a low spec, recognising that there are still a lot of Win-98, ME etc. users around.

On my PC the name of the temporary directory begins with \Documents and Settings\ and then uses my user name. If your user name uses non-ASCII characters, the TMP directory name on your PC may be non-ASCII, and that would certainly prevent Xaira from accessing it.
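
Both checks above can be run in one go with a short script. The following is a rough sketch in Python, not something shipped with Xaira; the function names are invented for illustration, and it assumes that the locations worth inspecting are the TEMP, TMP and USERPROFILE variables plus the directory Python itself would choose. How Xaira actually picks its temporary directory may differ.

```python
import os
import tempfile


def non_ascii_chars(path):
    """Return the characters in `path` that fall outside the ASCII range."""
    return [ch for ch in path if ord(ch) > 127]


def report_temp_paths():
    """Print each temp-related location and flag any non-ASCII characters."""
    candidates = {name: os.environ.get(name) for name in ("TEMP", "TMP", "USERPROFILE")}
    # The directory Python would use for temporary files (it consults
    # TMPDIR, TEMP and TMP); shown only as an additional hint.
    candidates["tempfile.gettempdir()"] = tempfile.gettempdir()
    for name, value in candidates.items():
        if value is None:
            print(f"{name}: not set")
        elif non_ascii_chars(value):
            print(f"{name}: {value!r} contains non-ASCII characters")
        else:
            print(f"{name}: {value!r} looks fine")


report_temp_paths()
```

Any line reporting non-ASCII characters points to a path that the Client is unlikely to be able to open.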

4. Q: The concordance results are displayed as small white squares instead of Chinese characters. What can I do?
A: This only occurs the first time you run Xaira. Go to View - Font and select Arial Unicode MS or a Chinese font. The next time you run Xaira, the program will remember your preference.

[This post was edited by xujiajin on 2005-08-06 at 02:14:55]

xujiajin · 2005-08-06

Great. It really is better to have these compiled in one place.
After all, you are the top expert here.

xiaoz · 2005-09-06

Q: I have tried using other annotation software from the Web to segment the text into words before using the Xaira Indexing Tool Kit to set up the Chinese corpus, but "Word Query" still shows only individual characters in the frequency list. Is there some other process that I need to go through to enable "Word Query" to display words in Chinese?

A: The Xaira Indexer follows the Unicode tokenisation rules, which by default treat every Chinese character as a token or "word". Without proper markup, the indexer ignores the white space that your segmentation tool has inserted between tokens. You must wrap each token in a pair of opening and closing tags, as in

<TOK>XXX</TOK> <TOK>YYY</TOK>

(or any XML element name you like). This applies if you do not POS tag your corpus; if you do, format each token as

<w POS="n">noun</w> <w POS="v">verb</w>

To POS tag your corpus, you will need a tagger. As you already have a tokenised corpus, inserting token tags is quite straightforward with a few lines of Perl. If you do not program, you can use Word or a text editor instead: use Replace All to replace every run of one or more white spaces with the sequence </TOK> <TOK>, then add an opening <TOK> before the first token and a closing </TOK> after the last one (and delete any stray </TOK> at the very beginning or <TOK> at the very end that leading or trailing white space may have produced).
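
For those who prefer a script, the token wrapping can be sketched as follows. This version is in Python rather than Perl and is an illustration only: the function names `wrap_tokens` and `wrap_file` are made up, the input is assumed to be plain UTF-8 text with no existing markup (any `<`, `>` or `&` is escaped), and the tokens are assumed to be separated by white space.

```python
from xml.sax.saxutils import escape


def wrap_tokens(line, tag="TOK"):
    """Wrap each whitespace-separated token of `line` in <tag>...</tag>."""
    # split() with no argument ignores leading/trailing white space and
    # treats any run of white space as a single separator.
    tokens = line.split()
    # escape() turns &, < and > into entities so the result stays well-formed.
    return " ".join(f"<{tag}>{escape(tok)}</{tag}>" for tok in tokens)


def wrap_file(src, dst, tag="TOK"):
    """Write a copy of text file `src` to `dst` with every token wrapped."""
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(wrap_tokens(line, tag) + "\n")
```

For example, `wrap_tokens("我们 喜欢 语料库")` gives `<TOK>我们</TOK> <TOK>喜欢</TOK> <TOK>语料库</TOK>`. The output of `wrap_file` is not yet a complete XML document; it still needs whatever enclosing elements and header your corpus uses.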

A corpus processed in this way can be indexed by defining "word break" in "special tags" as TOK. When you open the indexed corpus in the client, the word list will contain words as defined by your segmentation tool instead of single characters.
