How do I use the Northeastern University word segmentation tool?

xusun575

Senior member
#1
The unzipped package contains only one *.exe file, neucsp.exe, which is a DOS version. How can it be used under Windows? Thanks.
 

xujiajin

Administrator
Staff member
#2
It can be used as a plugin for ACWT.
http://www.corpus4u.com/forum_view.asp?forum_id=38&view_id=798&page=5
 

xusun575

Senior member
#3
Where and how can it be used as a plugin in ACWT? I cannot understand the following directions:


6) NEUCSP (the Chinese word segmenter from the Northeastern University Natural Language Processing Lab) can be downloaded from

http://www.nlplab.cn/cipsdk.html

Install the program to directory
where neucsp.exe and all other system files should be stored.

This program provides Parts of Speech (POS) tagged output for the currently
open file. (In a Windows-DOS console environment, which is not the case here,
it can also handle multiple files.)
 

动态语法

Administrator
Staff member
#4
Re: How do I use the Northeastern University word segmentation tool?

Quoting xusun575 (2005-8-20 20:30:21):
Where and how can it be used as a plugin in ACWT? I cannot understand the following directions:


6) NEUCSP (the Chinese word segmenter from the Northeastern University Natural Language Processing Lab) can be downloaded from

http://www.nlplab.cn/cipsdk.html

Install the program to directory
where neucsp.exe and all other system files should be stored.

This program provides Parts of Speech (POS) tagged output for the currently
open file. (In a Windows-DOS console environment, which is not the case here,
it can also handle multiple files.)
Did you get the latest Readme.pdf file from the ACWT thread?
You need to read the instructions there.

The neucsp.zip file contains 15 files, and they need to be in
c : \ n e u c s p
The reason I have to leave space between the letters above is that
apparently this web site has trouble with the sequence c : \ n e u c s p
if I put them together. In my post I had to use a picture for the directory
name:
 

xusun575

Senior member
#5
Re: How do I use the Northeastern University word segmentation tool?

Quoting 动态语法 (2005-8-21 8:09:32):

Did you get the latest Readme.pdf file from the ACWT thread?
You need to read the instructions there.

The neucsp.zip file contains 15 files, and they need to be in
c : \ n e u c s p
The reason I have to leave space between the letters above is that
apparently this web site has trouble with the sequence c : \ n e u c s p
if I put them together. In my post I had to use a picture for the directory
name:
Thank you, but is it available now?
 

动态语法

Administrator
Staff member
#6
Re: How do I use the Northeastern University word segmentation tool?

Ok. Looks like you've got the right files. Put them in the directory
as specified earlier.
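
If you want to double-check the layout before moving on, here is a minimal Python sketch. It is not part of ACWT or NEUCSP; the function name check_install is made up for illustration, and the only facts it relies on are the ones stated in this thread: the directory should hold neucsp.exe and 15 files in total. It does not try to run neucsp.exe.

```python
from pathlib import Path


def check_install(target_dir, expected_count=15, exe_name="neucsp.exe"):
    """Return a list of problems found in target_dir (empty list = looks fine).

    expected_count and exe_name come from the forum post above;
    the function itself is a hypothetical helper, not an ACWT feature.
    """
    target = Path(target_dir)
    if not target.is_dir():
        return [f"{target} does not exist or is not a directory"]

    # Only count regular files sitting directly in the directory.
    files = [p for p in target.iterdir() if p.is_file()]
    names = {p.name.lower() for p in files}

    problems = []
    if exe_name.lower() not in names:
        problems.append(f"{exe_name} is missing from {target}")
    if len(files) != expected_count:
        problems.append(f"found {len(files)} files, expected {expected_count}")
    return problems


if __name__ == "__main__":
    # The directory named earlier in this thread, written without spaces.
    issues = check_install(r"C:\neucsp")
    if issues:
        for issue in issues:
            print("Problem:", issue)
    else:
        print("The NEUCSP directory looks complete.")
```

If the script reports a problem, unzip neucsp.zip again into that directory before trying NEUCSP from ACWT.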

If you don't like using NEUCSP under DOS, try using it from within ACWT.
ACWT can call up NEUCSP and make it more user-friendly. That's why I
developed ACWT.

Anyway, after putting NEUCSP in the right directory, download the latest
NoteTab clips for A Corpus Worker's Toolkit.
All the needed files (Readme.pdf, clips, Perl scripts) can be found in
the discussion thread:

http://www.corpus4u.com/forum_view.asp?forum_id=38&view_id=798&page=5

under posting #47.
 

动态语法

Administrator
Staff member
#7
Re: How do I use the Northeastern University word segmentation tool?

If you have done the above, i.e., you got NEUCSP, NoteTab Light, and ACWT
all installed properly on your Windows PC,

run NoteTab Light,

select !TK_Start -> 01_TxtUtl, or go directly to 01_TxtUtl from the left-hand side
of your NoteTab Light screen.

Find NEUCSP there, open up a Chinese text, apply
(click on) NEUCSP and see what happens.

You really need to read the ACWT thread on this web site. There are
tons of pictures to show you how.
 