Requesting help from Dr. Xu

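I don't have a computer science background and know very little about corpus processing, so I would like to ask for help with the following problem. Ideally, could someone write a program specifically for processing texts annotated like this one? Here is a sample of the text to be processed: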

<WCOMP><ARG><GRADE1><YEAR03><TIMED><SCORE><ND><LENGTH347W>
Education as a Lifelong Process
Why do we send young kids to school, <IC> making some of them tortured and complaining <E> sadly <RDC>? <T> Maybe it is good for them to hunt <E> good jobs <RDC> in the future or get prepared for all aspects of life <IC>? <T> In my opinion, the function of education is <IC> beyond a means of earning a living. <T> Some people think <IC> education is merely the defination <E> of going to school or college <DC>. <T> Is that true <IC>? <T> Of course not. <NT> We have lots of ways to get educated, such as self teaching, using transporting tools and communicating with others <IC>. <T> Even living it self <E> is <IC> a way of learning, during which <NDC> you learn how to arrange your time <E>, handle incidents and the art of living <RDC>. <T> The reason some people consider education simply as going to school <NDC> perhaps <E> is <IC> they, <E> regard it as a means of job hunting <NDC>. <T> On the one hand, we have to admit <IC> one function of education surely is <E> for hunting <E> good jobs <RDC>. <T> The more educated <E>, <IC> the more opportunities lie in front of you <IC>. <T> On the other hand, human <E> do not live <IC> to eat <RDC> but eat to live <RDC>. <T> We can not be taken the nose <E> only by physical demands <IC>. <T> Still we have to enjoy ourselves, and fulfill our goals <IC>. <T> When you learn something without a much too dear aim <NDC>, you will get it light heartedly and keep it in mind groundly <IC>. <T> You may learn it in a more charming atmosphere <IC>, <T> that is to say, you gain something <IC> while enjoying yourself <RDC>. <T> That <E> is more efficient and long lasting <IC>. <T> The most essential function, I guess, lies <IC> in helping <RDC> you know mow <E> about yourself, developing <E> your potential and adjusting <E> in time. <T> There are <IC> many unknown districts in a person to be ploughed <RDC>. <T> Through education you many touch <IC> them by chance and be aware <E> for the first time <NDC> what you are good at. 
<T> Then you will do it more aimly and guidedly <IC>. <T> Be familiar with yourself <IC>, which is the most valuable <NDC>. <T> As the above, education goes <IC> along with your lifetime, known or unknown. <T> Don't resist it <IC>, <T> or else you may lose precious chances <IC>. <T>

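The codes inside <> are manual annotation tags: T = T-unit, NT = non-T-unit, C = clause, NDC = non-reduced clause, IC = independent clause, RDC = reduced dependent clause, DC = dependent clause, E = error, EFT = error-free T-unit, W = word. Is it possible to use concordancing software to count these units, then calculate the ratios W/T, W/C, W/EFT, C/T, DC/C, EFT/T, E/T, RDC/C and RDC/DC, and finally output the results as an Excel spreadsheet, together with the information from the annotation header at the top of each text?
Since no software currently handles syntax automatically, the texts have to be tagged by hand first and then searched. I think this would make a good research topic. Unfortunately, although I am very interested in it, my computing knowledge is limited, so I can only ask Dr. Xu for help. I would be grateful for any guidance. Sorry for the trouble.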
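
The counting and ratios described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a tested tool: it assumes each `<T>` or `<NT>` tag closes the unit that precedes it (as in the sample), that C is the sum of IC, DC, NDC and RDC, that the DC figure used in DC/C and RDC/DC is the sum of DC, NDC and RDC, and that an EFT is a T-unit containing no `<E>` tag. The function names `analyse` and `ratio` are made up for this sketch, and the definitions should be adjusted to the actual coding scheme.

```python
import re
from collections import Counter

ANY_TAG = re.compile(r"<[^<>]*>")                   # any <...> annotation
UNIT_END = re.compile(r"<(T|NT)>")                  # tag that closes a unit
WORD = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")  # simple English word pattern


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def analyse(body):
    """Count the manual tags in one essay body and derive the requested ratios."""
    counts = Counter(re.findall(r"<([A-Z]+)>", body))
    r = {tag: counts[tag] for tag in ("T", "NT", "IC", "DC", "NDC", "RDC", "E")}

    # W: words left after every tag has been stripped out.
    r["W"] = len(WORD.findall(ANY_TAG.sub(" ", body)))

    # ASSUMPTION: all clauses = IC + DC + NDC + RDC,
    # and all dependent clauses = DC + NDC + RDC.
    r["C"] = r["IC"] + r["DC"] + r["NDC"] + r["RDC"]
    r["DC_all"] = r["DC"] + r["NDC"] + r["RDC"]

    # ASSUMPTION: an error-free T-unit is a stretch of text closed by <T>
    # that contains no <E> tag.
    eft, start = 0, 0
    for match in UNIT_END.finditer(body):
        unit = body[start:match.start()]
        if match.group(1) == "T" and "<E>" not in unit:
            eft += 1
        start = match.end()
    r["EFT"] = eft

    r["W/T"] = ratio(r["W"], r["T"])
    r["W/C"] = ratio(r["W"], r["C"])
    r["W/EFT"] = ratio(r["W"], r["EFT"])
    r["C/T"] = ratio(r["C"], r["T"])
    r["DC/C"] = ratio(r["DC_all"], r["C"])
    r["EFT/T"] = ratio(r["EFT"], r["T"])
    r["E/T"] = ratio(r["E"], r["T"])
    r["RDC/C"] = ratio(r["RDC"], r["C"])
    r["RDC/DC"] = ratio(r["RDC"], r["DC_all"])
    return r


if __name__ == "__main__":
    sample = ("I like tea <IC>. <T> He go <E> home <IC> "
              "because he tired <E> <NDC>. <T> Of course. <NT>")
    for name, value in analyse(sample).items():
        print(name, value)
```

The word count here covers whatever string is passed in, so whether the title line is included depends on what the caller supplies; the header's own length field (`<LENGTH347W>` in the sample) can serve as a cross-check.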
 
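Re: Requesting help from Dr. Xu

The codes inside <> are manual annotation tags: T = T-unit, NT = non-T-unit, C = clause, NDC = non-reduced clause, IC = independent clause, RDC = reduced dependent clause, DC = dependent clause, E = error, EFT = error-free T-unit, W = word. Is it possible to use concordancing software to count these units, then calculate the ratios W/T, W/C, W/EFT, C/T, DC/C, EFT/T, E/T, RDC/C and RDC/DC, and finally output the results as an Excel spreadsheet, together with the information from the annotation header at the top of each text?

Use AntConc
http://www.antlab.sci.waseda.ac.jp/software/antconc3.2.1w.exe
and search for each tag separately.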
 
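Re: Requesting help from Dr. Xu

Thank you, Dr. Xu. However, I have many texts to process at once, so searching for each tag separately would be tedious. Could someone write a program that handles texts like this in one pass and presents all the results together?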
 
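Re: Requesting help from Dr. Xu

I wrote a program in Perl. To use it, first download Perl (it is free; you can find it by searching Baidu). Then extract the program into the folder where your files are stored and double-click it. The data you want will be saved as a txt file, which you can open in Excel.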
 

Attachment

  • 处理话语单位.rar
    532 bytes · Views: 30
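
The Perl script itself is not shown in the thread. As a hedged illustration of the batch step it describes (read every text in a folder and save the results to a file Excel can open), here is a minimal Python sketch. It assumes each text is a UTF-8 `.txt` file whose first line holds the header tags (such as `<WCOMP><ARG><GRADE1>`); the function names `summarise_file` and `summarise_folder` and the CSV column layout are made up for illustration and are not taken from the attachment.

```python
import csv
import re
from collections import Counter
from pathlib import Path

TAG = re.compile(r"<([A-Za-z0-9]+)>")
BODY_TAGS = ["T", "NT", "IC", "DC", "NDC", "RDC", "E"]


def summarise_file(path):
    """Return one row: file name, header tags, and a count for each body tag."""
    # ASSUMPTION: files are UTF-8 and the first line is the annotation header.
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    header = TAG.findall(lines[0]) if lines else []
    body = "\n".join(lines[1:])
    counts = Counter(TAG.findall(body))
    row = {"file": Path(path).name, "header": " ".join(header)}
    for tag in BODY_TAGS:
        row[tag] = counts[tag]
    return row


def summarise_folder(folder, out_csv):
    """Process every .txt file in the folder and write one CSV row per file."""
    rows = [summarise_file(p) for p in sorted(Path(folder).glob("*.txt"))]
    # utf-8-sig adds a byte order mark so Excel detects the encoding.
    with open(out_csv, "w", newline="", encoding="utf-8-sig") as fh:
        writer = csv.DictWriter(fh, fieldnames=["file", "header"] + BODY_TAGS)
        writer.writeheader()
        writer.writerows(rows)
    return rows


if __name__ == "__main__":
    # Summarise the .txt files in the current folder into results.csv.
    summarise_folder(".", "results.csv")
```

Writing CSV rather than a native Excel workbook keeps the sketch within the standard library; Excel opens the resulting file directly.
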
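Re: Requesting help from Dr. Xu

However, I have many texts to process at once, so searching for each tag separately would be tedious. Could someone write a program that handles texts like this in one pass and presents all the results together?

Consider using the file-based concordance feature in WordSmith Tools: put the items to be searched into a text file, one item per line, then load that file to run the search.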
 
Re: Requesting help from Dr. Xu

I will write a Delphi program for you; it will be easy to use. :)
 
Re: Requesting help from Dr. Xu

How can a Perl script be turned into a standalone exe that runs without Perl installed?
 
Re: Requesting help from Dr. Xu

In the Komodo IDE integrated development environment, the Perl interpreter can be packaged into the executable.
 
Re: Requesting help from Dr. Xu

It can quickly compute the statistics for multiple texts :D
 

Attachment

  • MyCorpus.rar
    561.5 KB · Views: 104
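Re: Requesting help from Dr. Xu

Thank you, William. Over the past few days I also tried writing a program; it is in the attachment. Corrections are welcome.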
 

Attachment

  • forWWT.rar
    58.3 KB · Views: 22