Any available tools for aligning sentences?

patricx

Senior Member
any available tools for aligning sentences/paragraphs?

that's an important step to build a parallel corpus. how can i continue my job without aligning sentences?
 
By the way, I just noticed that WordSmith seems to have an aligner. I don't know whether it works well, so give it a try.
 
i checked WS4 and haven't found this function. which version of WordSmith did u use, Dr. xu?
 
WS3 and WS4 both have an aligner. In WS4 you can find it in the Utilities menu of the main window, under Viewer and Aligner.
 
Yes! aligner in WS3 works perfect for English. Thank u DR xu for this very important piece of info.
 
thanks Dr. xu, that's good news. but i have tried several times to align some sentences and failed. maybe i need to learn a lot.
 
Re: Any available tools for aligning sentences?

Quoting xusun575 (2005-8-25 17:13:45):
Yes! aligner in WS3 works perfect for English. Thank u DR xu for this very important piece of info.

could you upload several screenshot?
thank u very much!!
 
Re: Any available tools for aligning sentences?

Quoting patricx (2005-8-25 15:02:06):
any available tools for aligning sentences/paragraphs?

that's an important step to build a parallel corpus. how can i continue my job without aligning sentences?

[This post was edited by xujiajin on 2005-08-25 at 15:03:19]

Thank u Patricx, for this posting of yours!
 
Re: Any available tools for aligning sentences?

Dr.xu, Thank u very much!
this is what i have got:
2005082520460435.jpg
 
Re: Any available tools for aligning sentences?

Quoting xusun575 (2005-8-25 17:13:45):
Yes! aligner in WS3 works perfect for English. Thank u DR xu for this very important piece of info.

could you pls upload the aligned English text? i can't do it.
 
As far as I am aware, there are no publicly available alignment tools for English and Chinese. Some tools can achieve good accuracy, but they require special expertise to use (e.g. Unix) or a lot of preprocessing, both by hand and by programming. I think these are also reasons why they have not been made public.

The new version of ParaConc has a built-in aligner, but its performance with English and Chinese is not up to my expectations. English and Chinese require alignment algorithms different from those that work for many related language pairs, so the task is not easy for generic tools such as the WordSmith aligner, the ParaConc aligner, and the MultiConcord aligner.

But there are some tools available that help with alignment by hand. They make the alignment semi-automatic, e.g. the tool for in-house use at Beiwai, developed by the Institute of Computational Linguistics of PKU? (see below)

2005082609411858.jpg
 
yes, but i don't know how to use it. there is a lot of code and many files in it, and i know nothing about programming. it's a pity.
 
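The two programs below were designed by the Chinese Academy of Sciences, and they don't seem to have been made public!!! They are protected by intellectual property rights.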

64 Parallel Corpus Alignment Proofreading Tool v1.0 200217187 2003SR2286

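  The purpose of this software is to assist human proofreading of sentence alignment in Chinese-English bilingual parallel corpora, so as to improve proofreading efficiency and ensure accuracy, thereby speeding up the construction of Chinese-English parallel corpora and related research. The software displays corresponding Chinese and English sentence pairs from the parallel corpus in a sensible layout. In particular, after the system has judged the alignment of the parallel corpus, it marks sentence pairs that may be misaligned in a different color to draw the proofreader's attention. The system also provides various merge, split, and edit operations for single sentences and sentence pairs to make corpus correction easier. The main feature of the software is its automatic judgment of whether the parallel corpus is aligned. For a group of sentence pairs, the system applies an alignment-judgment algorithm; because the algorithm is sensibly defined, misaligned sentence pairs are not missed, while the proofreader's workload is reduced. The software is suitable for translation companies that use computer-aided translation tools and for research areas in Chinese information processing that rely on bilingual parallel corpora, such as machine translation; it can be used for proofreading sentence alignment during the creation of bilingual parallel corpora.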
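To give a rough idea of what "marking possibly misaligned pairs" could look like, here is a minimal Python sketch. It is not the algorithm used by the software described above, which is not public. It assumes a simple length-ratio heuristic: for each pair it compares the English word count with the Chinese character count, and flags pairs whose ratio is far from the median ratio of the file. The names `length_ratio` and `flag_suspicious_pairs` and the `tolerance` value are made up for illustration.

```python
from statistics import median


def length_ratio(english, chinese):
    """English word count divided by Chinese character count (whitespace ignored)."""
    en_len = len(english.split())
    zh_len = len("".join(chinese.split()))
    return en_len / max(zh_len, 1)


def flag_suspicious_pairs(pairs, tolerance=2.0):
    """Return the indices of pairs whose length ratio looks unusual.

    `tolerance` is an illustrative assumption: a pair is flagged when its
    ratio is more than `tolerance` times larger or smaller than the median.
    """
    ratios = [length_ratio(en, zh) for en, zh in pairs]
    if not ratios:
        return []
    typical = median(ratios)
    flagged = []
    for index, ratio in enumerate(ratios):
        if ratio > typical * tolerance or ratio < typical / tolerance:
            flagged.append(index)
    return flagged
```

A proofreader would still have to look at every flagged pair by hand; a heuristic like this only suggests where to look first, and it will miss misalignments between sentences of similar length.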

63 Parallel Corpus Automatic Alignment Software v1.0 200217186 2003SR2285

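  The purpose of this software is to automatically convert article-level aligned Chinese-English bilingual corpora into sentence-level aligned Chinese-English parallel corpora, so that the statistical knowledge needed for machine translation can be obtained by studying the correspondences between sentence pairs. The software accepts a set of corresponding Chinese and English articles supplied by the user, automatically finds the mutually corresponding Chinese and English sentences, and finally outputs a sentence-aligned file in a fixed format. Its main technical feature is the automatic alignment function: using a discrimination algorithm, the system finds a set of mutually corresponding sentences in a pair of Chinese and English articles and thus completes sentence alignment automatically. The software is suitable for translation companies that use computer-aided translation tools and for research areas that rely on bilingual parallel corpora, such as machine translation; it is used to generate sentence-aligned bilingual corpora automatically from corresponding articles during the creation of bilingual parallel corpora.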
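For anyone curious how going from article-level to sentence-level alignment can work in principle, below is a small Python sketch. It is not the algorithm of the software above, which is not public. It assumes a simplified length-based dynamic-programming approach, loosely in the spirit of Gale and Church's method: sentences that are translations of each other tend to have proportional lengths, so the cheapest path through a cost table gives a plausible alignment. The function name `align_by_length` and both penalty values are illustrative assumptions, and the input must already be split into sentences.

```python
def align_by_length(en_sents, zh_sents, skip_penalty=10.0, merge_penalty=5.0):
    """Align two sentence lists using character lengths only.

    Returns a list of (english_indices, chinese_indices) groups.
    The penalties are hypothetical values, expressed in characters.
    """
    en_len = [len(s) for s in en_sents]
    zh_len = [len(s) for s in zh_sents]
    # Expected English characters per Chinese character, estimated from this text pair.
    scale = sum(en_len) / max(sum(zh_len), 1)

    # Allowed groupings: (English sentences used, Chinese sentences used, penalty).
    moves = [(1, 1, 0.0), (1, 0, skip_penalty), (0, 1, skip_penalty),
             (2, 1, merge_penalty), (1, 2, merge_penalty)]

    n, m = len(en_sents), len(zh_sents)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in moves:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                # Length mismatch between the grouped English and Chinese sentences.
                mismatch = abs(sum(en_len[i:ni]) - scale * sum(zh_len[j:nj]))
                candidate = cost[i][j] + mismatch + penalty
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)

    # Walk back from the end of both texts to recover the chosen groups.
    groups = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        groups.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    groups.reverse()
    return groups
```

For example, calling `align_by_length(["I got up early.", "It was raining.", "So I took an umbrella and walked to the station."], ["我起得很早。", "天在下雨,所以我带了伞,走去车站。"])` pairs the first English sentence with the first Chinese one and groups the last two English sentences with the second Chinese sentence. Length alone is a weak signal for English and Chinese, so output like this would still need checking by hand.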
 