Word alignment for English-Chinese parallel sentences

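Most corpus alignment algorithms work at the sentence level.
I have been thinking about how to implement a word-level alignment algorithm. Let's discuss it together; many heads are better than one.
Let me start with a question.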
she is a beautiful girl in the beautiful world!
她是这个美丽世界里的一个漂亮女孩。

During alignment, how can position information be used to align "美丽" to the second "beautiful" and "漂亮" to the first "beautiful"?
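To show why position alone is tricky here, the rough sketch below picks, for each Chinese word, the candidate "beautiful" whose relative position in the sentence is closest. The Chinese segmentation is done by hand, and the helper name `closest_by_relative_position` is made up for this illustration; it is not taken from any alignment toolkit.

```python
def closest_by_relative_position(src_index, src_len, tgt_candidates, tgt_len):
    """Return the candidate target index whose relative position (0..1)
    is closest to the relative position of the source word."""
    src_rel = src_index / max(src_len - 1, 1)
    return min(
        tgt_candidates,
        key=lambda j: abs(j / max(tgt_len - 1, 1) - src_rel),
    )


# Hand-segmented Chinese sentence (the segmentation is an assumption).
zh = ["她", "是", "这个", "美丽", "世界", "里", "的", "一个", "漂亮", "女孩"]
en = ["she", "is", "a", "beautiful", "girl", "in", "the", "beautiful", "world"]

# Target positions of the ambiguous word.
beautiful_positions = [j for j, w in enumerate(en) if w == "beautiful"]

meili = closest_by_relative_position(
    zh.index("美丽"), len(zh), beautiful_positions, len(en)
)
piaoliang = closest_by_relative_position(
    zh.index("漂亮"), len(zh), beautiful_positions, len(en)
)

print("美丽 ->", meili)      # index 3: the FIRST "beautiful"
print("漂亮 ->", piaoliang)  # index 7: the SECOND "beautiful"
```

On this sentence pair the heuristic returns exactly the opposite of the desired alignment, because the two modified nouns appear in reversed order in Chinese and English. So a plain "nearest position" score is not enough by itself.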
 
Re: Word alignment for English-Chinese parallel sentences

Contextual information such as collocations in source and target languages might be of help in such cases.
 
Re: Word alignment for English-Chinese parallel sentences

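Use contextual information: for example, "美丽" modifies "世界", and the second "beautiful" modifies "world".
Once "世界" has been aligned to "world", "美丽" can be aligned to the "beautiful" next to it, can't it?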
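A minimal sketch of that idea, under stated assumptions: the toy `lexicon`, the hand segmentation, and the function name `align_with_anchors` are all hypothetical and exist only for this illustration. Words with exactly one candidate are aligned first as anchors; each ambiguous word then takes the candidate that sits directly next to the target of an already-aligned neighbour.

```python
def align_with_anchors(src, tgt, lexicon):
    """Align unambiguous words first, then resolve ambiguous words
    through the alignment of their immediate neighbours."""
    # Candidate target positions for every source position.
    cands = {
        i: [j for j, t in enumerate(tgt) if t in lexicon.get(s, ())]
        for i, s in enumerate(src)
    }
    # Step 1: anchors are source words with exactly one candidate.
    alignment = {i: c[0] for i, c in cands.items() if len(c) == 1}
    # Step 2: an ambiguous word takes the candidate adjacent to the
    # target position of an aligned right or left neighbour.
    for i, c in cands.items():
        if len(c) < 2:
            continue
        for n in (i + 1, i - 1):
            if n in alignment:
                near = [j for j in c if abs(j - alignment[n]) == 1]
                if len(near) == 1:
                    alignment[i] = near[0]
                    break
    return alignment


# Toy bilingual lexicon (hypothetical, for this example only).
lexicon = {
    "她": {"she"}, "是": {"is"}, "一个": {"a"}, "这个": {"the"},
    "美丽": {"beautiful"}, "漂亮": {"beautiful"},
    "世界": {"world"}, "女孩": {"girl"},
}
zh = ["她", "是", "这个", "美丽", "世界", "里", "的", "一个", "漂亮", "女孩"]
en = ["she", "is", "a", "beautiful", "girl", "in", "the", "beautiful", "world"]

result = align_with_anchors(zh, en, lexicon)
print("美丽 ->", result[zh.index("美丽")])  # 7, the "beautiful" before "world"
print("漂亮 ->", result[zh.index("漂亮")])  # 3, the "beautiful" before "girl"
```

The sketch only looks at immediate neighbours and requires a unique adjacent candidate, so it leaves a word unaligned whenever the neighbours are themselves ambiguous.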
But the problem is far from that simple.
Example:
这个美丽女孩比那个美丽女孩聪明。
this beautiful girl is cleverer than that beautiful girl.
Here the contexts that follow the two words are nearly identical. How should this case be handled?
 
Re: Word alignment for English-Chinese parallel sentences

Word alignment is of course much more complicated than that. The following paper might serve as a good starting point for research in this area:

Piao, Scott Songlin (2002) Word Alignment in English–Chinese Parallel Corpora. Literary and Linguistic Computing 17(2):207-230.
 
Re: Word alignment for English-Chinese parallel sentences

Could I trouble Xiaoz to upload this paper? Thanks!
 
Re: Word alignment for English-Chinese parallel sentences

Thanks a lot, Xiaoz and Jiajin!
 
