[Help] How can I extract all the relative clauses from a corpus?

xuyi

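Regular member
I saw earlier that pied piping can be extracted, so relative clauses should naturally be no problem either. For the experts here this is probably a trivial question, but I don't know how to do it :( My thinking is simply that anything of the form DP followed by that/whose/(prep) which/who/whom, or DP DP V VP, should roughly be a relative clause.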

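Could someone tell me which software is most convenient for this, and give me a hint on how to use it?
I assume WordSmith should work? I downloaded WS3, but I still can't figure out how to use it.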

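I actually asked Biber about this earlier, and the guy told me to search online with Google. I found SUSANNE, but I didn't understand it and I'm not sure about it either. So I thought I'd better ask here. Thanks, everyone!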
 

xiaoz

Permanent super administrator
Staff member
SUSANNE is a parsed corpus that can be downloaded from Sampson's site

http://www.grsampson.net/Resources.html

But to extract relative clauses you don't actually need a parsed corpus (although one would of course make the task easier). A POS-tagged corpus is enough. For example, typical relative clauses can be extracted from CLAWS-tagged data using the following file-based search patterns:

1) Marked relative clauses:
*_N* that_CST
*_N* which_DDQ*
*_I* which_DDQ*
,_, which_DDQ*
*_N* who_DDQ*
,_, who_DDQ*
*_N* whom_DDQ*
*_I* whom_DDQ*
,_, whom_DDQ*

2) See http://bowland-files.lancs.ac.uk/corplang/cbls/zipfiles/patterns.zip
for 8 file-based search patterns for THAT-deletion (thatdel1.txt - thatdel8.txt)
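
For anyone without a concordancer that accepts such patterns, the first group can be approximated with regular expressions. Below is a minimal Python sketch, assuming the corpus is plain text in horizontal `word_TAG` format with whitespace between tokens. The function names and the sample sentence are made up for illustration.

```python
import re

# Search patterns from the post: '*' is a wildcard, tokens are word_TAG.
PATTERNS = [
    "*_N* that_CST",
    "*_N* which_DDQ*",
    "*_I* which_DDQ*",
    ",_, which_DDQ*",
    "*_N* who_DDQ*",
    ",_, who_DDQ*",
    "*_N* whom_DDQ*",
    "*_I* whom_DDQ*",
    ",_, whom_DDQ*",
]

def pattern_to_regex(pattern):
    """Turn one wildcard pattern into a compiled regex over word_TAG text."""
    tokens = []
    for token in pattern.split():
        # Escape the token literally, then let '*' match any run of
        # non-space characters (so it never crosses a token boundary).
        tokens.append(re.escape(token).replace(r"\*", r"\S*"))
    # Each hit must start and end on a token boundary.
    return re.compile(r"(?<!\S)" + r"\s+".join(tokens) + r"(?!\S)")

def find_hits(tagged_text, patterns=PATTERNS):
    """Return (pattern, matched text) pairs for every hit, pattern by pattern."""
    hits = []
    for pattern in patterns:
        for match in pattern_to_regex(pattern).finditer(tagged_text):
            hits.append((pattern, match.group(0)))
    return hits

# Hypothetical tagged sample, for illustration only.
sample = ("the_AT book_NN1 that_CST I_PPIS1 read_VVD and_CC "
          "the_AT house_NN1 in_II which_DDQ she_PPHS1 lives_VVZ")
for pattern, text in find_hits(sample):
    print(pattern, "->", text)
```

The hits are only candidates; they still have to be read in context.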
 

xuyi

Regular member
Thank you very much!~ But since CLAWS is not freely available, can I use GoTagger or something else for the same purpose? (I want to extract the RCs in CLEC, so it is too big for the trial version of CLAWS.)
 

xiaoz

Permanent super administrator
Staff member
I think you can, but you will need to switch to the POS tags used in GoTagger.
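
To illustrate what switching tags involves, here is a rough Python sketch that rewrites the CLAWS-based patterns for a Penn Treebank-style tagset. It is an assumption that the other tagger outputs `word_TAG` with tags of this kind, and every tag in the mapping table is likewise an assumption, so check the tagger's own tag list before relying on it.

```python
# Hypothetical mapping from the CLAWS-based pattern tokens to Penn
# Treebank-style equivalents. Every right-hand side is an assumption
# that must be verified against the tagger's documentation.
CLAWS_TO_PENN = {
    "*_N*": ["*_NN*"],
    "*_I*": ["*_IN"],
    ",_,": [",_,"],
    # 'that' may be tagged as a wh-determiner or as a subordinator,
    # so both variants are searched.
    "that_CST": ["that_WDT", "that_IN"],
    "which_DDQ*": ["which_WDT"],
    "who_DDQ*": ["who_WP"],
    "whom_DDQ*": ["whom_WP"],
}

def convert_pattern(claws_pattern, mapping=CLAWS_TO_PENN):
    """Expand one two-token CLAWS pattern into its Penn-style variants."""
    first, second = claws_pattern.split()
    return [f"{a} {b}" for a in mapping[first] for b in mapping[second]]

print(convert_pattern("*_N* that_CST"))
print(convert_pattern(",_, who_DDQ*"))
```

One CLAWS pattern can expand into several patterns in the other tagset, so the converted list is usually longer than the original one.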
 

xujiajin

Administrator
Staff member
Re: [Help] How can I extract all the relative clauses from a corpus?

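The search patterns above do not distinguish nominal clauses from relative clauses, especially *_N* that_CST, so the results still need to be checked manually after the search.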
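
The manual check can be ordered so that the most suspicious hits come first. The Python sketch below takes the `*_N* that_CST` hits and sets aside those whose head noun often takes a that-complement ("the fact that ..."). The noun list is illustrative and far from exhaustive, the names are made up, and plural forms are not handled; a flagged hit is only a hint and still has to be read in context.

```python
import re

# Illustrative, non-exhaustive list of nouns that often take a
# that-complement clause. This list is an assumption, not from the thread.
COMPLEMENT_NOUNS = {"fact", "idea", "belief", "claim", "view", "hope", "news"}

# Same shape as the '*_N* that_CST' pattern, capturing the noun itself.
HIT = re.compile(r"(?<!\S)(\S+)_N\S*\s+that_CST(?!\S)")

def triage(tagged_text):
    """Split '*_N* that_CST' hits into likely nominal clauses and the rest."""
    check_first, others = [], []
    for match in HIT.finditer(tagged_text):
        noun = match.group(1).lower()  # word form only, no lemmatisation
        target = check_first if noun in COMPLEMENT_NOUNS else others
        target.append(match.group(0))
    return check_first, others

# Hypothetical tagged sample, for illustration only.
sample = ("the_AT fact_NN1 that_CST he_PPHS1 left_VVD and_CC "
          "the_AT book_NN1 that_CST he_PPHS1 wrote_VVD")
print(triage(sample))
```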
 

oscar3

Senior member
Re: [Help] How can I extract all the relative clauses from a corpus?

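The method above can retrieve relative clauses in which the relative pronoun has not been omitted, but it cannot handle relative clauses in which the relative pronoun has been omitted, e.g. Sam found his wallet (which) he had lost.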
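
A rough way to collect candidates for this case is to look for a noun followed directly by a subject pronoun and a verb. The Python sketch below is only a heuristic of my own, not the content of the thatdel pattern files mentioned earlier in the thread. It assumes CLAWS C7-style tags in `word_TAG` format, misses clauses whose subject is a full noun phrase, and will return false hits wherever two clauses meet without punctuation.

```python
import re

# Heuristic: head noun + subject personal pronoun + verb,
# e.g. "wallet_NN1 he_PPHS1 had_VHD". Tag prefixes assume CLAWS C7.
ZERO_REL = re.compile(
    r"(?<!\S)\S+_N\S*\s+"        # head noun
    r"\S+_PP(?:HS|IS|Y)\S*\s+"   # subject pronoun (he, she, they, I, we, you)
    r"\S+_V\S*(?!\S)"            # verb
)

def zero_relative_candidates(tagged_text):
    """Return candidate spans for relative clauses with no relative pronoun."""
    return [m.group(0) for m in ZERO_REL.finditer(tagged_text)]

# The example sentence from the post, with hypothetical C7-style tags.
sample = ("Sam_NP1 found_VVD his_APPGE wallet_NN1 "
          "he_PPHS1 had_VHD lost_VVN ._.")
print(zero_relative_candidates(sample))
```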
 