How can I search a POS-tagged text for all instances of get + adj./pp?

#1
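Here is part of the tagged text. I don't know how to retrieve what I need from it. I've read the earlier posts, but I still can't work out how to write the search expression. What a headache.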
According_PRP to_PRP Joseph_NP0 Fletcher_NP0 ,_PUN it_PNP includes_VVZ
people_NN0 getting_VVG incurable_AJ0 dease_NN1 [_PUL fm1_UNC ,_PUN -_PUN
]_PUR and_CJC people_NN0 in_PRP a_AT0 helpless_AJ0 condition_NN1 ,_PUN
such_PRP as_PRP trapping_VVG in_PRP a_AT0 blazing_AJ0 fire_NN1 ._SENT
-----_PUN
[_PUL sn8_UNC ,_PUN s_ZZ0 ]_PUR In_PRP China_NP0 ,_PUN suicide_NN1
is_VBZ legal_AJ0 ,_PUN which_DTQ means_VVZ ,_PUN people_NN0 are_VBB legal_AJ0
to_TO0 kill_VVI themselves_PNX in_PRP a_AT0 helpless_AJ0 condition_NN1 ,_PUN
so_AV0 what_DTQ we_PNP conside_NN1 [_PUL fm1_UNC ,_PUN -_PUN ]_PUR
is_VBZ only_AV0 whether_CJS it_PNP is_VBZ legal_AJ0 to_TO0 end_VVI the_AT0
life_NN1 of_PRF a_AT0 [_PUL np7,1-_UNC ]_PUR incurable_AJ0
patient_NN1 ._SENT -----_PUN
I_PNP am_VBB in_PRP favor_PRP of_PRP the_AT0 legalization_NN1 of_PRF
euthanasia_NN1 ,_PUN though_CJS some_DT0 others_NN2 against_PRP it._NN0
[_PUL sn8_UNC ,_PUN s_ZZ0 ]_PUR Those_DT0 who_PNQ against_PRP it_PNP
agues_VVZ [_PUL fm1_UNC ,_PUN -_PUN ]_PUR that_DT0 euthanasia_NN1
is_VBZ inhumane_AJ0 ._SENT -----_PUN
It_PNP is_VBZ a_AT0 false_AJ0 argument_NN1 ._SENT -----_PUN
Death_NN1 ,_PUN most_DT0 of_PRF the_AT0 time_NN1 ,_PUN is_VBZ the_AT0 end_NN1
of_PRF long_AJ0 suffering_AJ0 period_NN1 ._SENT -----_PUN
With_PRP the_AT0 advanced_AJ0 medical_AJ0 techinique_NN1 [_PUL fm1_UNC
,_PUN -_PUN ]_PUR and_CJC equipment_NN1 ,_PUN human_AJ0 life_NN1 can_VM0
be_VBI extended_VVN ._SENT -----_PUN
On_PRP one_CRD hand_NN1 ,_PUN it_PNP is_VBZ in_PRP deed_NN1 a_AT0 good_AJ0
thing_NN1 to_TO0 provide_VVI people_NN0 health_NN1 when_CJS the_AT0
diseases_NN2 are_VBB curable_AJ0 ;_PUN on_PRP the_AT0 other_AJ0 hand_NN1 ,_PUN
it_PNP is_VBZ rather_AV0 a_AT0 bad_AJ0 thing_NN1 to_TO0 extend_VVI people_NN0
's_POS suffering_NN1 when_CJS the_AT0 diseases_NN2 are_VBB incurable_AJ0
 
#4
Re: How can I search a POS-tagged text for all instances of get + adj./pp?

I... am probably not your senior.
Could you describe your problem in more detail?
 
#5
Re: How can I search a POS-tagged text for all instances of get + adj./pp?

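Haha, a "senior" in corpus work, I mean. I'm writing a term paper on CIA, and I want to retrieve every instance of get + adj./pp from two small corpora. I don't know how to write the query, though. I've already tagged the texts with TreeTagger, but I can't figure out how to search them.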
 
#6
Re: How can I search a POS-tagged text for all instances of get + adj./pp?

The question still isn't clear! What does adj./pp stand for here: adj or pp? And the sample isn't large enough either!
Try this: get(\w+)?_\w+ (\w+_AJ.|\w+_PR.)
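
For anyone who wants to try this kind of pattern outside a concordancer, here is a minimal Python sketch. It assumes C5-style tags like those in the sample from #1 (AJ0 for adjectives, VVN for past participles). The word boundary, the explicit list of forms of get, and the reading of "pp" as past participle are all assumptions added for illustration, not part of the pattern above.

```python
import re

# Assumed C5-style tags, as in the sample from #1:
# AJ0/AJC/AJS = adjective, VVN = past participle.
# \b keeps "forget" and "target" from matching as forms of "get".
GET = r"\b(?:get|gets|getting|got|gotten)_V\w+"
GET_ADJ = re.compile(GET + r"\s+(\w+)_AJ\w", re.IGNORECASE)
GET_PARTICIPLE = re.compile(GET + r"\s+(\w+)_VVN", re.IGNORECASE)


def find_hits(pattern, text):
    """Return every full match of a compiled pattern in the tagged text."""
    return [m.group(0) for m in pattern.finditer(text)]


# Hypothetical sample line; the first clause is taken from the text in #1.
sample = ("people_NN0 getting_VVG incurable_AJ0 dease_NN1 and_CJC "
          "they_PNP got_VVD married_VVN ._SENT")

print(find_hits(GET_ADJ, sample))         # ['getting_VVG incurable_AJ0']
print(find_hits(GET_PARTICIPLE, sample))  # ['got_VVD married_VVN']
```

Keeping the adjective and participle searches as two separate patterns makes it easier to count and inspect each construction on its own.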
 

李亮1975重庆

#7
Illustrated tutorial: "A Crash Course in Corpus Searching: Searching POS-Tagged Corpora"

http://www.docin.com/p-466643054.html

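I tested it: in AntConc, enter "get_* *_pp" and "get_* *_adj" with no other settings, and the search returns the results you want, provided the encoding of your text file matches AntConc's default encoding. In short, these are all operations that I explain and demonstrate in detail in the tutorial above.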
 
#8
Re: How can I search a POS-tagged text for all instances of get + adj./pp?

Why should we avoid using regex, since the text is tagged? You may like the PatternBuilder developed by Prof. Liang. Highly recommended. For the regex, try this:
\b(get\w*|got)_VV\w*\s\S+_(JJ\w*|IN)\s

Please note that this cannot exclude unwanted hits like "get nice books". You may want to run the two patterns separately.
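
One possible way to run the two patterns separately and drop the "get nice books" type is sketched below in Python. It assumes TreeTagger's Penn-style tags, as the pattern above does (VV* for lexical verbs, JJ* for adjectives, IN for prepositions, NN*/NP* for nouns). The negative lookahead and the sample sentence are additions for illustration only.

```python
import re

# Assumed Penn-style TreeTagger tags, matching the pattern above.
GET = r"\b(?:get\w*|got)_VV\w*"

# get + adjective, rejecting hits where a noun directly follows the adjective.
# (?=\s|$) pins the end of the tag so backtracking cannot dodge the noun check.
GET_ADJ = re.compile(GET + r"\s\S+_JJ\w*(?!\s\S+_N[NP]\w*)(?=\s|$)")

# get + preposition, searched as its own pattern.
GET_PREP = re.compile(GET + r"\s\S+_IN(?=\s|$)")


def hits(pattern, text):
    """Return every full match of a compiled pattern in the tagged text."""
    return [m.group(0) for m in pattern.finditer(text)]


# Hypothetical tagged sentences for demonstration.
sample = ("I_PP got_VVD angry_JJ ._SENT She_PP gets_VVZ nice_JJ books_NNS "
          "._SENT We_PP get_VVP into_IN trouble_NN ._SENT")

print(hits(GET_ADJ, sample))   # ['got_VVD angry_JJ']
print(hits(GET_PREP, sample))  # ['get_VVP into_IN']
```

The lookahead only checks the token immediately after the adjective, so longer noun phrases such as "get very nice books" would still need manual checking in the concordance lines.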
 