How do I search for the passive voice in WS 3.0 in a corpus tagged with Go Tagger?

melia

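Junior member
Searching with BE+VBN (past participle) does not seem to be supported in WS. Could anyone help me work out how to run this search? Thanks!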
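
If WS cannot express the BE+VBN pattern, one workaround is a short script run over the tagged file. The following is only a rough sketch under stated assumptions: tokens are written as `word_TAG`, past participles carry the Penn-style tag `VBN`, and adverbs are tagged `RB`, `RBR` or `RBS`. The names `BE_FORMS`, `PASSIVE` and `find_passives` are made up for illustration. A CLAWS-style tagset, like the one in the tagged sample later in this thread, would need different tags (probably `VVN` for the participle).

```python
import re

# Assumption: tokens look like word_TAG and past participles are tagged VBN.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"

PASSIVE = re.compile(
    rf"(?<!\S){BE_FORMS}_\w+"   # a form of BE at the start of a token, any tag
    r"(?:\s+\S+_RB[RS]?)*"      # optional intervening adverbs (RB, RBR, RBS)
    r"\s+\S+_VBN(?!\S)",        # followed by a past participle
    re.IGNORECASE,
)

def find_passives(tagged_text):
    """Return every BE (+ adverbs) + VBN sequence found in the tagged text."""
    return [m.group(0) for m in PASSIVE.finditer(tagged_text)]

if __name__ == "__main__":
    sample = "The_DT book_NN was_VBD quickly_RB written_VBN by_IN him_PRP ._."
    for hit in find_passives(sample):
        print(hit)
```

A pattern like this over-generates (participial adjectives after BE also match) and misses passives with longer material between BE and the participle, so the hits still need manual checking.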
 
Re: How do I search for the passive voice in WS 3.0 in a corpus tagged with Go Tagger?

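How can I search both tagged (POS-tagged) and untagged text for the first word, and for the first two words, of every sentence, and display the results? A sentence here means one that ends in a period, a question mark or an exclamation mark.

Untagged example: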
Zinedine Zidane remains France's best-loved personality despite his head-butt against Italy's Marco Materazzi in the 2006 soccer World Cup final, a survey showed on Saturday.
Zidane came first in a ranking of France's Top 50 personalities, beating ex-tennis champion Yannick Noah who came in the second place, and leaving singers Charles Aznavour and Johnny Hallyday, as well as actor Gerard Depardieu behind.

Tagged example:
Zinedine_NP1 Zidane_NP1 remains_VVZ France_NP1 's_GE best-loved_JJ personality_NN1 despite_II his_APPGE head-butt_NN1 against_II Italy_NP1 's_GE
Marco_NP1 Materazzi_NP1 in_II the_AT 2006_MC soccer_NN1 World_NN1 Cup_NN1
final_NN1 ,_, a_AT1 survey_NN1 showed_VVD on_II Saturday_NPD1 ._.
Zidane_NP1 came_VVD first_MD in_II a_AT1 ranking_NN1 of_IO France_NP1 's_GE
Top_NN1 50_MC personalities_NN2 ,_, beating_VVG ex-tennis_JJ champion_NN1
Yannick_NP1 Noah_NP1 who_PNQS came_VVD in_II the_AT second_MD place_NN1 ,_,
and_CC leaving_VVG singers_NN2 Charles_NP1 Aznavour_NP1 and_CC Johnny_NP1
Hallyday_NP1 ,_, as_II31 well_II32 as_II33 actor_NN1 Gerard_NP1 Depardieu_NP1
behind_RL ._.

Thanks!
 
Re: How do I search for the passive voice in WS 3.0 in a corpus tagged with Go Tagger?

This is not easy to do with standard concordancers unless more markup is available. If you can program, it's a piece of cake. Otherwise, you can try taking advantage of Excel: open your text file in Excel, defining white space as the delimiter, then select the first two columns of the spreadsheet.
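
For the programming route, a minimal Python sketch for the untagged text might look like the one below. It assumes, as in the question, that a sentence ends in a period, question mark or exclamation mark followed by white space. The function name `sentence_openers` is hypothetical.

```python
import re

def sentence_openers(text, n=2):
    """Return the first n words of each sentence in untagged text."""
    # Split wherever ., ? or ! is followed by white space (including line breaks).
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    # Keep the first n white-space-separated words of every non-empty sentence.
    return [" ".join(s.split()[:n]) for s in sentences if s]

if __name__ == "__main__":
    sample = "Zinedine Zidane remains popular. Zidane came first! Who came second?"
    print(sentence_openers(sample, n=1))
    print(sentence_openers(sample, n=2))
```

Abbreviations such as "Mr." will be treated as sentence ends by this simple rule, so the output is worth a quick check.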
 
Re: How do I search for the passive voice in WS 3.0 in a corpus tagged with Go Tagger?

The problem is that when a .txt file is opened in Excel with white space defined as the delimiter, the whole file, with its many sentences, is displayed on a single line, which does not achieve the purpose.
 
Re: How do I search for the passive voice in WS 3.0 in a corpus tagged with Go Tagger?

You may need to set your sentence boundaries first. For example, open your text in EditPlus and put each sentence on its own line. (Following our previous discussion in another thread, you should know how to use EditPlus to achieve this.)

Once you have your text in one-sentence-per-line format, it should be easy to search for what you want in EditPlus too.
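
The same segmentation step can also be done outside EditPlus. Here is a rough Python sketch for the tagged text, assuming that sentence-final punctuation appears as its own token (`._.`, `?_?` or `!_!`, as in the sample above) and that line breaks inside the file are only hard wraps. The function name `one_sentence_per_line` is made up for illustration.

```python
import re

def one_sentence_per_line(tagged_text):
    """Rewrite word_TAG text so that each sentence sits on its own line."""
    # Undo the hard line wraps: collapse all white space to single spaces.
    flat = " ".join(tagged_text.split())
    # Start a new line after each sentence-final token such as ._. ?_? or !_!
    return re.sub(r"(?<!\S)([.?!]_\S+)\s+", r"\1\n", flat)

if __name__ == "__main__":
    sample = "Zidane_NP1 came_VVD\nfirst_MD ._. Who_PNQS won_VVD ?_?"
    print(one_sentence_per_line(sample))
```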
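
With the text in that format, pulling out the sentence openers is a few lines of script as well. This is a sketch only: it assumes one sentence per line and `word_TAG` tokens, and `first_words` is a hypothetical helper name.

```python
def first_words(lines, n=2, strip_tags=True):
    """Return the first n tokens of each non-empty line (one sentence per line)."""
    results = []
    for line in lines:
        tokens = line.split()[:n]
        if not tokens:
            continue  # skip blank lines
        if strip_tags:
            # Drop the part after the last underscore: Zidane_NP1 -> Zidane
            tokens = [t.rsplit("_", 1)[0] for t in tokens]
        results.append(" ".join(tokens))
    return results

if __name__ == "__main__":
    lines = [
        "Zinedine_NP1 Zidane_NP1 remains_VVZ ._.",
        "Zidane_NP1 came_VVD first_MD ._.",
    ]
    print(first_words(lines, n=1))
    print(first_words(lines, n=2))
```

Untagged lines work the same way, since a word with no underscore is returned unchanged.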
 
Re: How do I search for the passive voice in WS 3.0 in a corpus tagged with Go Tagger?

I see, the key is to segment the text into sentences.
Thanks, Laohong. Dr. Hong has always been insightful.
 