
  1. W


    Re: Overview of corpus construction in China. This list is quite comprehensive; thanks to Dr. Xu for taking the trouble.
  2. W

    Open corpus platform

    Re: Open corpus platform. I hope you will still say that after you have tried it. For the platform, we are putting forward the following arguments: 1) What texts should be included in a corpus? Any texts that meet the demands of the user. There is no such corpus that can meet the demands of...
  3. W

    Open corpus platform

    Re: Open corpus platform. We are currently preparing the help files. Once they are done, we will provide some trial user IDs and passwords so that those interested can try their hand. Ocorp was designed as a workbench on which users design their own markup scheme and construct their own collection of texts. I...
  4. W

    Open corpus platform

    Hi Pals, We are developing an open corpus platform that allows any user to construct his or her own corpus with annotations for data retrieval and KWIC analysis. We are planning to put it through a stress test, and you are welcome to join us. Cheers, WZ
  5. W


    Re: Is it correct to say that a corpus focuses on language use rather than language form? Please consult: Sinclair, John (ed.) (1987). Looking Up: An Account of the COBUILD Project in Lexical Computing and the Development of the Collins COBUILD English Language Dictionary. London: HarperCollins Publishers.
  6. W


  7. W


    Sure, we could obtain the frequency data and even the normalized frequency from the plot. But I really feel Liang's way of doing it is also quite neat, particularly the Excel part.
  8. W


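    At the July 2006 workshop in Beijing on "The Application of Corpora in Foreign Language Teaching and Research", several new research ideas deserved attention: 1) Dr. Liang Maocheng proposed compiling a word cluster list from a given text and then using that list to batch-search other texts, so that every concordance line comes with a corresponding file path. The list of file paths is processed in Excel and read into SPSS for frequency counts, which gives the frequency of a given word cluster in each individual text. If other parameters for these texts are also entered, correlation analyses or tests of difference can be run. A more advanced approach of Dr. Liang's is to extract only the grammatical patterns of these clusters, such as N + N + V + Adjective + N, which makes the analysis more meaningful. ...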
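The per-text cluster counting described in this post can be approximated in a few lines of Python. This is only a rough sketch, not Dr. Liang's actual workflow, which used a concordancer, Excel and SPSS: here the counting is done directly on the texts and the result is written as CSV, which both Excel and SPSS can import. All function names, the simple tokenizer and the sample file names in the usage note are assumptions made for illustration.

```python
import csv
import io
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_clusters(text, n=2, k=5):
    """Return the k most frequent n-word clusters in a reference text."""
    tokens = tokenize(text)
    grams = Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return [gram for gram, _ in grams.most_common(k)]


def count_cluster(text, cluster):
    """Count occurrences of one cluster (a space-separated string) in a text."""
    tokens = tokenize(text)
    target = cluster.split()
    n = len(target)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)


def cluster_table(texts, clusters):
    """Build one row per text: its name followed by the frequency of each cluster."""
    return [[name] + [count_cluster(body, c) for c in clusters]
            for name, body in texts.items()]


def to_csv(rows, clusters):
    """Serialise the table as CSV with a header row of cluster names."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["file"] + list(clusters))
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, with `texts = {"essay1.txt": "I think that it is good. I think that it works.", "essay2.txt": "It is good to think."}` and `clusters = ["i think", "it is"]`, calling `cluster_table(texts, clusters)` gives one row per text with the frequency of each cluster in that text. Further per-text variables (scores, proficiency level and so on) could then be added as extra columns before running correlation or difference tests.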
  9. W


    One can also try Hot Potatoes for exercise generation at http://www.halfbakedsoftware.com/hot_pot.php. It is free once one has registered.
  10. W


    It's terrific to have this pick'n poke. That helps a lot for further improvement of COLSEC. Please find more inconsistencies so that we can update the corpus accordingly. Dr Xiao, be careful when you try to tag COLSEC, because it has so many non-words. Maybe one needs to check manually for...
  11. W


  12. W

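    [Repost] Corpus Linguistics Week

    Friends from out of town may prefer to attend the later sessions. The first two cover the basics and are aimed mainly at beginners. There are other arrangements for the weekend, for which we apologize. We will do our best to arrange with the speakers to put the PPT slides online. As for recordings, we will discuss that with the authors when the time comes. But please spare us for the first two sessions; we are feeling rather unsure of ourselves. Wolfgang will speak first in Beijing and at Tianjin University of Science and Technology, so people can attend whichever is closer.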
  13. W

    CLEC and COLSEC by wzli

    Thanks to xujiajin for posting this here. There is one typo: COLEC should be COLSEC, which stands for College Learner Spoken English Corpus.
  14. W

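    [Casual chat] Discussion of corpus technology

    Re: [Casual chat] Discussion of corpus technology. The COLSEC documentation and preliminary studies, with an accompanying CD, have been handed to Shanghai Foreign Language Education Press for publication and should be out by now. An online demo of CAST can be found under "corpus online search" at http://corpus.sjtu.edu.cn/DDL/INDEX.HTM. However, this search system runs on a separate server, which may often be shut down.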
  15. W

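    [Casual chat] Discussion of corpus technology

    A brief look back. Work on CLEC began in 1996, a first version took shape in 1999, and the corpus was truly finished around 2001. Some seven or eight universities and several dozen people took part. From sampling, manual input and proofreading through tagging to final integration, the work was mechanical and heavy, and it was essentially voluntary labor; the hardships involved are hard to convey to outsiders. Although this was nominally a national project, the funding amounted to only about ten thousand yuan. Never mind payment for labor: even when the project team met for discussions, members paid out of their own pockets. ...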