[Discussion] What are the flaws of CLEC?

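I am from Taiwan. Because Taiwan has no learner corpus of Chinese speakers, I was quite happy when I learned that CLEC existed. Through various channels I finally managed to buy it from Shanghai. I wanted to use CLEC for my master's thesis, but once my thesis proposal was finished and I started analyzing the data, I found that the tags in CLEC are riddled with errors. I have not studied all the tags, but within the scope of my own research the accuracy is seriously problematic. I lost confidence in the results produced by programs and had to fall back on manual checking, picking out the errors word by word.
Since this is a corpus built with so much money and manpower, how can it serve as a research tool if it contains too many errors?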
 
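Sorry, I don't know how to quote a reply, so I'll have to quote it this way.

Post #19

Quoting dinooja's post of 2006-3-17 13:50:02:
Saying "elders should not fool the juniors either" may be going too far. I think Professor Yang and Professor Gui certainly wanted to make this corpus perfect.
The problems that have emerged are exactly where we juniors and later scholars should keep working.

Please note: nobody said that Professor Yang and Professor Gui were fooling the juniors! The earlier remark was about the general situation. Are cases of juniors being fooled in academia so rare? There is no need to air them all here, is there?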


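The points I wanted to make are:
1. Saying "elders should not fool the juniors either" may be going too far
2. Professor Yang and Professor Gui certainly wanted to make this corpus perfect
3. The problems that have emerged are exactly where we juniors and later scholars should keep working

Nothing in my wording implied anything that would call for a rejoinder like "nobody said that Professor Yang and Professor Gui were fooling the juniors!" If there was a misunderstanding, forgive me for not expressing myself clearly. In fact, only yesterday I was corresponding with the original poster and told him how much I admire his pointing out the errors.

One more point: item 1 above rests mainly on the following view:

Academic criticism should stay academic criticism; saying anything beyond that does no good.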
 
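Passive-voice errors in CLEC: vp7 denotes an error of voice. But may I ask everyone, are the following three ST5 sentences passive-voice errors? However I look at them, I cannot see anything wrong with the passive voice. Should data like these be removed from the collected data?!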
1) Just imagine [vp7,1-4] what will it be without electricity.
2) Just imagine [vp7,1-5] what would happen to us if there is [vp8,2-2] no detergent?
3) Imagine [vp7,1-3] what would happen if suddenly there was no electricity?
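
One way to review a suspicious tag such as vp7 is to pull every occurrence out of the text together with its surrounding words and read them side by side. The following is only a minimal sketch: the `[code,span]` tag shape is inferred from the examples quoted in this thread, not from the CLEC documentation, and `tagged_contexts` is a hypothetical helper name.

```python
import re

# Assumed tag shape, inferred only from examples quoted in this thread:
# [code,span], e.g. [vp7,1-4] or [fm1,-]. Tags with an empty code are skipped.
TAG = re.compile(r"\[([a-z]{2}\d),\s*([^\]]*)\]")

def tagged_contexts(text, code, width=30):
    """Yield (code, span, left context, right context) for each tag with this code."""
    for m in TAG.finditer(text):
        if m.group(1) == code:
            left = text[max(0, m.start() - width):m.start()].strip()
            right = text[m.end():m.end() + width].strip()
            yield m.group(1), m.group(2), left, right

sample = (
    "Just imagine [vp7,1-4] what will it be without electricity. "
    "Just imagine [vp7,1-5] what would happen to us if there is [vp8,2-2] no detergent?"
)

hits = list(tagged_contexts(sample, "vp7"))
for hit in hits:
    print(hit)
```

Listing the contexts this way makes it easier to judge by eye whether a tag is consistently misapplied before deciding to drop those instances.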
 
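We are not factory quality inspectors who find quality problems and then leave them for someone else to fix. Plenty of people in academia spend a lifetime finding problems without ever proposing an effective way to solve them. That kind of scholar is not worth being!!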
 
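Re: [Discussion] What are the flaws of CLEC?

Quoting windyyw's post of 2006-3-21 18:05:48:
Passive-voice errors in CLEC: vp7 denotes an error of voice. But may I ask everyone, are the following three ST5 sentences passive-voice errors? However I look at them, I cannot see anything wrong with the passive voice. Should data like these be removed from the collected data?!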
1) Just imagine [vp7,1-4] what will it be without electricity.
2) Just imagine [vp7,1-5] what would happen to us if there is [vp8,2-2] no detergent?
3) Imagine [vp7,1-3] what would happen if suddenly there was no electricity?

Of course they should be removed!

Continuing with topic 2: erroneous annotation


In the Cultural Revolution, his family background brought him persecution. He was barred in a barn for ten years. During the years, he read some books that he had hidden secretly. It was books that helped him go through those hardest days. Only when he came to my mother's factory in the 80's could he continue his work and study. He said to me, "Some people think I'm a bit foolish. They don't understand why I keep on working instead of enjoying the days I have left me [sn8,s-]. And since I am quite old now, it seems unworthy to spend so much money on books. You know that the Cultural Revolution has [vp6,6-4] (no tense error; just delete "me") taken me ten years. I feel lucky that I could go through it, moreover, I still have the ability to work now, so why shouldn't I work harder to make up for the lost ten years[sn1,s]love books and one is never too old to learn. I give books to children because I hope they will love books and realize the importance of books too. Remember: You young people is [vp3,3-5] the future of the country. It is very important for you to learn as much as possible, then you can serve the country." Hearing his words, I came to understand him. I saw a noble soul. He is the man I respect very much.


<ST 5><SEX 2><Y 8><SCH Zhongda><AGE 20><WAY 1><DIC 1><TYP 2>
My Grandmother My grandmother is a kind peasant woman. She spends most of her life in the village. Though she is little [wd3,2-1] educated, she has a good sense. I used to live with her and my counsins [fm1,-] when I was small because my parents were too busy to look after me . She takes good care of her grandchildren and gives them every comfort. They all love her and she loves them also. My grandmother had too much to do in running a house. As our family was not wealthy, she had to do everything herself. She got up the earliest in the morning and sleeped [fm2,-] late at night. She fed the pigs, chickens, ducks and geese, and watered the vegetables. She prepared the meals and washed the clothes. In the harvest season, she must help in the field. She worked hard yet [wd5,3-2] without complaining. She is also a thrifty and industrious woman. She saves every penny as she can. As she had been [vp6,2-6,t7-t3] busy ever since she was young, she looks older than she really is. Her face is wrinkled, her hair turns silver white, and some of her teeth become movable [wd3,3-] . But she works as hard as ever. Though her grandchildren have grown up and the family condiction [fm1,-] has improved, she refuses to do nothing. Now she is still raising a group of chickens and planting a patch of vegetable. Often she says to us, "Work while you work, play while you play. If you do not work, you will become lazy and of no use." What good advice she has given to us! She had just been [vp6, 3-8] (no tense error) to school for several years but she taugh [fm1,-] us how to act through her experience. We must live up to it and always keep it in mind.


Dear Tom, I have [vp6,0-6,t3-t5] came to see you at 3. 00 p. m, but you aren't [,10- 0] in. What a pity it is. Because we haven't seen each other for ages since you went to Beijing the month before last. I have many things to tell you, and I think you say [fm1,-] , too. I will be glad o hear your stories on your trip. Since you are not in, and I am not sure when you will be back, I wrote [vp6,18- 0,t5-t3] this note . I hope you can give me a phone-call soon after you have [vp6,11- 0,] (no tense error) read this. I'm looking forward to seeing you. Good-bye. Yours. Dear Sir: I used to be a resident in Dome Street 10. A week ago, I bought a new house in Tomson Street 23 and moved into it.


<ST 5> <SEX 2> <y 10> <SCH WSD> <AGE 20> <WAY 2><DIC 2> <TYP 1>Advertising has been more and more prosperous in recent years. We can see many bill-boards along the streets, or a lot of classified advertisements on TV, even a full-page advertisement in newspapers. Advertising has been [wd3,3-6] a part of our everyday life [np6,6-0]. The university is [wd4,1-2] main place to cultivate [wd3,1-1] knowledgable [fm1,-] and law-abiding [fm1,-] citizens. however [fm3,-] , we do not feel so secure when here [wd3,1-s] arebicycle theft, clothes theft, [wd4,6-3] books theft occurring, how could we feel sure about building a secure and civilized society. Friendship is like sunshine in our life [np3, 2-0] . it gives us happiness [sn9,-] encouragement and inspiration. I think that friendship is the biggest wealth in our life [np6,3-0] . I have always had [vp6,4-9] several good friends (no tense error; change "in" to "during") in the different periodS of [wd4,5-1] MY life. Before going to school, I played a lot with Sunny, who was a bright and helpful girl. Under her influence, I became more cheerful and intelligent .In middle school [sn9,-] I made friends with Kate [sn9,-] who was sanguine and forthright [wd7,5-0] . Kate was my best admirer and severest [aj3,1-1] criticizer [wd2,1-0]. She encouraged me when I failed and pointed out my faults bluntly [fm1,-]. “IN” IS MORE NORMALLY USED THAN “WITH” with the company [fm1,-] of my good friends, I have lived [vp6,10-5] (no tense error; delete "my") a happy and full life.whenever I face difficulties, my friends are there for me. their love provides me with support, their help has solved many of my problem [np3, 4-0] . I treasure the friendship I have. I also realize that friendship doesn't only mean receiving. Understanding lienience [fm1,-] and selflessness [fm1,-] are [np6,5-6] important elements to cultivate [wd3,3-3] real friendship.


<ST 5> <SEX 1> <Y 8> <SCH WSD> <AGE 20> <WAY 2> <DIC 2> <TYP 1> Many young people in China are longing for the chance to study abroad because which [pr2,s-] can not only benefit them a lot but also do good to our country. First, studying abroad is the precious opportunity for us to improve our study in some course. Though our country has developed [vp6,s-] (no tense error; delete "the" before "Western"; reaps → leaps; system → systems) by reaps [fm1,-] and bounds, we still can't keep up with the Western countries in some aspects, such as management system, science and education system. Since we're the [np7,-4] part of the world, we must build up good relationship between other from all over the work [wd3,7-] . Secondly, every year our government sends some learners to study abroad in order that they will make their contribution to our country after their return. Qiang Xueshen , a famous expert in missile, is devoting [vp2,-3] to our society in scientific field after his return from America.


< ST 5> <SEX 2> <y 10> <sch wsd> <AGE 20> <WAY 2><DIC 2> <TYP 2> My most memorable Mid-Autumn Festival night Usually, Mid-Autumn Festival night was all the same, full of joy. But, I have a special night on that day.[,s] when [fm3,-] I was ten year [np6,1-] old, I was sent to [wd4,-1] countryside, stayed in[wd3,-] my uncle's. Approaching the Mid-Autumn Festival, I was still [wd7,6-] happy, but on that night, everything changed. When my favorite food were [vp3,1-] set on [wd4,1-] table, everybody was ready to enjoy them [pr4,1-], I was sad at the sight of the moon, because I missed my family very much. So [wd5,-s] , at [wd5,-s] that night I was thinking about how lonely I was without my family around me to share the food night [wd7,-s] .Even [wd5,-s] tears came into my eyes. I couldn't tell how sorrow [wd2,-2] I was! Time flies, [sn9,-] that night is still my most memorable d-Autumn Festival night. I used to enjoy the Mid-autumn Festival night together with my parents. But [wd5,-s] this year, I don't [fm1,-] . I have [vp6,-s] (no tense error here) stayed in the campus. Because I have had [vp6,-s] (delete "had") an appointment with a young man with whom I almost fall [,-s] in love[sn2,s-].
I was very excited [fm1,-] and a little nervous that night. I'd never made an appointment with a young man at night before. I'd never fallen in love with a man before. On that night, we sat by the west Lake in [pp1,-2] the campus, looking at the bright full moon [sn9,s-] talking. We talked about our childhood [np6,1-], our present life [np6,2-] , our family members and Chinese literature about moon. To my joy, we are [fm3,-] both keen on Chinese literature and we have [,8-] much in common.

So I left home to one of my friends home the next day. I spent more than 10 days with my friend and when I came back, I found that my parents lost,-s] weight a lot [wd1,2-] and they looked older than before. I got confused and wondered what had happened at home. Later I got to know that they haven't [vp6,-s,t3-t7] gone [wd3,-s] to sleep for several days because they were worries [wd2,1-] about me and tried to ask for the [wd5,-s] information about my university entering [cc1,1-] . After knowing that , I got shock [np7, 1-] . Because they never asked about my study and my life and my future before and when I asked them for advice, they only said 'decide it by yourself'. [sn1,s]I had doubt [,-s]whether I was their daughter. By then, I knew they've been loving me, only they are not good at expressing their emotion and tried to make me independent. I went up and hug [,-s] them with teas [fm1,-] . When I was 15 years old, I went to visit my aunt who lives [,-s] in a city far away from Canton. It's [vp6, -10] the first time I have travelled [vp6,-s] (no tense error; along → so far away) along [wd3,1-].

(All of the above data come from ST5)
 
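Re: [Discussion] What are the flaws of CLEC?

Quoting iamkys2003's post of 2006-3-17 16:10:42:
I am from Taiwan. Because Taiwan has no learner corpus of Chinese speakers, I was quite happy when I learned that CLEC existed. Through various channels I finally managed to buy it from Shanghai. I wanted to use CLEC for my master's thesis, but once my thesis proposal was finished and I started analyzing the data, I found that the tags in CLEC are riddled with errors. I have not studied all the tags, but within the scope of my own research the accuracy is seriously problematic. I lost confidence in the results produced by programs and had to fall back on manual checking, picking out the errors word by word.
Since this is a corpus built with so much money and manpower, how can it serve as a research tool if it contains too many errors?
Your comments will very likely be helpful for any future improvement of CLEC.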
 
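I have begun to doubt whether error tagging of learner corpora is necessary at all.

Reasons:

1. Learner errors can be classified in all sorts of ways, so any annotation inevitably covers a few and misses many more (to exaggerate a little);
2. Machines simply cannot do the annotation, as many people have shown. Even if it is done by machine, the subsequent proofreading still takes a great deal of manual work;
3. If the annotation is done by hand, annotators understand the annotation standard to different degrees, which makes the results very inconsistent; besides, the labor and money it consumes are staggering;
4. Even with a fully annotated corpus, every user has to do research within the annotation scheme, which ties researchers' hands considerably.

Suggestion:

Learner corpora should receive POS tagging only. At present, both probabilistic and rule-based taggers reach an accuracy above 95% when POS-tagging learner corpora, which is enough to serve as a reliable data source. Each researcher can then retrieve learner errors with their own methods, such as n-grams. Of course, some will say that POS tagging is also a box, but it is a far bigger box than error tagging. Moreover, POS tagging is done automatically by computer, which saves time and effort and needs no manual intervention.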
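
As a rough illustration of the n-gram idea in the suggestion above, the sketch below searches a POS-tagged sentence for a tag sequence that may signal an agreement error. It assumes the corpus has already been tagged by some external tagger; the Penn-Treebank-style tags here are assigned by hand for one sentence quoted in this thread, and `ngrams` and `find_tag_pattern` are hypothetical helper names.

```python
from collections import Counter

def ngrams(seq, n):
    """Return all contiguous n-grams of seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def find_tag_pattern(tagged, pattern):
    """Return the word sequences whose POS tags match pattern exactly."""
    n = len(pattern)
    return [
        " ".join(word for word, _ in gram)
        for gram in ngrams(tagged, n)
        if tuple(tag for _, tag in gram) == tuple(pattern)
    ]

# Hand-assigned tags (an assumption for illustration, not tagger output).
tagged = [
    ("You", "PRP"), ("young", "JJ"), ("people", "NNS"), ("is", "VBZ"),
    ("the", "DT"), ("future", "NN"), ("of", "IN"), ("the", "DT"), ("country", "NN"),
]

# Frequency of tag bigrams, and a search for plural noun + 3rd-person singular verb.
tag_bigrams = Counter(ngrams([tag for _, tag in tagged], 2))
matches = find_tag_pattern(tagged, ("NNS", "VBZ"))
print(tag_bigrams.most_common(1))
print(matches)
```

A pattern like this only proposes candidates; each hit would still need to be read in context, since a matching tag sequence is not always an error.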

Criticism welcome. ^=^
 
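I quite agree with jackzch. Indeed, no research project can be entirely free of problems; they differ only in how close they come to the truth. Isn't academic research precisely a road of pursuing truth through one error after another?
But I think it is a real pity about CLEC, because apart from these annotation errors it is actually quite a good corpus. The reason I feel its errors need to be raised for discussion is that, as far as I can see, most of them have no justification at all. I don't know whether CLEC's tagging was done by hand or by an automated program, but it does not look like a case of what slgg6985 describes, where different annotators simply judge learner errors differently. Too many of the examples just make no sense.
Beyond that, there is another quite serious problem: the amount of repetition is too high. Many of the texts are student compositions or assignments written on the same assigned topic, yet for some reason much of the content is identical, down to whole sentences. Shouldn't the advertised word count of the corpus then be discounted? Also, I have found sentences that are incomplete, breaking off after the first few words. Should that content be removed?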
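
The repetition problem described above can be estimated by counting sentences that recur across different texts once tags and punctuation are ignored. This is a minimal sketch under several assumptions: the bracketed-tag shape is inferred from the excerpts in this thread, the sentence splitter is deliberately crude, and the two sample texts are invented for illustration rather than taken from CLEC.

```python
import re
from collections import defaultdict

TAG = re.compile(r"\[[^\]]*\]")  # assumed shape of error tags, e.g. [wd3,3-6]

def normalize(sentence):
    """Drop bracketed tags, lowercase, and keep only word characters."""
    no_tags = TAG.sub(" ", sentence)
    return " ".join(re.findall(r"[a-z0-9']+", no_tags.lower()))

def repeated_sentences(texts, min_words=4):
    """Map each normalized sentence to the ids of the texts containing it,
    keeping only sentences that occur in more than one text."""
    seen = defaultdict(set)
    for text_id, text in texts.items():
        for sent in re.split(r"(?<=[.!?])\s+", text):
            key = normalize(sent)
            if len(key.split()) >= min_words:
                seen[key].add(text_id)
    return {key: ids for key, ids in seen.items() if len(ids) > 1}

# Invented sample texts, for illustration only.
texts = {
    "a": "Advertising has been [wd3,3-6] a part of our everyday life. I like it.",
    "b": "We see many bill-boards. Advertising has been a part of our everyday life.",
}

dupes = repeated_sentences(texts)
print(dupes)
```

Summing the word counts of the repeated sentences would give a first estimate of how far the nominal corpus size overstates the amount of distinct text.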
 
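Cite SLGG: "That is, CLEC has a structural problem. I consider this a fatal error: it makes the data drawn from the corpus extremely unreliable, so that many master's students' theses were doomed, from the very first stage of data collection, to be built on unreliable data. Presented with an air of authority, it has misled a whole group of students. A learner corpus for a single L1 background has no need to be this large (each ICLE subcorpus has only 200,000 tokens)."
Want more discussion: I don't really think it's fatal. It's probably not comprehensive enough, but any research relies only on samples, not to mention that there are case studies. The problem is probably how much you can generalize from the data given.
Don't you think it is perfectly okay to write a thesis/dissertation based on CLEC? (I intend to do so.) It may be partial, but how would it be "unreliable"?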
Why isn't it necessary to build a big corpus based on a single L1? I think it absolutely necessary. At least it helps us determine the role of L1 transfer in error analysis.
 
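Re: [Discussion] What are the flaws of CLEC?

I have also been using CLEC recently and, like the original poster, ran into a problem when extracting a smaller subcorpus.

ST4 contains CET Band 4 material. ST4 is supposed to consist entirely of Band 6 essays, so why does Band 4 show up, and in more than just one or two texts?

The examples are the same ones the original poster gave to show that "some texts in ST4 are not marked ST4".


Just my humble opinion; corrections are welcome.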
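
A quick consistency check for this kind of subcorpus problem is to parse each text's header line and flag texts whose ST value differs from the one expected. The sketch below is an assumption-laden illustration: the header shape is taken from the excerpts quoted in this thread (which vary in case and spacing), the meaning of each field is not verified here, and `parse_header` is a hypothetical helper name.

```python
import re

# Matches one header field such as <ST 5>, < ST 5> or <sch wsd>.
FIELD = re.compile(r"<\s*([A-Za-z]+)\s+([^<>]+?)\s*>")

def parse_header(line):
    """Return the header fields of one text as a dict with upper-cased keys."""
    return {name.upper(): value for name, value in FIELD.findall(line)}

# Header lines copied from the excerpts quoted earlier in this thread.
headers = [
    "<ST 5><SEX 2><Y 8><SCH Zhongda><AGE 20><WAY 1><DIC 1><TYP 2>",
    "< ST 5> <SEX 2> <y 10> <sch wsd> <AGE 20> <WAY 2><DIC 2> <TYP 2>",
]

parsed = [parse_header(h) for h in headers]

# Flag texts that would not belong in a file expected to hold only ST 4.
not_st4 = [h for h in parsed if h.get("ST") != "4"]
print(parsed[1])
print(len(not_st4))
```

Normalizing the keys to upper case is a design choice made because the quoted headers mix `<Y 8>` and `<y 10>`; the values are left as strings so nothing is lost if a field is not numeric.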
 
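Re: [Discussion] What are the flaws of CLEC?

I have heard that in some years the CET-4 and CET-6 exams apparently used the same essay topic.

Could someone who knows CET-4 and CET-6 well please confirm?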
 
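Re: [Discussion] What are the flaws of CLEC?

I have heard that in some years the CET-4 and CET-6 exams apparently used the same essay topic.

Could someone who knows CET-4 and CET-6 well please confirm?

Quite right. CET-4 and CET-6 used to be held at the same time (in the morning). As I remember it, when I was invigilating in the mid-to-late 1990s, or possibly the early 2000s, I noticed this interesting phenomenon for several years running. Anyone interested can look it up.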
 
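Re: [Discussion] What are the flaws of CLEC?

Post #10
Let us all respect one another. Juniors should of course respect their elders, and elders should not fool the juniors either.

Saying "elders should not fool the juniors either" may be going too far. I think Professor Yang and Professor Gui certainly wanted to make this corpus perfect.
The problems that have emerged are exactly where we juniors and later scholars should keep working.
[This post was edited by the author on 2006-03-17 at 13:51:57]

It is rather interesting that this thread has filled up the way it has.

Lao Hong has a sharp eye, and "perfect" is pleasing to the ear. Things have changed and the people are no longer the same, but I humbly suspect that a talent for fooling others is inborn and keeps up with the times :D
Haha, there I go talking nonsense again. Where did that come from? :D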
 
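Re: [Discussion] What are the flaws of CLEC?

One more question for everyone: what is the maximum essay score in CLEC, 15 or 20? My search turned up one text with <score 18>. It is the only one scored above 15, and I don't know what to make of it. Any pointers would be appreciated.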
 
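Re: [Discussion] What are the flaws of CLEC?

One more question for everyone: what is the maximum essay score in CLEC, 15 or 20? My search turned up one text with <score 18>. It is the only one scored above 15, and I don't know what to make of it. Any pointers would be appreciated.

In the old version it was 15. I don't know about the new version.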
 
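Re: [Discussion] What are the flaws of CLEC?

As for CLEC, I feel more gratitude than I can count. First, my master's supervisor personally took part in the project. Second, my supervisor gave me a copy of CLEC. It is now half a year since I finished my master's degree, and I am still grateful to my supervisor and to CLEC. Many of the essays in CLEC were annotated by hand at the time, so errors were unavoidable. Moreover, the corpus is the first fruit of Chinese scholars' corpus research, and many of its ideas were inevitably not fully mature.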
 