Where can I find the BNC Sampler, and how do I randomly sample search results in the BNC at corpus.byu.edu?

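Where can I use the BNC Sampler corpus online for free? On http://www.lextutor.ca/concordancers/concord_e.html, are the "BNC Written (1million)" and "BNC Spoken (1million)" options under "choose a corpus" the BNC Sampler? The concordance lines they return are not part-of-speech tagged, whereas the BNC Sampler is tagged, so I am not sure whether these two corpora really are the BNC Sampler.
Also, with BNC-BYU at http://corpus.byu.edu/, if a search returns too many results, can they be randomly sampled? If so, how?
I would be grateful for any expert advice. Many thanks!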
 
Re: Where can I use the BNC Sampler corpus for free, and how do I randomly sample search results in the BNC at http://corpus.byu.edu/? Thanks!


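The BNC Sampler is just a sample subset of the BNC, and much smaller.
Too many search results? How to sample depends on your research question. If the results are to be used to build mini texts, then sampling is a must.

How do you do the sampling? Dr. Xu's concordance sampler will do.

That said, building mini texts would be even easier if a concordancer or sampler could do more than take every nth line and could also do the following:

For example, searching the BNC for in + succession gives:
... in rapid succession ... 13
... in slow succession ... 15
... in quick succession ... 10
... in swift succession ... 3
...

What I want from the sample is just one randomly chosen example sentence for each of the patterns above. That way the typical patterns are covered, none of the different patterns is lost, and the tedium of sampling by hand is avoided.

So that's the idea. I'll wait for Dr. Xu, Dr. Jia, or some other expert to put something together when they have time. :)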
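
For anyone who wants to try the idea in the meantime, here is a minimal Python sketch of "one random example per pattern". It is not an existing tool: the function name `one_per_pattern`, the regular expression, and the `seed` parameter are all assumptions made for illustration, and it assumes the concordance has been saved as plain text with one sentence per line.

```python
import random
import re
from collections import defaultdict

# Assumed pattern: "in <one word> succession"; the middle word names the group.
PATTERN = re.compile(r"\bin\s+(\w+)\s+succession\b", re.IGNORECASE)

def one_per_pattern(lines, seed=None):
    """Group lines by the word between 'in' and 'succession',
    then pick one line at random from each group."""
    rng = random.Random(seed)  # a fixed seed makes the pick repeatable
    groups = defaultdict(list)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            groups[match.group(1).lower()].append(line)
    return {key: rng.choice(group) for key, group in sorted(groups.items())}

lines = [
    "I then received two shocks in quick succession.",
    "I tried a few more, in quick succession.",
    "Pictures of a man were flashed upon the screen in rapid succession.",
    "Eight bodies tumbled out, sliding down the rope in rapid succession.",
    "The pathologist's secretaries followed one another in bewildering succession.",
]
picked = one_per_pattern(lines, seed=1)
for key, example in picked.items():
    print(f"in {key} succession: {example}")
```

Every pattern that occurs at least once gets exactly one example, so rare patterns are never dropped. The cost is that the output no longer reflects how frequent each pattern is.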
 
Re: Where can I use the BNC Sampler corpus for free, and how do I randomly sample search results in the BNC at http://corpus.byu.edu/? Thanks!

91. Splinter groups follow splinter groups in quick succession.
92. She sensed that Bonard was aware of it; although his smile did not waver, his eyes registered in quick succession a question, a realisation and a reassurance.
93. But there was concern that, since ‘it is by no means uncommon for the parties to a fragile marriage to separate and come together again in quick succession on a number of occasions, this would force the courts to enquire into the precise situation at the time of the rape.’
94. He did not glance in that direction, but started to cross the square, when there were three shots in quick succession.
95. The pathologist's secretaries followed one another in bewildering succession.
96. Consequently by the 1920's factories were being closed in quick succession.
97. His rapport with the Wagner household comes across vividly from letters to Rohde in the late summer of 1869: "Just recently I've paid four visits there in quick succession and a letter takes wing in the same direction almost every week"; "On the visit before last, during the night, a baby boy called Siegfried was born.
98. Emerging from the back staircase into the kitchen, Julia found McGee ferrying the remains from the dining room, downing in quick succession the remnants of the Chablis and the dregs of the Margaux.
99. They were set up in close succession by the Secretary of State for Scotland to study the curriculum (Munn) and assessment (Dunning) in the third and fourth years of Scottish secondary schools; they kept in close touch with each other throughout their deliberations; and they presented their reports with complementary recommendations at the same time.
100. He led the way along a series of paths, up assorted flights of steps and out across a seemingly limitless expanse of finely mown rugby and hockey pitches that climbed the hillside in stepped succession.
101. I then received two shocks in quick succession.
102. Double-click - A mouse procedure where the left-hand mouse button is pressed twice in quick succession.
103. It is unusual to have two erm's in immediate succession - I've been trying to find that - erm - erm - and to have three is very strange: I've been trying to find that - erm - erm - erm -.
104. Then he rolled over, pointed the gun at the bulky figure and pulled the trigger twice in rapid succession.
105. Suddenly, somewhere off to the rear, came an yell of alarm and the roar of a shotgun going off, followed by three more blasting off in quick succession.
106. In his eagerness to escape the encroaching flames, he worked the action and fired three times in rapid succession.
107. Tracking the swiftly moving oriental with the muzzle, Jube worked the action and fired twice more in rapid succession.
108. But these were receding and soon ceased altogether, after a burst of three in rapid succession.
109. Two more came along in quick succession, 1lb 7oz and 1lb-11, then all activity ceased.
110. The events of the past week raced past his mind's eye in a matter of seconds, the mental images flickering before him in rapid succession like a movie trailer.
111. He feinted twice in rapid succession, right and left, his blade flickering menacingly like the tongue of a venomous snake preparing to strike.
112. Pictures of a man were flashed upon the screen in rapid succession.
113. I tried a few more, in quick succession.
114. Increasingly, the tendency is to work for a large number of companies in rapid succession.
115. Eight bodies tumbled out, sliding down the rope in rapid succession.
116. Rory made left and right turns in quick succession, seemed to go back on himself, seemed at times to be driving blind.
117. The second pitfall only applies if you have a relatively slow VAX and will be starting up two or more LIFESPAN Processes in quick succession.
118. If the Operator terminal is in use, the Offline System will try to access it several times in quick succession; if it remains unavailable the Offline run will not take place and a message reporting the situation will be sent via the mail system to the Offline Manager.
119. Moreover, simultaneous oesophageal pressure peaks may occur as a normal phenomenon when swallows are taken in rapid succession, due to the phenomenon of deglutitive inhibition.
120. In the big chief's private office Hartley informs me casually that what I'd taken to be a ping pong ball-firing toy gun on the table behind me was actually an automatic 12-bore shotgun capable of delivering a dozen lethal rounds in quick succession.
 
Re: Where can I use the BNC Sampler corpus for free, and how do I randomly sample search results in the BNC at http://corpus.byu.edu/? Thanks!



What you suggested above is not the right way to get results. As long as you take a random sample of your concordance lines, the phrases you listed should, in principle, have chances of being sampled that are proportional to their frequencies.

So try the concordance randomizer:
http://ishare.iask.sina.com.cn/f/13929531.html
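
For readers who cannot get hold of that tool, a plain random sample of concordance lines can be sketched in a few lines of Python. This is only an illustration of simple random sampling without replacement, not a description of how the randomizer above works. The function name `sample_concordance`, the `seed` parameter, and the placeholder hits are assumptions.

```python
import random

def sample_concordance(lines, k, seed=None):
    """Return k lines chosen at random without replacement,
    kept in their original concordance order."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    k = min(k, len(lines))     # never ask for more lines than exist
    chosen = sorted(rng.sample(range(len(lines)), k))
    return [lines[i] for i in chosen]

# Placeholder data standing in for 120 saved concordance lines.
hits = [f"hit {i}" for i in range(1, 121)]
subset = sample_concordance(hits, 20, seed=42)
print(len(subset))
```

Because every line has the same chance of being drawn, a pattern that accounts for a third of the hits is expected to account for about a third of the sample.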
 
Re: Where can I use the BNC Sampler corpus for free, and how do I randomly sample search results in the BNC at http://corpus.byu.edu/? Thanks!



Thanks for your prompt reply, Dr Xu!
But what I want is for pedagogic purposes, not for research. I just want to find and display examples of the different patterns without having to sample the concordance lines manually or take the frequency information into account. Random sampling will probably skip some patterns.
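
How likely a pattern is to be skipped can be worked out directly. The sketch below is a hedged illustration: it treats the four counts from the in + succession example as the whole result set (41 lines, ignoring the rows elided by "..."), and the sample size of 10 is a made-up figure. The function name `p_missed` is also an assumption.

```python
from math import comb

def p_missed(pattern_hits, total_hits, sample_size):
    """Probability that a simple random sample without replacement
    contains none of the lines belonging to one pattern."""
    return comb(total_hits - pattern_hits, sample_size) / comb(total_hits, sample_size)

# Counts quoted in the thread; rows hidden behind "..." are ignored here.
counts = {"rapid": 13, "slow": 15, "quick": 10, "swift": 3}
total = sum(counts.values())

for word, n in counts.items():
    print(f"in {word} succession: P(missed) = {p_missed(n, total, 10):.3f}")
```

Under these assumptions the rarest pattern, "in swift succession" with 3 hits, is left out of a 10-line sample about 42% of the time, while the frequent patterns are rarely missed.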
 
Re: Where can I use the BNC Sampler corpus for free, and how do I randomly sample search results in the BNC at http://corpus.byu.edu/? Thanks!

I guess you misunderstood what I meant.
Concordance lines, like corpus data in general, have been sampled from real-life discourse. So we need to be fully aware that concordance hits have been probabilistic from the time they were put into the corpus. Any subsequent sampling, or selection, from the concordance is still probabilistic.

The reason you believe that something is, or might be, 'skipped' is your experience (intuition). If you believe in probability, keep believing in it.

If the following results are the end product you need, the 'collect' feature of PowerGREP yields exactly what you want, sometimes with the help of regular expressions. Identical patterns are collated and their frequencies shown in descending order.
... in rapid succession ... 13
... in slow succession ... 15
... in quick succession ... 10
... in swift succession ... 3

Please don't ask me how 'collect' is done. The PowerGREP tutorial explains everything.
Give it a try.
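
For those without PowerGREP, the same kind of collated frequency list can be approximated in Python. This sketch does not reproduce PowerGREP's 'collect' feature or its options. It only shows one way to count matches of a regular expression and list them by descending frequency. The function name `collect` and the pattern are assumptions.

```python
import re
from collections import Counter

# Assumed pattern: "in <one word> succession".
PATTERN = re.compile(r"\bin\s+\w+\s+succession\b", re.IGNORECASE)

def collect(lines):
    """Count every match of PATTERN and return (phrase, frequency)
    pairs, most frequent first."""
    counts = Counter()
    for line in lines:
        for match in PATTERN.finditer(line):
            phrase = " ".join(match.group(0).lower().split())  # normalise case and spacing
            counts[phrase] += 1
    return counts.most_common()

lines = [
    "I then received two shocks in quick succession.",
    "I tried a few more, in quick succession.",
    "Two more came along in quick succession, 1lb 7oz and 1lb-11, then all activity ceased.",
    "Pictures of a man were flashed upon the screen in rapid succession.",
    "Eight bodies tumbled out, sliding down the rope in rapid succession.",
    "The pathologist's secretaries followed one another in bewildering succession.",
]
for phrase, freq in collect(lines):
    print(f"... {phrase} ... {freq}")
```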
 
Re: Where can I use the BNC Sampler corpus for free, and how do I randomly sample search results in the BNC at http://corpus.byu.edu/? Thanks!

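Thank you all for the prompt replies! Your suggestions are very helpful.
I want to retrieve a few hundred example sentences containing the ditransitive construction (V+NP+NP or V+NP+PP) from a spoken English corpus and from a written English corpus, and then analyze them.
Which corpus would be best? The BNC at http://corpus.byu.edu/ is divided into the registers spoken, fiction, academic, non-academic, magazine, and newspaper rather than simply into spoken and written. If I use it, how should I handle that?
Also, does anyone have a good method for retrieving ditransitive examples from a corpus? I have seen a study that searched a small corpus (20,000 words). Do you think that approach is sound? If it is, I would have to build two small corpora myself, one written and one spoken. Other researchers search for particular verbs (such as give), but I don't think that is representative. What do you think?
Thanks for any advice!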
 
Re: Where can I use the BNC Sampler corpus for free, and how do I randomly sample search results in the BNC at http://corpus.byu.edu/? Thanks!



Thanks a lot!
 