Is corpus linguistics an independent discipline?


In the corpus field, there is a debate over whether corpus linguistics is a methodology or an independent branch of linguistics. What's your opinion?
It is more like an approach to the study of language. It draws on a range of statistical methods, aided by computers, to analyze language from a quantitative perspective. Yet interpreting the numerical results of that analysis still depends on the sub-disciplines of linguistics, such as phonology and sociolinguistics, before they yield substantial and practical knowledge.
The corpus approach is unique in that it enables language, edited or natural, to take on "characteristics" of its own in real-time communication, which is made possible by tagging and the other devices that computer technology makes available. These "characteristics" are significant yet were often missing from language analysis before the arrival of corpus linguistics, since linguists at that time often relied on intuition and introspection, analyzing with the aid of principles alone. With the advent of corpus linguistics, sociolinguistics and discourse analysis, the study of language has gained enormous opportunities to examine language as it actually is in social communication, since the linguistic specimen now carries many "characteristics" that help linguists form a clearer picture of how it is used in reality. In a way, if the object of language study is a continuum with two ends, "dry" language and "living" language, the study of language has now moved a step closer to analyzing "living" language, though many issues remain unsettled, such as the representativeness and synchronicity of a corpus.
2.0 Is corpus linguistics an independent, emerging discipline?

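2.1 Corpus linguistics as a theoretical construct
Almost no one fully endorses the view that corpus linguistics is a theoretical construct; some scholars simply place more emphasis on its theoretical significance. Halliday (1991; 1992; 1993), for example, points out that corpus linguistics, as a theoretical construct, unifies data gathering and theoretical generalization, and thereby brings about a qualitative change in our understanding of language. This new theoretical construct helps us examine the nature of language as both system and instance. In Halliday's linguistic thinking, actual discourse is the instantiation of the language system, and the language system, or the grammatical system, is a natural outcome that is probabilistic in character. This idea agrees closely with the claim that linguistic rules are emergent properties (Li Ping [李平], 2002). In other words, because a rigorously designed and constructed corpus should contain authentic texts and authentic discourse, the frequency with which linguistic instances predominate in it is a probabilistic reflection of the grammatical system behind them. In addition, an important concept in Halliday's functionalism is the "choice of meaning" (Halliday, 1985). This choice of meaning reflects the internal mechanisms by which language operates. A corpus, supported by computational tools, makes it possible to abstract and generalize these mechanisms and so arrive at a grammar.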
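As a rough illustration of the point about frequency reflecting probability, here is a minimal Python sketch. The toy corpus and the function name `relative_frequencies` are made up for illustration and come from none of the cited authors; the sketch only shows how the relative frequency of an item in a corpus can be read as an estimate of its probability.

```python
from collections import Counter

# Hypothetical toy corpus; a real study would use a carefully sampled one.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat slept on the mat",
]


def relative_frequencies(sentences):
    """Return each word's share of all tokens in the corpus."""
    # Naive whitespace tokenization, which is an assumption of this sketch.
    tokens = [word for sentence in sentences for word in sentence.split()]
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


probs = relative_frequencies(corpus)
```

With only eighteen tokens the estimates mean very little, which is one reason the representativeness of a corpus matters so much.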
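It is worth stressing that Halliday's position stands in opposition to Chomsky's mentalism. Chomsky has always held that language is an innate capacity and that naturally occurring data are disorderly. Such data contain many sentences that plainly should not occur or that are erroneous, along with hesitations, lapses of attention, outside distractions and the like. He therefore argues that what we should study is the linguistic competence of an ideal speaker-hearer (Chomsky, 1965). Accordingly, Chomsky advocates obtaining linguistic data through introspection and elicitation, and opposes the use of corpora in language research.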

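2.2 Corpus linguistics as a corpus-based research method
For example, Leech (1992) said: "... [corpus linguistics] should rather be regarded as a methodological basis for pursuing linguistic research. In principle (and often in practice), corpus linguistics combines easily with other branches of linguistics: we can use corpora to study phonetics, syntax ..." (p. 105)
"Corpus linguistics defines not only a methodology for studying language, ... but in fact also some philosophical/theoretical perspectives on the subject." (pp. 105-6)
In sum, we believe that the term "corpus-based approach" more accurately reflects the nature and positioning of corpus linguistics.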

TB2001 is actually not so weak in her claim as quoted below. In fact, she argues that corpus linguistics ‘goes well beyond this methodological role’ and has become an independent ‘discipline’ (Tognini-Bonelli 2001: 1). But this view is not shared by all corpus linguists.

While we agree that corpus linguistics is ‘really a domain of research’ and ‘has become a new research enterprise and a new philosophical approach to linguistic enquiry’ (ibid), we maintain that corpus linguistics is indeed a methodology rather than an independent branch of linguistics in the same sense as phonetics, syntax, semantics or pragmatics. These latter areas of linguistics describe, or explain, a certain aspect of language use. Corpus linguistics, in contrast, is not restricted to a particular aspect of language. Rather, it can be employed to explore almost any area of linguistic research. Hence, syntax can be studied using a corpus-based or non-corpus-based approach; similarly, we have corpus semantics and non-corpus semantics.

Central to Tognini-Bonelli’s argument is what we view as a confused understanding of ‘rules or pieces of knowledge’ in her definition of methodology:

"While a methodology can be defined as the use of a given set of rules or pieces of knowledge in a certain situation, by ‘pre-application’ we mean that, unlike other applications that start by accepting certain facts as given, corpus linguistics is in a position to define its own sets of rules and pieces of knowledge before they are applied." (ibid: 1)

Most dictionaries and research manuals define a methodology as a system of methods and principles of doing something, for example, for teaching or carrying out research, a definition similar to Tognini-Bonelli’s. In this definition, the methods and principles, or in Tognini-Bonelli’s terms, ‘rules or pieces of knowledge’ in the first instance in the above citation, are associated with doing something. In corpus linguistics, for example, they can refer to how to build and/or explore a corpus, and how to interpret quantitative data. In the second instance, however, ‘rules and pieces of knowledge’ (i.e. ‘its own sets of rules and pieces of knowledge’) are unmistakeably associated with a certain aspect of language use under investigation rather than doing something, as was the case in the first instance. As using corpora often reveals facts about language use which introspection alone cannot easily, if at all, provide, corpus linguistics is in a position to define ‘its own sets of rules and pieces of knowledge’ about language. In this sense, corpus linguistics is indeed a ‘pre-application methodology’ as Tognini-Bonelli suggests. (The ‘pre-application’ aspect of corpus use is sometimes referred to as the ‘corpus-driven approach’, in contrast with the ‘corpus-based approach’; see Tognini-Bonelli 2001: 65-100.)

To foreground corpus linguistics, Tognini-Bonelli appears to have downplayed ‘other partner disciplines under the same umbrella’ of applied linguistics such as stylistics and translation studies, implying that because corpus linguistics has the ‘pre-application’ advantage, it should enjoy higher priority over other partner disciplines and should be identified as an independent branch of linguistics. However, as noted earlier in this section, stylistics and translation studies can be either corpus-based or non-corpus-based. In a way they have a greater freedom than corpus linguistics.

As corpus linguistics is a whole system of methods and principles of how to apply corpora in language studies and teaching/learning, it certainly has a theoretical status. Yet theoretical status is not theory itself. The qualitative methodology used in social sciences also has a theoretical basis and a set of rules relating to, for example, how to conduct an interview, or how to design a questionnaire, yet it is still labelled as a methodology upon which theories may be built. The same is true of corpus linguistics.

With regard to the methodology question, the attempt to construct corpus linguistics as anything other than a methodology ultimately fails. In fact, even those who have strongly argued that corpus linguistics is an independent branch of linguistics have frequently used the terms ‘approach’ and ‘methodology’ to describe corpus linguistics (e.g. Tognini-Bonelli 2001).