According to Kennedy, a corpus is a body of written text or transcribed speech which can serve as a basis for linguistic analysis and description.
Here is a keyword: body, or corpse.
What shall we concentrate on?
Then we come back to a classic question: parole or langue; performance or competence.
The difference between dead and alive, I think, is like the contrast between reading Shakespeare and watching Shakespeare.
Now please study the following conversation:
A: I still have a son.
B: Well, that's OK!
A: I still have a dog.
B: Oh, I'm sorry!
Even though the words are easy, outside the real situation we have to guess or work out what the speakers mean.
However, if you watch or hear the conversation in person, or know more about the speakers, it is easy to understand.
Conversation through QQ is more problematic than ordinary dialogue. And I think you know the reason.
"Starred examples" here may refer to the examples a linguist quotes in a research paper. The examples are marked with stars (asterisks, as used in linguistics) to indicate that they are unacceptable, as attested by a native speaker's intuition rather than by evidence from a corpus.
Every research approach has its limitations and strengths.
It is an illusion to expect one approach to provide a comprehensive picture of language.
The key point is that we are approaching language in different ways.
(3) The best information comes from direct data.
This points to the alternative of pure unadulterated texts, devoid of any annotation. Yet we still do not know how to properly handle them. The other alternative, annotation, now offered as the solution, is, to a varying extent, always biased and adulterates both the data input and results obtained. Hence, it should always be viewed as an alternative only.