Corpora: Foundations

Gordon and Pam Cain (gpcain@rivernet.com.au)
Sat, 24 Jul 1999 17:47:00 +1000

Dear all--

Pardon the time lag, but I've been on holidays and such.

I don't want to return to our thread of several weeks ago about
prepositions + relative who/whom. Rather, I'd like to shift topics.

Toward the end of that thread, this was posted:

Douglas McCarthy wrote:
>
> Given that the name of this list is "corpora", I'm a little surprised at
> the amount of introspective speculation which we've seen in this thread.
>
> Perhaps we ought to bear two things in mind here:
>
> 1. differentiation of language varieties (a virtually endless task)
>
> 2. design of neutral elicitation of people's attitudes to language use.
>
> In answer to the original question: yes, the forms quoted are highly
> possible today. That is about as far as I would like to go in the
> context of a list devoted to corpora.
>
> Best regards,
>
> Douglas McCarthy

I am not a professional corpus linguist, but I do have a significant
interest in the area, and I enjoy much of our discussion on this list
and much of the fruit of corpus studies.

I agree with Douglas that the forms quoted are highly possible, and may
well have been intentionally uttered against what many would think 'more
correct' usage.

However, should that end the matter for a corpus linguist? I think not:

1. It seems to me that if we accept anything and everything that may
turn up on a trawl through a corpus, then we will end up including
utterances made in error, made under false understandings of semantic
content, made idiosyncratically, made in archaic form, or that are the
result of typographical errors. Thus it rather strikes me that there is
always the need for human, subjective judgement in the results of any
corpus.

2. As to the matter of language change, or differences of opinion as to
what is grammatical (eg, our prep + who/whom discussion) this gets a bit
more tricky. Certainly I say that these utterances occur with
regularity. But does that make them equally acceptable English in every
case with the 'more correct' forms? If I am teaching my students
academic English, I will warn them off such utterances, as they will be
viewed as sub-standard and will most likely hurt their mark. Clearly
what occurs is not always what is acceptable.

3. Lexicographers come up against this all the time. Sinclair, in one of
his articles (reprinted in _Corpus, Concordance, Collocation_), looks
over the variety of utterances using 'yield', and in the end (I believe)
tosses one aside as idiosyncratic and irregular. With good cause, I say.

To return to the question: prep + who/whom is a borderline case -- as are
perhaps a great many cases -- but shouldn't proper corpus linguistics
acknowledge the need for human judgement, to prevent blind empirical
methods from returning unique results that the original speaker
him/herself would reject in hindsight?

Just thought such a fundamental question should be tossed around in such
a forum as this. Any takers?

Happy corpus-ing!

Gordon

-- 
Gordon Cain, Teacher of ESOL
TAFE International Education Centre, Liverpool
Sydney, Australia
gpcain@rivernet.com.au