Re: Corpora: MWUs and frequency

Przemyslaw KASZUBSKI (przemka@main.amu.edu.pl)
Fri, 9 Oct 1998 13:00:30 +0000

On 9 Oct 98 at 9:31, Oliver Mason wrote:

> So this would then qualify it as `familiar' for readers of the `Sun', but
> unfamiliar for readers of the Economist...

I am not much of a statistician or even a corpus linguist, but to me
the answer is different: 'berserk' is a kind of word that WRITERS
contributing to the Sun THINK fits the requirements of the
paper's readership as well as the Sun's style/image etc. This
translates into genres and text types but not necessarily into
'familiarity'. The readers of the Economist probably know this word
very well (yes, it's a hunch that doesn't count). On the other hand,
some of the vocabulary from the Economist may be 'unfamiliar' to SOME
readers of the Sun. But corpora alone are of no help with this.

I fully agree with Mike Scott that corpora allow us to make
text-oriented generalisations rather than statements about '(a)
language'. On the other hand, we KNOW there are some properties of
language, in vocabulary too, that cut across those textual
distinctions, that are more 'universal'. Therefore, I also agree with
Adam's point about the usefulness of comparing corpora. A lost cause?

Przemek Kaszubski

==========================================
Przemyslaw Kaszubski, M.A.
przemka@amu.edu.pl
http://elex.amu.edu.pl/ifa/skaszub.htm

MY (ENGLISH) (LEARNER) CORPORA PAGE:
http://main.amu.edu.pl/~przemka

School of English
Adam Mickiewicz University
Al. Niepodleglosci 4
61-874 Poznan, POLAND
tel: +48 61 8528820
fax: +48 61 8523103
=========================================