RE: Corpora: when does a subcorpus become a corpus

From: Sampo Nevalainen (samponev@cc.joensuu.fi)
Date: Fri Jan 04 2002 - 10:58:50 MET

    Well, I guess I was trying to focus on the issue of representativeness
    rather than on the proper name for the set of texts, but yes, the proper
    term is probably 'special purpose corpus'. This, however, raises
    another interesting question. I personally would hope that every single
    corpus had been compiled for a particular purpose. Indeed, I wonder whether
    there really IS such a thing as a 'general corpus'. I have a feeling that
    so-called 'general corpora' - if they exist - are pretty useless in general,
    unless they're modified for a particular purpose or task. I suppose that in
    empirical research you always have to choose your "object" (material)
    according to your subject rather than use "just anything", i.e. you have
    to know your material: I guess no one would try to determine the average
    height of human beings on the basis of a basketball team. The problem with
    language is that exceptions are often not evident and not easily detected
    since there is no clear "reference set" for language. In principle, if your
    findings are truly generalizable, you should get similar results from any
    corpus, although there is obviously more "noise" in more "general" corpora.
    Am I right? Or am I a pedant? Or both? (As for "Terms in Context" - which
    I have indeed read beyond p. 45 :-) - I liked the book, and I think I
    could make use of some chapters in my course on corpora as translation tools.)

    Sincerely,
    Sampo

    At 09:54 4.1.2002 +0100, Pearson, Jennifer wrote:
    >If you look at the same publication, p.48, you will find that I argue that,
    >given Sinclair's definitions, neither the term subcorpus nor the term
    >component is appropriate for the sets of texts I was working with (and
    >probably not for the EAP texts referred to in previous e-mails either). I
    >chose therefore to use the term special purpose corpus, "a corpus whose
    >composition is determined by the precise purpose for which it is to be used.
    >While a special purpose corpus may be derived from a general reference
    >corpus or from a monitor corpus it will not constitute a subcorpus in the
    >sense defined by Sinclair because it will not have all of the properties of
    >a larger corpus." I coined this particular term for two reasons, a) because
    >the language of the texts I was working with could be classified as
    >'language for special purposes' or 'LSP', two terms that already existed in
    >applied linguistics to designate, for example, the language of business, the
    >language of medicine, the language of economics, and b) because the term
    >'special purpose corpus' implies that the corpus has been compiled for a
    >particular purpose.
    >Wishing you all a happy new year
    >Jennifer
    >
    >Dr Jennifer Pearson
    >Chief of Translation
    >UNESCO
    >7 Place de Fontenoy
    >75352 Paris 07
    >Tel:. 00 33 1 456 80 780
    >e-mail: j.pearson@unesco.org
    >http://www.unesco.org


