Re: Corpora: What is a corpus

From: Mike Scott (lexical@netcomuk.co.uk)
Date: Fri Jan 28 2000 - 11:25:23 MET

    Lucian Galescu wrote:

    >It strikes me as ironic that corpus linguists would want to prescribe
    >the usage of the word "corpus". Using Oliver's terminology, I would say
    >that all corpora are `filtered'. choosing 13th century texts, or
    >Shakespeare's plays, or conversations with a travel agent, or the Bible,
    >etc, etc., all are ways of filtering the abstract body of language
    >around us for a specific purpose, since they all involve a criterion of
    >what is in and what is out of the corpus.

    I agree with Lucian. If, like me, you have a text focus in your work, you
    will probably wish to collect a corpus of complete texts, that is, language
    events which can be felt to be self-standing. But I wouldn't want to
    restrict the term to that, since most folk do not seem to have a text focus
    but rather a language focus. By that I mean that most folks make claims
    about a language, not about a given text or set of texts. It therefore
    seems reasonable for some purposes to collect past tense sentences, and if
    the collection is electronic and esteemed by its creator, s/he will be
    likely to prefer "corpus" as a label rather than "list".

    Mike Scott, Liverpool


