Re: Corpora: Chomsky/Harris

From: Michael Barlow (barlow@ruf.rice.edu)
Date: Mon Apr 02 2001 - 04:54:23 MET DST

    As I remember Chomsky's writings, his earliest pronouncements acknowledged
    the usefulness of corpora, but his later (though still early) writings
    were more critical.
    He was certainly critical of the notion that frequency was of interest to
    linguists --- although this is actually one area in which there has been
    some "convergence" in recent years. It seems to me that these days anyone
    can refer to frequency as a factor in grammatical systems without being
    thought of as a raving empiricist.

    Below are some Sunday evening thoughts on the relation between theoretical
    (i.e., generative) linguistics and corpus linguistics. Over the last year
    or so I have been grappling with the problem of what generativity means
    within corpus linguistics, and I have to say that I don't have a good
    answer to that question. (I think the answer has something to do with
    blending (a la Fauconnier and Turner), but I have to admit that this
    notion could be stretched to cover anything.) Generative
    theory might be what I need, but the problem is that the word or lexical
    category (with the odd fixed chunk thrown in) is taken to be the basic
    combinatoric unit. This reminds me that one thing Chomsky did was shift
    attention from local dependencies (and Markov models) to long-distance
    dependencies, and I guess I am still stuck wondering how to capture local
    dependencies in a grammatical description.

    I got sidetracked. Some thoughts:

    1. Chomskyan approaches: I see a large gulf between Chomskyan theory and
    Corpus Linguistics. I suppose that some might go for a division of labour
    in which a UG-based system is applicable for basic (core) grammar
    acquisition, while a usage-based learning system is seen as most
    appropriate for peripheral grammar. Rather perversely, I think of UG
    theory as a source of interesting data patterns rather than a source of
    theoretical underpinnings.

    2. West Coast generative approaches: In my view the richness of
    description in West Coast theories (HPSG, LFG and Construction Grammar)
    makes them better candidates as "corpus-ready" frameworks. Proponents
    would probably say that these theories need no extension at all. Idiom
    chunks such as "take advantage of" are treated using the basic machinery
    of HPSG, and, naturally, Construction Grammar can handle constructions.

    3. Langacker's Cognitive Grammar: Since this is a "maximalist",
    "non-reductive", "bottom-up" approach to grammatical description, it lends
    itself well to corpus approaches. I can see a corpus linguistics theory
    being built on a Cognitive Grammar framework, but I know that others
    disagree with this.

    4. Probabilistic approaches: Rens Bod led a well-attended "workshop" on
    Probability Theory in Linguistics at the last LSA meeting in Washington
    DC. It will be interesting to see exactly how stochastic models are
    incorporated into generative and corpus-based approaches to grammar over
    the rest of the decade. I have not yet read Rens' book "Beyond Grammar"
    and so I won't say any more on this topic.

    Michael
    ----------------------------------------------------------------------
    Michael Barlow, Department of Linguistics, Rice University
    barlow@rice.edu www.ruf.rice.edu/~barlow
    Athelstan barlow@athel.com www.athel.com (U.S.) www.athelstan.com (UK)


