Corpora: Chomsky and corpus linguistics

From: ramesh@clg.bham.ac.uk
Date: Sat Apr 28 2001 - 01:46:44 MET DST

    Mike Maxwell writes:
    >Putting this differently, there really are things that are *not*
    >possible sentences in English, even though we sometimes know immediately
    >what they would mean if they were grammatical.

    You change the criterion from "possible" in the first clause to
    "grammatical" in the second one. Many instances in corpus data
    may be ungrammatical, but are necessarily possible, because they have
    actually occurred. But the universes of grammaticality and
    possibility do not share the same circumference.

    Mike Maxwell writes:
    >And there are things that we know aren't English, unless
    >we twiddle the grammar a bit. My favorite example is the sentence from
    >Catch-22, "They disappeared him." As one of the characters in the
    >novel says, it's not English, but...

    So now, "English" = "grammar".
    It is English; it is the grammar that is inadequate.

    cf. Michael Halliday (1993):
    "The Chomskyan position on induction is closely related to the langue-parole
    and competence-performance distinctions. But what such frequency data make
    very clear is the ultimate inseparability of system and use."

    cf. Robert de Beaugrande: Large Corpus Linguistics and Applied Linguistics:
    Dedicating new Bridges:
    "So to discover the `deeper' or `underlying' order of language (called
    `langue', `competence', `deep structure', `I-language', etc.) linguistics
    should *take it back out of use* (called `langage/parole', `performance',
    `surface structure', `E-language', etc.). Yet doing so in effect tends to
    *replace language* with *ideal language* which exists nowhere except in
    some `linguistic theory', although it is boldly offered as an `explanation'
    of `language' in general and of much else besides, such as `human language
    acquisition' (Beaugrande 1997b, 1998a and 1998b)."
    [see http://www.beaugrande.com/]

    Mike Maxwell writes:
    >True, but a description =3D\=3D an explanation. Generative linguistics
    >is trying to find an explanation. Whether you believe they have (or
    >ever will) is of course another question; but at least by your
    >(Ramesh's) description, corpus linguistics isn't even trying to find an
    >explanation (unless you believe that our brains are HMMs or something).

    Sorry, something got lost in the email system at this point. I don't know
    what "=3D\=3D" was meant to be...
    Anyway, I did not say that corpus linguistics "isn't even trying to find
    an explanation". Surely description (or at least a methodology/apparatus for
    description) has to precede (a methodology/apparatus for) explanation?
    And the better the description, the more robust the explanation can be.

    A bottom-up methodology will necessarily take longer to arrive at
    high-level abstractions, whereas a top-down methodology starts with them.
    This is why it is easiest to criticize top-down methods by criticizing
    the examples they choose to work with.

    The explanation which corpus linguistics eventually arrives at will have to
    include factors beyond formal grammar, and even non-linguistic factors, as
    these affect the situations and contexts in which the corpus instances were
    produced.

    You seem to want to squeeze language until it conforms to your grammar,
    rejecting any instances of language that cannot be so squeezed by calling
    them "ungrammatical" or, worse still, "not English". I, by contrast, want
    to describe language in terms of grammar and other linguistic systems,
    amending the systems wherever necessary so that they can include the vast
    majority of the corpus data (the actual, and the probable which it
    predicts), while allowing that some small proportion of the data may
    remain beyond the descriptive or explanatory scope/power of these systems.

    Best
    Ramesh

    Ramesh Krishnamurthy
    Consultant, COBUILD, Collins Dictionaries and Bank of English corpus
    Honorary Research Fellow, University of Birmingham
    Honorary Research Fellow, University of Wolverhampton


