RE: Corpora: Chomsky/Harris - one more fun question.

From: James L. Fidelholtz (jfidel@siu.buap.mx)
Date: Thu Apr 05 2001 - 22:46:28 MET DST

    Dear CORPORA:
            (By the way, what do you call it if you don't have *any* yet?).

            OK, I'm interested in corpora, and what they can tell us about
    language. I also feel sorry for Mike Maxwell being the only one to
    defend generative linguistics (pace Carl Mills, who, along with Mike
    and a couple of others, has injected a bit of reason into the
    discussion, it seems to me -- by the way, Carl, from the very
    beginning, part of the sport of generative grammar was shooting down
    people's examples, which we never considered tantamount to
    shooting down their theories -- I think even then we talked about
    'refining theories' and such).
            I studied at MIT, and almost my first research in linguistics
    was on English vowel reduction as a research assistant to Chomsky and
    Halle, during which I discovered the 'frequency rule' for
    pre-heavy-cluster vowel reduction in English, and although I perhaps did
    not manage to convince Chomsky and Halle that this was worth their
    taking into account in SPE, I've been a confirmed frequency buff ever
    since.
            While I was there, Stan Petrick, Barbara Hall (now Partee) and
    various others took part in a project for, I believe, the MITRE
    Corporation, in which they designed a question-and-answer system in
    English for, if memory serves, searching databases. I somehow doubt that
    this was the very first such system, but it must have been
    state-of-the-art for then (early 60s), and certainly made a lot of use
    of Chomsky's (and of course their own) work on English syntax. Also, as
    Fritz Newmeyer has pointed out, a very large portion of early theses at
    MIT (in the 60s, including mine) were fieldwork theses, often on
    indigenous, or at least non-Indo-European, languages. I believe Petrick
    used his MITRE experience as a springboard for his thesis, which, if I
    recall correctly, was on such a system.
            The point here is that not all MIT-trained linguists are averse
    to data (of different types, even), nor even averse to working with
    corpora. This sort of fake dichotomy must have gotten started from the
    (correct) perception that Chomsky has very little personal interest in
    the application of his theories in any practical pursuits, which seems
    to aggravate a large number of linguists, especially if they, for
    whatever reason, are not adherents of generative theories. My answer to
    these people would be: give the guy a break! He has other interests,
    and has done quite well, thank you, in pursuing them and in giving what
    nearly all observers admit are the underpinnings of modern linguistics,
    pretty much independent of the theory or approach one uses. Chomsky
    certainly has no objection to people using his theories (or even
    others) in any number of practical ways. *He* just isn't interested in
    doing so. He'd probably even be interested if some corpus studies
    proved relevant for linguistic theory, but that's up to corpus linguists
    to do, after all. Very few people criticized Michael Jordan for being a
    rather mediocre baseball player (although some criticized him for even
    trying it! -- and they may have been right). You guys are all
    smart--you get the point.
            In sum, to get the attention of the 'MIT linguists', corpus
    linguistics has to show that it is relevant to the formulation of
    theories. Probably very few of the MIT group would dismiss corpus
    evidence out of hand, but they've got other fish to fry than puttering
    around in corpora like we do.
            I guess I'll close with a limerick (oxymoron: it's indelicate):

    There once was a guy from Byzondum
    Used a dried hedgehog skin for a condom.
    His girlfriend would shout,
    As he pulled the thing out,
    "De gustibus non disputandum".

                    Jim

    -- 
    James L. Fidelholtz			e-mail: jfidel@siu.buap.mx
    Posgrado en Ciencias del Lenguaje	tel.: +(52-2)229-5500 x5705
    Instituto de Ciencias Sociales y Humanidades	fax: +(01-2) 229-5681
    Benemérita Universidad Autónoma de Puebla, MÉXICO
    


