Re: Corpora: Chomsky/Harris - one more fun question.

From: Mike Maxwell (mike_maxwell@sil.org)
Date: Thu Apr 05 2001 - 16:17:13 MET DST


    Pete wrote:
    >The point is, the ornithologists are not the
    >theorists of flight, the physicists are. Bird
    >flight is a natural embodiment of the physics of flight

    Rather than annoy the list further with an argument about who studies the
    way birds fly, or discussion of what is "the" correct analogy, I will just
    say that my analogy is
        bird flight : airplane design ::
            theoretical linguistics : language technology
    You may learn something about language that you can use on the computer from
    theoretical linguists, just as the Wright brothers are said to have learned
    from watching birds fly. But you shouldn't fault the linguistic
    theoreticians just because you can't make use of everything they do, any
    more than you should fault astronomers for not helping you build a fusion
    reactor, or ethologists for not teaching you how to train your dog, or
    neurologists for not finding better ways to do classroom teaching--or
    ornithologists for studying the shape of bird feathers.

    >Do you ever hear psycholinguists and language
    >engineers say 'those guys at MIT have really
    >clarified the fundamental relationships between
    >sounds, texts and meanings, now all we have to
    >do is model them in wetware and software?'

    I can't speak for the psycholinguists (although I very much doubt that
    psycholinguistics would be where it is were it not for generative
    linguistics). But I will repeat what I said in my earlier posting: when I,
    as a "language engineer" (not my term, nor my title then), wrote a
    (reasonably) comprehensive grammar of English for a parser, I did refer to
    "those guys at MIT", as well as to generative linguists elsewhere. I also
    used a traditional grammar of English (Quirk, Greenbaum, Leech and
    Svartvik), although if you look at the revisions in that work over the
    years, I think you'll find that they also owe a debt to generative
    linguistics. (I could be wrong about that, since it's always hazardous to
    try to guess how someone came to a conclusion.) My memory is getting fuzzy
    now that I'm over the hill, but as I recall, the single best reference was
    Joe Emonds's book "A Transformational Approach to English Syntax." (BTW, we
    later did a weighting of the various constructions based on their frequency
    of occurrence in various corpora, to help choose the best parse. To my
    mind, this is an ideal symbiosis between theoretical and corpus linguistics:
    find what's possible from the theory, and filter by what's likely in the
    corpus.)
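
    A minimal sketch of that "possible from the theory, likely in the corpus"
    filter, under stated assumptions: the construction labels, the toy counts,
    and the function names are all hypothetical, and this is not the weighting
    scheme actually used in the parser described above. Each candidate parse
    is taken to be the list of constructions the grammar used to build it, and
    candidates are ranked by summed smoothed log relative frequency.

```python
import math
from collections import Counter

def construction_log_probs(corpus_constructions, smoothing=1.0):
    """Estimate add-k smoothed log probabilities for each construction label.

    corpus_constructions: iterable of labels observed in a corpus (hypothetical).
    Returns (table, unseen), where unseen is the log probability assigned to
    any label that never occurred in the corpus. smoothing must be > 0.
    """
    counts = Counter(corpus_constructions)
    vocab = len(counts) + 1  # one extra slot reserved for unseen labels
    total = sum(counts.values()) + smoothing * vocab
    table = {name: math.log((c + smoothing) / total) for name, c in counts.items()}
    unseen = math.log(smoothing / total)
    return table, unseen

def score_parse(parse, table, unseen):
    """Sum the log probabilities of the constructions used in one parse."""
    return sum(table.get(label, unseen) for label in parse)

def best_parse(candidates, table, unseen):
    """Pick the grammar-licensed candidate that the corpus makes most likely."""
    return max(candidates, key=lambda p: score_parse(p, table, unseen))

# Toy corpus counts (invented for illustration only).
corpus = (["transitive_vp"] * 10
          + ["pp_attach_noun"] * 7
          + ["pp_attach_verb"] * 3)
table, unseen = construction_log_probs(corpus)

# Two parses the grammar allows for the same sentence.
candidates = [
    ["transitive_vp", "pp_attach_verb"],
    ["transitive_vp", "pp_attach_noun"],
]
chosen = best_parse(candidates, table, unseen)
```

    The grammar decides which candidates exist at all; the corpus counts only
    rank them. Note that summing log probabilities favors parses that use
    fewer constructions, so a real system would need some normalization.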

    At one point (back in the '80s) I played at doing English morphology on the
    computer. I needed to know where the stressed syllables were, and I
    basically implemented the stress rules in The Sound Pattern of English
    (Chomsky and Halle). And when I later implemented a general phonological/
    morphological parser, I again referred to the work of generative
    phonologists and morphologists, including some at MIT, as well as to theses
    done at MIT (and elsewhere). (Oops, I hope we don't get off on what the
    plural of 'thesis' is!)

    I guess I'll make this msg even longer by replying to S. Warren:
    >Maybe 50 years ago before the advent of
    >computers as we know them today a non-
    >empirical approach could be justified but
    >surely not now? Generative linguists please
    >reply in your defence!!!!

    There are a number of ways to "attack" this (I'd rather be on the attack
    today than on the defence :-)). One is to say that there are degrees of
    empiricism, and that the data that generative linguists typically use is not
    non-empirical just because they think of it, rather than finding it in some
    half-baked email msg from someone about whom you know nothing, and who may
    not even be a native speaker of the language they're writing in. Another
    attack would be to say that one should use a variety of data, and that a
    sentence I (as a generative linguist) think of may be more relevant because
    I can tailor it to my needs. (We don't fault chemists because they mix up
    their own chemicals, rather than studying only reactions that occur in the
    environment around them.) I could go on, but I'll stop there for today,
    because I have to go to a class on SQL server...

                                     Mike Maxwell
                                     Summer Institute of Linguistics
                                     Mike_Maxwell@sil.org


