RE: Corpora: Chomsky and corpus linguistics

From: John A. Goldsmith (ja-goldsmith@uchicago.edu)
Date: Sat Apr 28 2001 - 02:57:00 MET DST

    Ramesh wrote:
    >>"What is possible" seems to require a binary
    >>yes/no type of answer, "what is probable" suggests
    >>a cline or spectrum. Language is a part of human
    >>behaviour, and almost everything seems to be possible
    >>within human behaviour.

    Mike Maxwell wrote in reply:
    >But that's the point of introspective grammatical judgements: they are
    >binary, and *not* everything is possible.... Putting this differently,
    >there really are things that are *not* possible sentences in English, even
    >though we sometimes know immediately what they would mean if they were
    >grammatical. "Whose did you find book?", "What are you afraid that
    >happened?", "Who do you wonder whether will go?" etc. And there are things
    >that we know aren't English, unless we twiddle the grammar a bit. My
    >favorite example is the sentence from Catch-22, "They disappeared him." As
    >one of the characters says in the novel, it's not English, but...

    Speaking as someone who once not only believed, but _taught_, what Mike
    Maxwell says, and who no longer does, I would offer this remark: there is
    little or no convincing evidence of a fundamental divide between
    grammatical and ungrammatical sentences. What distinguishes those who
    believe in such a divide from those who do not is just that: beliefs,
    opinions, preferences and aesthetics. And those who do not believe in it
    feel (but now with rational grounds) that their own inferences and
    conclusions and language are more robust and less likely to be based on
    faulty premises.

    By the way, I think it would be a great mistake (even if we did believe
    in a grammar that draws a sharp in/out distinction) to put "they disappeared
    him" in the Out group! ... and thus it goes, for those who feel obliged
    to make such decisions.

    Mike Maxwell wrote:
    >True, but a description =\= an explanation. Generative linguistics is
    >trying to find an explanation. Whether you believe they have (or ever
    >will) is of course another question; but at least by your (Ramesh's)
    >description, corpus linguistics isn't even trying to find an explanation
    >(unless you believe that our brains are HMMs or something).

    I won't speak for corpus linguistics, but I hope it is clear to all
    concerned that a perfectly respectable scientific theory of language (even
    possessed of the right to say that it provides an _explanation_) can be
    based on the statement that the goal of the analysis is to provide
    a probability distribution over V* (where V is the vocabulary of the
    language), i.e., possible strings of words. One can in turn judge
    between such distributions by seeing what probabilities they assign
    to actual, existing corpora: the theory that assigns the highest
    probability wins. (I gloss over issues of theory description length,
    of course.)
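
    The comparison described above can be sketched in a few lines of
    Python. This is only an illustration under stated assumptions: the
    toy corpus, the add-one smoothing, the boundary markers and every
    function name are hypothetical, and description length is ignored
    here as well. Each model ends a string with an explicit end marker,
    which is what makes it a distribution over V* rather than over
    strings of one fixed length.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"  # hypothetical boundary markers, not members of V


def unigram_model(sentences, vocab, alpha=1.0):
    """Return a function giving log P(sentence) under a smoothed unigram model."""
    counts = Counter(w for s in sentences for w in s + [END])
    total = sum(counts.values())
    size = len(vocab) + 1  # every word in V, plus the END marker

    def logprob(sentence):
        # Assumes every word of the sentence is in vocab.
        return sum(
            math.log((counts[w] + alpha) / (total + alpha * size))
            for w in sentence + [END]
        )

    return logprob


def bigram_model(sentences, vocab, alpha=1.0):
    """Return a function giving log P(sentence) under a smoothed bigram model."""
    pair_counts, context_counts = Counter(), Counter()
    for s in sentences:
        tokens = [START] + s + [END]
        for prev, cur in zip(tokens, tokens[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    size = len(vocab) + 1  # possible next symbols: V plus END

    def logprob(sentence):
        tokens = [START] + sentence + [END]
        return sum(
            math.log((pair_counts[(p, c)] + alpha)
                     / (context_counts[p] + alpha * size))
            for p, c in zip(tokens, tokens[1:])
        )

    return logprob


def corpus_logprob(model, corpus):
    """Log probability a model assigns to a corpus of independent sentences."""
    return sum(model(sentence) for sentence in corpus)


# Toy data, invented purely for illustration.
TRAIN = [["the", "dog", "barks"], ["the", "cat", "sleeps"], ["the", "dog", "sleeps"]]
HELD_OUT = [["the", "cat", "barks"], ["the", "dog", "sleeps"]]
VOCAB = {w for s in TRAIN for w in s}

models = {
    "unigram": unigram_model(TRAIN, VOCAB),
    "bigram": bigram_model(TRAIN, VOCAB),
}
scores = {name: corpus_logprob(m, HELD_OUT) for name, m in models.items()}
best = max(scores, key=scores.get)  # the theory assigning the highest probability
print(scores, "->", best)
```

    Scoring on a corpus the models were not fitted to is one crude
    stand-in for the description-length issue glossed over above; a
    fuller treatment would also charge each model for its own size.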

    John Goldsmith


