Re: Corpora: Chomsky and corpus linguistics

From: Michael Barlow (barlow@ruf.rice.edu)
Date: Mon Apr 09 2001 - 03:50:39 MET DST

    In teaching Intro to Linguistics many of us (generative and non-generative
    linguists alike) state that what linguists try to understand is "what we
    know when we know a language." And during the course when students equate
    language with written language we will probably emphasise that spoken
    language is of primary interest to linguists.

    In other words, the "science" of Linguistics (in North America at least)
    is concerned with the cognitive structures associated with spoken
    language.

    The question arises of how to understand the nature of these cognitive
    structures. We can perhaps take note of what psychologists and cognitive
    scientists can tell us about cognitive structure in general. And we can
    also reasonably assume that language cognitive structures are related in
    some fairly direct way to language performance; and conversely,
    language performance (collected in corpora) provides some of the best
    evidence we have about the nature of cognitive structures.

    This approach (a version of corpus linguistics) seems to me to be as
    theoretical and scientific as any other paradigm in Linguistics.

    > be much more engineers than scientists. Chomsky, OTOH, is a
    > scientist. Sometimes the scientists produce things the engineers can

    Chomsky acts more like a philosopher than any regular scientist. It is
    clear that he is more interested in the structure of arguments and in the
    form of his theories than the relation between the theory and data.
    (Actually, he is interested in the relation between theory and data, but
    not in the way that a scientist is.) I think it is uncontroversial to say
    that over his long career the amount of data that is accounted for by his
    successive theories has diminished, but the bonus for him has been that
    the form of UG has become simpler (in some sense).

    How does he approach the task of explaining the miracle of language
    learning? Mike Tomasello at the Max Planck Institute in Leipzig is
    collecting a massive amount of data on the input that some children
    receive and on the output that they produce. Tomasello is a psychologist;
    he is not going to ignore mental processes or cognitive structures and
    expect to find explanations in the data alone, but he takes the
    scientific stance that we need to know what kind of input the child
    receives. Chomsky, on the other hand, typically argues on the basis of
    logical necessity---as he sees it.

    "Gross observations suffice to establish some qualitative CONCLUSIONS.
    Thus, it is clear that the language each person acquires is a rich and
    complex construction HOPELESSLY UNDERDETERMINED by the fragmentary
    evidence available." (Reflections on Language p10)

    Linguists can make their own choice as to whether the Tomasello approach
    or the Chomsky approach is more likely to produce results, but I don't see
    how adopting the minimal-empirical-data approach can be cast as the more
    scientific of the two.

    (As an aside, I should be explicit in saying that I don't think there is a
    way to evaluate whether a particular research enterprise is worth
    embarking on. The value of a research program can only be judged after
    some research has been done and this is perhaps behind some of the
    prodding of generative linguists to show what their research enterprise
    has led to.)

    Another typical Chomsky quote:

    "Because of the sometimes intricate connections among the various
    subtheories, small changes in the formulation of some principle or notion
    may have large-scale and wide-ranging consequences. Such problems will
    typically arise insofar as we eliminate specific rule systems in favor of
    systems determined by setting parameters of UG. This is naturally a
    positive development, one that is inherent in any serious effort to deepen
    explanatory power, but it also means that theoretical proposals face a far
    more difficult empirical challenge than in earlier work. Furthermore,
    arguments become more intricate as the options for selecting rule systems
    are reduced." (Barriers p2)

    Mike Maxwell presumably sees this step as equivalent to Newton's step
    backward. Again, I would say that it illustrates the view of Chomsky as a
    philosopher. He is pursuing a line of inquiry in which those parts of UG
    which are carrying a descriptive load (i.e. accounting for empirical data)
    are eliminated with the hope that the empirical data can be accounted for
    in different ways (i.e., by more "explanatory" systems). I don't see this
    as particularly scientific. Chomsky will not be bothered by a loss of
    empirical coverage because he sets great store by a particular form of
    theory, one that is minimal, parsimonious and highly deductive. There is
    nothing particularly scientific about his stance given that there is no
    evidence to suggest that language cognitive structures are minimal,
    parsimonious and highly deductive.

    I believe that Chomskyan linguistics is not so much an idealisation as an
    untestable theory. Each of the components comes in a variety of versions
    with the result that it is impossible to judge the whole. The empirical
    data used consists of grammaticality judgements, which are themselves in
    need of investigation to determine what the relation is exactly between
    grammaticality judgements and language cognitive structures. Finally, the
    only evaluation criterion for the theory is the form of the theory itself
    (it must not be too descriptive).

    What if Chomsky provided the answer to the miracle of language learning?
    Would that give us the theory we needed to understand what we know when we
    know a language? No. We would only know how we start to learn a language.

    Michael
    ----------------------------------------------------------------------
    Michael Barlow, Department of Linguistics, Rice University
    barlow@rice.edu www.ruf.rice.edu/~barlow
    Athelstan barlow@athel.com www.athel.com (U.S.) www.athelstan.com (UK)


