Re: Corpora: overuse and underuse of learner English

From: Robert Bley-Vroman (vroman@hawaii.edu)
Date: Wed Dec 12 2001 - 00:44:21 MET

    At 8:28 AM -1000 12/11/01, xiaotian guo wrote:

    >It is unavoidable to touch overuse and
    >underuse in the study of corpora comparison. But to what extend does the
    >difference of a certain figure reach when we can say overuse or underuse
    >occurs (I am poor in statistics)?

    The obvious simple thing is to develop some measurement of rate-of-use.
    Normally, this would be a proportion (e.g. 20% of the verbs are present
    tense in a native-speaker corpus whereas 40% are present tense in a
    learner corpus). A simple statistic you could calculate would be a confidence
    interval for the proportion (easy to do by hand even for someone who is
    poor in statistics). Report the proportion and the confidence interval. If
    the confidence intervals for the two proportions overlap, it wouldn't be
    wise to claim overuse or underuse. (You could do much fancier things,
    statistically, but I'd advocate this as a start; it has an obvious
    intuitive interpretation and it's easy to calculate.) Whether you really
    think that the overuse is "a lot more", or "more to an important extent"
    depends on one's judgement and interpretation and on the relationship of
    this finding to research hypotheses and theoretical rationale.
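
    As a rough illustration of the calculation described above, here is a
    minimal Python sketch. It assumes the usual normal-approximation (Wald)
    interval at the 95% level (z = 1.96). The function names and the counts
    (1000 verbs sampled per corpus) are made up for illustration; only the
    20% and 40% figures come from the example above.

```python
import math

def proportion_ci(hits, total, z=1.96):
    """Return (proportion, lower, upper) using the normal-approximation interval."""
    p = hits / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1], since a proportion cannot fall outside that range.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

def intervals_overlap(a, b):
    """True if two (proportion, lower, upper) results share any values."""
    return a[1] <= b[2] and b[1] <= a[2]

# Hypothetical counts: present-tense verbs out of all verbs sampled.
native = proportion_ci(hits=200, total=1000)   # 20% present tense
learner = proportion_ci(hits=400, total=1000)  # 40% present tense

for label, (p, lo, hi) in (("native", native), ("learner", learner)):
    print(f"{label}: {p:.1%} (95% CI {lo:.1%} to {hi:.1%})")
print("intervals overlap:", intervals_overlap(native, learner))
```

    The width of the interval shrinks as the number of observations grows,
    so the same two percentages can overlap in a small sample and separate
    clearly in a large one. With very small counts or proportions near 0 or
    1, this simple approximation becomes unreliable.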

    The way to avoid the "so-what syndrome" is to have a clear theoretical
    rationale for your research hypotheses. In fact, even the question of
    appropriate statistical techniques is hard to answer at more than a very
    basic level without a theoretically grounded research question.

    For example, it has been proposed (e.g. by J. Schachter 1974) that
    native-speakers of Chinese will underuse relative clauses in English. Her
    study, which tended to confirm her predictions, was based on her concept of
    "a priori contrastive analysis" (that is, it relied on a linguistic
    comparison of relative clause formation in Chinese and English plus a
    theory of interlanguage identifiability and some concept of the conditions
    which would give rise to underuse.) In contrast, it might be that Chinese
    learners of English underproduce relative clauses in English because the
    rate of relative clause use in Chinese itself is lower than the rate of
    relative clause use in English. In order to test this idea, you'd need to
    compare relative clause use in corpora of native English and native
    Chinese as well as in a corpus of Chinese learners' English.
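
    A sketch of that three-way comparison, again in Python and again with
    invented numbers: the counts below are hypothetical, not real data, and
    measuring the rate as relative clauses per clause is only one possible
    choice of unit.

```python
import math

def rate_with_ci(rel_clauses, clauses, z=1.96):
    """Return (rate, lower, upper) with a normal-approximation 95% interval."""
    p = rel_clauses / clauses
    half = z * math.sqrt(p * (1 - p) / clauses)
    return p, p - half, p + half

def overlaps(a, b):
    """True if the two (rate, lower, upper) intervals share any values."""
    return a[1] <= b[2] and b[1] <= a[2]

# Hypothetical counts: (relative clauses, total clauses) in each corpus.
corpora = {
    "native English": (150, 1000),
    "native Chinese": (90, 1000),
    "learner English": (100, 1000),
}
results = {name: rate_with_ci(*counts) for name, counts in corpora.items()}

for name, (p, lo, hi) in results.items():
    print(f"{name}: {p:.1%} (95% CI {lo:.1%} to {hi:.1%})")

learner = results["learner English"]
for name in ("native English", "native Chinese"):
    print(f"learner vs {name}: overlap = {overlaps(learner, results[name])}")
```

    A learner rate that sits with the native Chinese rate and below the
    native English rate would be consistent with the second explanation,
    though it would not by itself rule out the first.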

    Robert Bley-Vroman

    --
    * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
    Robert Bley-Vroman
    Department Chair              MA Program in ESL and
    Second Language Studies       PhD Program in Second Language Acquisition
    University of Hawai'i         Graduate Faculty of Linguistics
    1890 East-West Road           Associate Director for Technology
    Honolulu HI 96822             National Foreign Language Resource Center
    (808)956-2800; fax: (808)956-2802
    mailto:vroman@hawaii.edu      http://www.sls.hawaii.edu/bley-vroman/
    


