Re: [Corpora-List] Statistical tests for corpus studies

From: Adam Kilgarriff (adam.kilgarriff@itri.brighton.ac.uk)
Date: Wed May 07 2003 - 11:45:19 MET DST

    Josephine,

    chi-square will probably not give you what you want, nor will
    log-likelihood; my paper "Comparing Corpora" (International Journal
    of Corpus Linguistics, 2001) explains why. Non-parametric tests are
    more suitable: I found that the Mann-Whitney test did the job well.
    It involves chopping each corpus up into same-size slices and
    comparing the per-slice frequencies of the word across the corpora.
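
    The slicing approach can be sketched roughly as follows. This is only
    an illustration, not the exact procedure from the paper: the function
    names (slice_counts, mann_whitney_u), the slice size, and the toy token
    lists are all assumptions made for the example. The test uses the
    normal approximation with a tie correction and no continuity
    correction, so it is only reasonable when each corpus yields a fair
    number of slices.

```python
import math


def slice_counts(tokens, target_words, slice_size):
    """Count target-word hits in consecutive, equal-size slices of a token list.

    Hypothetical helper: any incomplete slice at the end is dropped so that
    every slice has exactly `slice_size` tokens.
    """
    targets = set(target_words)
    n_slices = len(tokens) // slice_size
    return [
        sum(1 for tok in tokens[i * slice_size:(i + 1) * slice_size] if tok in targets)
        for i in range(n_slices)
    ]


def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for xs, z score, two-sided p-value).
    """
    n1, n2 = len(xs), len(ys)
    n = n1 + n2
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        # All observations are identical: no evidence of a difference.
        return u_x, 0.0, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u_x, z, p


if __name__ == "__main__":
    # Toy, made-up token lists standing in for two genre corpora.
    corpus_a = "the cat may sit and the dog might run but it must stop".split() * 20
    corpus_b = "the cat sat and the dog ran and then it stopped there".split() * 20
    modals = {"may", "might", "must"}  # illustrative "type of word"
    counts_a = slice_counts(corpus_a, modals, slice_size=10)
    counts_b = slice_counts(corpus_b, modals, slice_size=10)
    print(mann_whitney_u(counts_a, counts_b))
```

    If SciPy is available, scipy.stats.mannwhitneyu could be applied to
    the two lists of per-slice counts instead of the hand-written
    function.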

    Regards,

        Adam

    Josephine Lo wrote:

    > Dear all,
    >
    > As a layman in statistics, I would like some advice on tests
    > suitable for comparing the frequency of a specific type of word in
    > corpora of different genres. I have in mind chi-square and ANOVA,
    > but I'm not sure they are the appropriate ones.
    >
    > Thanks in advance
    >
    >
    > Josephine Lo
    > Research Assistant
    > Dept. of English and Communication
    > City University of Hong Kong
    >

    -- 
    

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    Adam Kilgarriff
    ITRI, University of Brighton
    Lewes Road, Brighton BN2 4GJ, UK
    tel: (44) 1273 642919
    fax: (44) 1273 642908
    adam@itri.bton.ac.uk
    http://www.itri.bton.ac.uk/~Adam.Kilgarriff
    and
    Lexicography MasterClass Ltd.
    71 Freshfield Road, Brighton BN2 0BL, UK
    tel: (44) 1273 705773
    adam@lexmasterclass.com
    http://www.lexmasterclass.com
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


