Re: Corpora: Negative mutual information?

From: Ted Pedersen (ted_pedersen@hotmail.com)
Date: Thu Mar 08 2001 - 16:49:00 MET


    Hi David,

    I'm guessing that what you are looking at are pointwise
    Mutual Information values, usually defined along these
    lines for a bigram 'word1 word2':

    log [ freq(word1,word2)*N / (freq(word1)*freq(word2)) ]

    where N is the number of bigrams in your sample.

    This will go negative when

    freq(word1,word2)*N < freq(word1)*freq(word2)

    or

    N < freq(word1)*freq(word2)/freq(word1,word2)
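
    As a minimal sketch of the formula above: the function name pmi,
    the choice of a base-2 logarithm, and the example counts are all
    assumptions made for illustration, not taken from any particular
    corpus or toolkit.

```python
import math


def pmi(f12, f1, f2, n):
    """Pointwise mutual information for a bigram 'word1 word2'.

    f12: freq(word1,word2), f1: freq(word1), f2: freq(word2),
    n: number of bigrams in the sample.
    Base-2 log is an assumption; the sign is the same in any base.
    """
    return math.log2(f12 * n / (f1 * f2))


# Hypothetical counts: two frequent words that rarely co-occur.
rare_pair = pmi(f12=2, f1=1000, f2=500, n=100_000)

# Hypothetical counts: two less frequent words that often co-occur.
tight_pair = pmi(f12=50, f1=100, f2=200, n=100_000)

print(rare_pair < 0)   # freq(word1,word2)*N < freq(word1)*freq(word2)
print(tight_pair > 0)  # freq(word1,word2)*N > freq(word1)*freq(word2)
```

    The sign depends only on whether freq(word1,word2)*N is smaller or
    larger than freq(word1)*freq(word2), so the base of the logarithm
    does not matter for this question.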

    So what does a negative value tell us? Well, it suggests
    that word1 and/or word2 are probably very high frequency words
    (the, and, a ... come to mind) that don't occur together
    in the bigram under consideration especially often.

    You can also look at the relationship

    freq(word1,word2) < freq(word1)*freq(word2)/N

    The right hand side of this inequality is the expected value
    for the frequency count of the bigram 'word1 word2' under
    the classical assumption of independence (which underlies
    tests like Pearson's chi-squared and the log-likelihood ratio).
    So a negative pointwise mutual information value tells us that
    the observed frequency count for a bigram is less than we would
    expect under the assumption that the words in the bigram
    are independent.
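
    The same comparison can be sketched directly in terms of observed
    and expected counts. The helper name expected_count and the counts
    used here are hypothetical, chosen only to illustrate the
    inequality.

```python
import math


def expected_count(f1, f2, n):
    """Expected frequency of 'word1 word2' if the words were independent."""
    return f1 * f2 / n


# Hypothetical counts for a bigram made of two frequent words.
observed = 2
f1, f2, n = 1000, 500, 100_000

expected = expected_count(f1, f2, n)

# PMI written as log(observed / expected); base 2 is an assumption.
pmi_value = math.log2(observed / expected)

# Observed below expected goes hand in hand with a negative PMI value.
print(observed < expected, pmi_value < 0)
```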

    I have puzzled a bit over this notion of being 'less than
    what would be expected under independence'. Does this just
    mean that the words in the bigram are independent, or is
    something further suggested? I'd be interested if anyone else
    has some thoughts on that particular issue...

    Anyway, I'm not sure how good a tool pointwise Mutual Information
    is (see the Manning and Schütze text, for example, for
    some reasons for concern), but it does raise some interesting
    issues, no doubt.

    Regards,
    Ted

    ---
    Ted Pedersen
    http://www.d.umn.edu/~tpederse
    This archive was generated by hypermail 2b29 : Fri Mar 09 2001 - 01:16:40 MET