Re: Corpora: Statistical significance of tagging differences

Ted E. Dunning (ted@aptex.com)
Fri, 19 Mar 1999 15:13:59 -0800 (PST)

>>>>> "jf" == James L Fidelholtz <jfidel@siu.buap.mx> writes:

jf> On Fri, 19 Mar 1999, Paul Rayson wrote: ...
>> Can you direct me to a book or article which says chi-square is
>> designed for small numbers?

jf> the deal with chi square is that ... Chi square is then the
jf> SUM of all these values, whatever they are, and obviously if
jf> the column is anything other than the same value repeated over
jf> and over, each value different from the mean will produce a
jf> positive contribution to chi squared. SO, if you have bigger
jf> but nonequal entries you will get a bigger result.

This is not really how it works. For a fixed-size table, the same
number of terms is being added, and each term stays about the same
size because it is normalized by its expected count, which grows with
the total number of observations. For larger tables, the statistic
does get bigger, but so does the significance threshold. Overall, if
there is no relationship and all variations in counts are due to
chance, then the distribution of the chi^2 statistic will stay nearly
the same no matter how large the counts in the table become.
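
A rough way to check this claim is a small simulation. The sketch
below is only an illustration under assumptions of my own choosing: a
2x2 table, independent row and column draws with made-up probabilities
0.3 and 0.4, and the usual 5% critical value of 3.841 for one degree
of freedom. The function names are hypothetical. If the argument above
holds, the fraction of "significant" tables should stay near 0.05 at
every sample size.

```python
import random

CRITICAL_05_DF1 = 3.841  # 5% critical value of chi^2 with 1 degree of freedom


def pearson_chi2(table):
    """Pearson chi^2 statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells in an empty row or column
                stat += (observed - expected) ** 2 / expected
    return stat


def random_independent_table(n, rng, p_row=0.3, p_col=0.4):
    """Draw n observations whose row and column are chosen independently."""
    table = [[0, 0], [0, 0]]
    for _ in range(n):
        i = 0 if rng.random() < p_row else 1
        j = 0 if rng.random() < p_col else 1
        table[i][j] += 1
    return table


def rejection_rate(n, trials=500, seed=0):
    """Fraction of simulated null tables that exceed the 5% threshold."""
    rng = random.Random(seed)
    hits = sum(
        pearson_chi2(random_independent_table(n, rng)) > CRITICAL_05_DF1
        for _ in range(trials)
    )
    return hits / trials


# Sample sizes are arbitrary; the point is that they differ by a lot.
rates = {n: rejection_rate(n) for n in (100, 1000, 4000)}
for n, rate in rates.items():
    print(n, rate)
```

With only 500 trials per sample size the printed rates will wobble by
a percentage point or two, but they should not climb as n grows.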

On the other hand, as you get more data, you (and chi^2) will be able
to detect more and more subtle variations from independence. With
language experiments, there are almost bound to be deviations from
independence due to the fact that language really does have abundant
structure. This pervasive structure can make it very difficult to
build completely balanced experiments. This difficulty is what others
on this list have been railing about.
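
To make the effect of sample size concrete, here is a minimal sketch
that holds the proportions in a 2x2 table fixed and only multiplies
the counts. The base counts are invented for illustration (a very
slight association, 26/24 versus 24/26), and 3.841 is the 5% critical
value for one degree of freedom. For fixed proportions the Pearson
statistic grows in direct proportion to the total count, so the same
small deviation goes from undetectable to "significant".

```python
CRITICAL_05_DF1 = 3.841  # 5% critical value of chi^2 with 1 degree of freedom


def pearson_chi2(table):
    """Pearson chi^2 statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells in an empty row or column
                stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts with a slight departure from independence.
base = [[26, 24], [24, 26]]

results = []
for k in (1, 10, 25, 100):
    scaled = [[k * count for count in row] for row in base]
    total = sum(sum(row) for row in scaled)
    stat = pearson_chi2(scaled)
    results.append((total, stat, stat > CRITICAL_05_DF1))

for total, stat, significant in results:
    print(total, round(stat, 3), significant)
```

The proportions never change here; only the amount of data does. That
is the sense in which more data lets the test see subtler structure.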

jf> Chi square is 'significant' beyond a certain positive value,
jf> and if the numbers are big enough, it is almost guaranteed to
jf> be significant.