Corpora: Statistical significance of tagging differences

Mark Stevenson (m.stevenson@dcs.shef.ac.uk)
Wed, 17 Mar 1999 11:24:31 GMT

Hi,

I've been experimenting with PoS taggers operating under different conditions,
so I now have several different taggings of the same corpus. I also have a
"gold standard" annotation for that text, so I can work out the percentage
correct for each tagging.
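
To make the setup concrete, here is a minimal sketch of the percentage-correct
calculation in Python. The function name, the representation of a tagging as a
list of tags aligned token-by-token with the gold standard, and the toy data
are all illustrative assumptions, not the actual scripts or corpus.

```python
def tagging_accuracy(gold_tags, predicted_tags):
    """Return the percentage of tokens whose predicted tag equals the gold tag."""
    # Assumption: both taggings cover exactly the same token sequence.
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("taggings must cover the same tokens")
    if not gold_tags:
        raise ValueError("empty corpus")
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return 100.0 * correct / len(gold_tags)


# Hypothetical toy data: one gold-standard tagging and two tagger outputs.
gold = ["DT", "NN", "VBZ", "JJ"]
tagger_a = ["DT", "NN", "VBZ", "NN"]
tagger_b = ["DT", "VB", "VBZ", "NN"]

accuracy_a = tagging_accuracy(gold, tagger_a)
accuracy_b = tagging_accuracy(gold, tagger_b)
```

This yields one percentage per tagging; the open question is how to judge
whether the gap between two such figures is meaningful.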

I was wondering if anyone knows of appropriate statistical tests that could be
applied to determine whether the differences in tagging performance are
statistically significant.

Any pointers would be appreciated.

Thanks in advance,
Mark Stevenson

------------------------------------------------------------------------------
Mark Stevenson
Research Assistant                    marks@dcs.shef.ac.uk
Natural Language Processing Group     http://www.dcs.shef.ac.uk/~marks
Sheffield University                  (0114) 222 1899
-----------------------------------------------------------------------------