comparing corpora?

Mr A.P. Berber Sardinha (tony1@liverpool.ac.uk)
Wed, 21 Jun 1995 17:30:38 +0100 (BST)

Hello,
I've been comparing corpora by contrasting their
respective word frequency lists using a program
that reads both lists and returns chi-square
statistics. The program extracts those words
whose frequencies differ significantly between
the two corpora and presents them as 'keywords'.
The keywords are therefore words which are used
significantly more or less often in one corpus
than would be expected from their frequency in
the other.
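
For readers who want to try the idea, here is a minimal sketch of such a comparison. It is not the program mentioned above. It assumes each frequency list is a plain word-to-count mapping, uses a 2x2 Pearson chi-square test per word (the word versus all other tokens, corpus 1 versus corpus 2) without continuity correction, and takes 3.841 (p = 0.05, 1 degree of freedom) as the cut-off. The names `chi_square_2x2` and `keywords` are hypothetical.

```python
# Assumed cut-off: chi-square critical value for 1 degree of freedom, p = 0.05.
CRITICAL_05 = 3.841


def chi_square_2x2(a, b, n1, n2):
    """Pearson chi-square (no continuity correction) for one word.

    a, b   -- frequency of the word in corpus 1 and in corpus 2
    n1, n2 -- total number of tokens in corpus 1 and in corpus 2
    """
    c, d = n1 - a, n2 - b              # all other tokens in each corpus
    n = n1 + n2
    denom = (a + b) * (c + d) * n1 * n2
    if denom == 0:                     # empty corpus, or word fills both corpora
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def keywords(freq1, freq2, critical=CRITICAL_05):
    """Return (word, chi2, direction) tuples, highest chi2 first.

    direction is '+' if the word is relatively more frequent in corpus 1,
    '-' if it is relatively more frequent in corpus 2.
    """
    n1, n2 = sum(freq1.values()), sum(freq2.values())
    result = []
    for word in set(freq1) | set(freq2):
        a, b = freq1.get(word, 0), freq2.get(word, 0)
        chi2 = chi_square_2x2(a, b, n1, n2)
        if chi2 >= critical:
            direction = "+" if a / n1 > b / n2 else "-"
            result.append((word, chi2, direction))
    return sorted(result, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy frequency lists, invented purely for illustration.
    list1 = {"the": 50, "of": 20, "corpus": 30}
    list2 = {"the": 50, "of": 20, "whale": 30}
    for word, chi2, direction in keywords(list1, list2):
        print(f"{word:10s} {chi2:8.2f} {direction}")
```

Words with very low expected frequencies make the chi-square approximation unreliable, so a real comparison would probably also apply a minimum-frequency filter before testing.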