Word frequency and language proficiency

DAVID JOHN CONIAM (B096770@idea.csc.cuhk.hk)
Sat, 01 Jul 1995 10:36:30 +0800

I'm playing around with word frequency lists to generate test items of
similar word frequency.

Say you want to test a word like "country", which has frequency 127 in
the Bank of English's tagged wordlist (i.e. it is the 127th most frequent
word in English; "the" is frequency 1). You might end up with the following
items (the number after each word is that word's frequency):

government 100
state 124
thing 125
country 127
man 158

So with a word like "conviction" (freq. 4918), you'd get

plea 4905
copper 4910
presentation 4911
sigh 4912
conviction 4918
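
For anyone who wants to try this, here is a minimal sketch of one way the
selection could be done. It is not necessarily how the lists above were
produced: it assumes the wordlist is available as a mapping from word to
frequency rank, and it simply takes the words whose ranks are numerically
closest to the target. The function name similar_frequency_items and the
parameter n_distractors are made up for illustration, and the toy data is
just the "country" example from above.

```python
def similar_frequency_items(ranks, target, n_distractors=4):
    """Return the target plus the n_distractors words nearest to it in rank.

    ranks: dict mapping word -> frequency rank (1 = most frequent).
    The result is ordered from most to least frequent.
    """
    if target not in ranks:
        raise KeyError(f"{target!r} is not in the wordlist")
    target_rank = ranks[target]

    # All other words, ordered by distance in rank from the target.
    # Ties are broken in favour of the more frequent word.
    candidates = [word for word in ranks if word != target]
    candidates.sort(key=lambda w: (abs(ranks[w] - target_rank), ranks[w]))

    chosen = candidates[:n_distractors] + [target]
    return sorted(chosen, key=lambda w: ranks[w])


# Toy wordlist: only the five entries quoted in the "country" example.
ranks = {
    "government": 100,
    "state": 124,
    "thing": 125,
    "country": 127,
    "man": 158,
}

for word in similar_frequency_items(ranks, "country"):
    print(word, ranks[word])
```

With a full wordlist you would probably also want to filter the candidates
by part of speech before picking the nearest ranks, since the tagged list
makes that possible.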

First reactions are that *not* choosing semantically related words to sample
language proficiency must be wrong, although there is research which suggests
a relationship between control of the (more and less) frequent words and
proficiency (Harlech-Jones, 1983; Meara and Jones, 1988).

Having trialed a couple of tests derived in this way with students,
I didn't get total GIGO, as I'd half been expecting.

Is anyone else working along similar lines, or would anyone like to comment?

David Coniam
Chinese University of Hong Kong