Language data

Torbjörn Lager
Mon, 18 Sep 95 10:13:27 +0200

I am tinkering with my own implementation of the Viterbi algorithm
applied to POS tagging. Now I would like to test my program on some
real text so I need statistical data of the following kind:

Lexical probabilities: Word-Tag-Probability triples (or something similar)

Collocational probabilities: Tag1-Tag2-Probability triples

Preferably for English (Brown corpus stuff would be great) or Swedish
(SUC?). I know that a lot of such data has been produced. Is any of it available?
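
To make the two kinds of data concrete, here is a minimal sketch of how a Viterbi tagger might consume them. Everything in it is an assumption made for illustration: the probabilities are invented toy values (not taken from Brown or SUC), the lexical triples are read as P(word | tag), the collocational triples as P(tag2 | tag1), and the names LEXICAL, TRANSITION, FLOOR and viterbi are hypothetical.

```python
import math

# Toy data, invented for illustration only (NOT from Brown or SUC).
# Lexical probabilities: (word, tag) -> assumed to mean P(word | tag)
LEXICAL = {
    ("the", "DET"): 0.5,
    ("dog", "NOUN"): 0.1,
    ("barks", "VERB"): 0.05,
    ("barks", "NOUN"): 0.01,
}

# Collocational probabilities: (tag1, tag2) -> assumed to mean P(tag2 | tag1).
# "<s>" is a hypothetical sentence-start marker.
TRANSITION = {
    ("<s>", "DET"): 0.6, ("<s>", "NOUN"): 0.3, ("<s>", "VERB"): 0.1,
    ("DET", "DET"): 0.05, ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.05,
    ("NOUN", "DET"): 0.1, ("NOUN", "NOUN"): 0.4, ("NOUN", "VERB"): 0.5,
    ("VERB", "DET"): 0.5, ("VERB", "NOUN"): 0.4, ("VERB", "VERB"): 0.1,
}

TAGS = ["DET", "NOUN", "VERB"]
FLOOR = 1e-12  # crude stand-in for smoothing of unseen pairs (an assumption)


def logp(table, key):
    """Log probability of key, falling back to FLOOR for unseen pairs."""
    return math.log(table.get(key, FLOOR))


def viterbi(words):
    """Return the most probable tag sequence for words under the toy model."""
    if not words:
        return []
    # Each column maps tag -> (best log score of a path ending in tag, previous tag).
    columns = [{
        t: (logp(TRANSITION, ("<s>", t)) + logp(LEXICAL, (words[0], t)), None)
        for t in TAGS
    }]
    for word in words[1:]:
        prev = columns[-1]
        column = {}
        for t in TAGS:
            # Pick the best predecessor tag for t.
            score, back = max(
                (prev[p][0] + logp(TRANSITION, (p, t)), p) for p in TAGS
            )
            column[t] = (score + logp(LEXICAL, (word, t)), back)
        columns.append(column)
    # Follow the backpointers from the best final tag.
    tag = max(TAGS, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    path.reverse()
    return path


print(viterbi(["the", "dog", "barks"]))
```

Working in log space avoids underflow on longer sentences. With real data, the two dictionaries would simply be loaded from the triples instead of being written out by hand.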

Thanks in advance.

Best regards,
Torbjoern Lager


Torbjoern Lager E-mail:
Department of Linguistics Phone: +46 31 7731175
University of Gothenburg Fax: +46 31 7734853
412 98 Gothenburg