Re: Corpora: multilingual texts

Ted E. Dunning (ted@aptex.com)
Tue, 2 Dec 1997 12:36:16 -0800

Next message: Dongsong Zhang: "Corpora: About the machine translation"
Next message: Jean Hudson: "Corpora: CORPUS EMAIL"
Previous message: D C Souter: "Corpora: multilingual texts"
Maybe in reply to: D C Souter: "Corpora: multilingual texts"

I did some work on language identification and have an evaluation
corpus available for anybody who wants to try their hand. This corpus
was developed by taking random samples from a Spanish/English parallel
corpus.
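
For anybody who wants to build a similar test set from their own parallel text, the sampling step can be sketched roughly as follows. This is only an illustration, not the code shipped in the tarball: the function name sample_test_set, the "en"/"es" labels, the fixed sample length, and the seed parameter are all assumptions made for the example.

```python
import random

def sample_test_set(english_lines, spanish_lines, n_samples, length, seed=0):
    """Draw labeled random fragments from each side of a parallel corpus.

    Hypothetical helper: returns a shuffled list of (label, fragment)
    pairs, with n_samples fragments of `length` characters per language.
    """
    rng = random.Random(seed)  # seeded so the test set is reproducible
    texts = {"en": " ".join(english_lines), "es": " ".join(spanish_lines)}
    samples = []
    for label, text in texts.items():
        if len(text) < length:
            raise ValueError(f"{label} text is shorter than {length} characters")
        for _ in range(n_samples):
            # Pick a random start so the fragment fits inside the text.
            start = rng.randrange(len(text) - length + 1)
            samples.append((label, text[start:start + length]))
    rng.shuffle(samples)  # mix the two languages together
    return samples
```

A language identifier can then be scored by checking how often its guess for each fragment matches the stored label, and the experiment repeated at several fragment lengths.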

I include with the test corpus both a technical report (somewhat
outdated) and working code (also somewhat outdated).

You can ftp the 1995 version of the test corpus/paper/code from

ftp://crl.nmsu.edu/pub/misc/lingdet_suite.tar.gz

If you want the latest description and code, please email me.
