Corpora: plain text

From: David Grant (drg199@ecs.soton.ac.uk)
Date: Thu Dec 06 2001 - 17:49:05 MET


    Hi,

    I'm looking for plain, tokenized English text with which to test a tagger. Does anyone know where I could find some?

    By tokenized I mean that all words and punctuation marks must be separated by at least one space.

    For example:

    hello how are you ? I am fine .
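
To make the format concrete, here is a minimal sketch of how raw text could be turned into this space-separated form, and how a file could be checked against it. The regular expression and the helper names `tokenize` and `is_tokenized` are assumptions made for illustration only. They are not taken from any particular tagger or corpus, and real tokenizers handle many more cases (abbreviations, hyphens, numbers with decimal points).

```python
import re

# Assumed token definition (illustrative only): a run of letters/digits,
# optionally followed by an apostrophe suffix such as "'t", or else any
# single non-space, non-alphanumeric character (one punctuation mark).
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?|[^\sA-Za-z0-9]")


def tokenize(text):
    """Return text with every word and punctuation mark separated by one space."""
    return " ".join(TOKEN_RE.findall(text))


def is_tokenized(text):
    """Check whether text is already in the space-separated form.

    Runs of whitespace are collapsed first, since the format only asks
    for at least one space between tokens.
    """
    return tokenize(text) == " ".join(text.split())


if __name__ == "__main__":
    print(tokenize("hello how are you? I am fine."))
```

Under these assumptions, `tokenize` maps "hello how are you? I am fine." to the example line above, and `is_tokenized` accepts that example line while rejecting the raw sentence.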

    cheers

    David Grant
    drg199@ecs.soton.ac.uk


