[Corpora-List] unencumbered corpora

From: Lou Burnard (lou.burnard@computing-services.oxford.ac.uk)
Date: Fri Jan 21 2005 - 19:55:23 MET

    Can anyone point me to any annotated language corpora which are freely
    available under something like the GNU General Public Licence? All the
    ones I have thought of so far seem to be available only under some kind
    of complicated licensing scheme which precludes (e.g.) commercial
    exploitation, unrestricted copying, etc. And cost money.

    I'd like to have a corpus of a reasonable size (1 million+ words) in any
    European language (tho English or French are preferable) with some
    kind of word-level annotation, which I can hack about, use in teaching,
    and put on a freely-distributable CD, without worrying about copyright
    lawyers. There *must* be some somewhere!

    It doesn't even have to be in XML -- though it will be when I've
    finished with it.

    Lou Burnard


