Corpora: Reuters Corpus

From: Tony.Rose@reuters.com
Date: Tue Aug 14 2001 - 12:48:08 MET DST

    Reuters, the global information, news and technology group, is for the first time making large quantities of archived Reuters news stories available free of charge for use by research communities around the world. The first Reuters Corpus archive includes over 800,000 English-language news stories, equivalent to the annual global news output of Reuters. All the news stories are fully referenced using a total of 775 different category codes for topic, geography and industry sector.

    Although this Corpus has been available for some time, it has not yet been widely publicised. We are now happy to distribute it more widely within the research community. Further details can be found at:

    http://about.reuters.com/researchandstandards/corpus/

    For discussion and queries regarding this corpus and future Reuters releases, please refer to the ReutersCorpora mailing list, which can be found at:

    http://groups.yahoo.com/group/ReutersCorpora

    Best wishes,
    Tony
    ==========
    Dr TG Rose
    Leader of Language Technology
    Reuters Limited, 85 Fleet Street, London EC4P 4AJ
    Email: Tony.Rose@reuters.com

    -----------------------------------------------------------------
            Visit our Internet site at http://www.reuters.com

    Any views expressed in this message are those of the individual
    sender, except where the sender specifically states them to be
    the views of Reuters Ltd.



    This archive was generated by hypermail 2b29 : Tue Aug 14 2001 - 12:44:48 MET DST