[Corpora-List] Sense-tagged corpora

From: phil.edmonds@sharp.co.uk
Date: Wed Aug 14 2002 - 20:59:55 MET DST

    Dear CORPORA List Members,

    We are preparing the Introduction to a Special Issue of the
    Journal of Natural Language Engineering on Evaluating WSD Systems
    and would like to include details of as many word-sense-tagged corpora
    as possible. If you have any such resource, for any language, we
    would be interested in hearing about it, ideally including details
    of:
       language
       size (total words, tagged words, and tagged word-types)
       text type
       date of collection
       purpose of collection
       source of the sense inventory
       availability

    No need to report on the following, which we are already aware of:

       SEMCOR
       HECTOR
       'line' corpus
       DSO corpus, Singapore
       Dutch children's books corpus
       Italian PAROLE corpus
       all datasets prepared for SENSEVAL 1 or 2

    We have also heard rumours of a picture library with sense-tagged captions
    on a large scale. More information most welcome.

    All leads and details of further sense-tagged corpora most welcome,

       Thank you in anticipation,

             Adam Kilgarriff and Phil Edmonds

    --
    Philip Edmonds                          (    phil@sharp.co.uk
    Sharp Laboratories of Europe Ltd         )   www.sle.sharp.co.uk
    Edmund Halley Road, Oxford Science Park (    +44 1865 747711 phone
    Oxford OX4 4GB, United Kingdom           )   +44 1865 714170 fax
    


