Re: [Corpora-List] corpus ------>>>>> thesaurus

From: Dominic Widdows (widdows@maya.com)
Date: Tue Nov 09 2004 - 15:42:07 MET


    > Hi Vladimir,
    >
    > You can find a good introduction to lexical acquisition methods
    > based on co-occurrence statistics in Manning and Schuetze's
    > "Foundations of Statistical Natural Language Processing".

    Hi Vladimir,

    Just to add to Viktor's suggestion - we have a few demos of thesaurus
    generation / lexical acquisition, some of which are based directly on
    Schuetze's work, at
    http://infomap.stanford.edu/webdemo

    There are a couple of fairly domain-specific models built from the
    Ohsumed medical corpus and the Wall Street Journal (though the latter
    has a lot of general topics as well).

    You can find links to papers (including work on mapping words and
    senses from corpus-derived models into hand-built lexical resources)
    and some software for processing corpora into vector word-association
    models (using a form of latent semantic analysis) on the main site at
    http://infomap.stanford.edu/
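
    For a rough feel of what a vector word-association model does, here
    is a minimal Python sketch. It is not the Infomap software: it only
    counts co-occurrences in a small window and ranks words by cosine
    similarity, and it leaves out the dimension-reduction (SVD) step that
    latent semantic analysis adds. The toy corpus, the window size and
    the function names are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def neighbours(word, vectors, n=3):
    """Return the n words whose co-occurrence vectors are closest to `word`."""
    target = vectors.get(word, Counter())
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical toy corpus; a real model would be built from something
# the size of Ohsumed or the Wall Street Journal.
sentences = [
    "the doctor treated the patient".split(),
    "the nurse treated the patient".split(),
    "the bank raised the rate".split(),
]
vectors = cooccurrence_vectors(sentences)
for other, score in neighbours("doctor", vectors):
    print(f"{other}\t{score:.3f}")
```

    Words that occur in similar contexts end up with similar vectors, so
    the nearest neighbours of a word form a crude thesaurus entry. On
    real corpora the raw counts are sparse and noisy, which is why the
    reduction step matters in practice.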

    Best wishes,
    Dominic


