Re: [Corpora-List] semantic similarity

From: ted pedersen (tpederse@d.umn.edu)
Date: Thu Jan 20 2005 - 19:47:16 MET

    Hi Jana,

    WordNet-Similarity is an implemented system that will let you measure the
    semantic similarity between words in text using a host of well known
    methods, including those of Resnik, Jiang & Conrath, Leacock & Chodorow,
    Wu & Palmer, shortest path, adapted lesk, Hirst & St-Onge, and even a
    context vector measure. It does all this based on information from
    WordNet. The code is in Perl and is free, and of course WordNet is free
    too. Download it from:

            http://search.cpan.org/dist/WordNet-Similarity or
            http://wn-similarity.sourceforge.net
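
    To give a feel for what two of the simpler measures compute, here is a
    minimal sketch in Python (not the Perl package itself). It assumes a tiny
    hand-made is-a hierarchy in place of WordNet, so the PARENT table and all
    function names are hypothetical. The path measure is taken as 1 over the
    number of nodes on the shortest path, and Wu & Palmer as
    2 * depth(lcs) / (depth(a) + depth(b)), with the root at depth 1.

```python
# Toy is-a hierarchy standing in for WordNet (hypothetical data).
PARENT = {
    "vehicle": "entity",
    "animal": "entity",
    "car": "vehicle",
    "bus": "vehicle",
    "dog": "animal",
    "cat": "animal",
}


def ancestors(node):
    """Return the chain from node up to the root, node first."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lcs(a, b):
    """Least common subsumer: the deepest ancestor shared by a and b."""
    b_chain = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in b_chain:
            return candidate
    raise ValueError("no common ancestor")


def path_similarity(a, b):
    """1 / (number of nodes on the shortest is-a path between a and b)."""
    common = lcs(a, b)
    nodes = (depth(a) - depth(common)) + (depth(b) - depth(common)) + 1
    return 1.0 / nodes


def wup_similarity(a, b):
    """Wu & Palmer: 2 * depth(lcs) / (depth(a) + depth(b))."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))


if __name__ == "__main__":
    for pair in [("car", "bus"), ("car", "dog"), ("car", "car")]:
        print(pair, path_similarity(*pair), wup_similarity(*pair))
```

    On this toy data "car" comes out closer to "bus" than to "dog" under both
    measures, which is the kind of ranking the real package derives from
    WordNet.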

    Now, WordNet-Similarity will get you started in measuring semantic
    similarity (or relatedness, with a few of the measures). We have also
    been working on an algorithm based on WordNet-Similarity that measures
    how related a word is to its neighbors in a text.

    This algorithm is called WordNet-SenseRelate. It can be used with plain
    text, and is again based on WordNet. Our goal in this package is to
    carry out word sense disambiguation of all the content words in a text,
    but what's really happening under the surface is what you are aspiring
    to do, which is to find nearby words that are similar to each other
    (in our case according to the measures in WordNet-Similarity).

    Again in Perl, and again free. Download from:

            http://search.cpan.org/dist/WordNet-SenseRelate
            http://www.d.umn.edu/~tpederse/~senserelate.html
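
    The general idea of scoring a word against its neighbors can be sketched
    roughly as follows. This is a minimal Python illustration under stated
    assumptions, not the WordNet-SenseRelate code: the SENSES inventory, the
    RELATEDNESS scores, the window size, and all function names are made up
    for the example. Each candidate sense of a word is scored by summing,
    over the neighbors in the window, its best relatedness to any sense of
    that neighbor, and the highest-scoring sense is kept.

```python
# Hypothetical sense inventory and pairwise relatedness scores.
SENSES = {
    "bank": ["bank#finance", "bank#river"],
    "money": ["money#currency"],
    "deposit": ["deposit#finance", "deposit#geology"],
}

RELATEDNESS = {
    frozenset({"bank#finance", "money#currency"}): 0.9,
    frozenset({"bank#river", "money#currency"}): 0.1,
    frozenset({"bank#finance", "deposit#finance"}): 0.8,
    frozenset({"bank#river", "deposit#geology"}): 0.6,
    frozenset({"deposit#finance", "money#currency"}): 0.9,
    frozenset({"deposit#geology", "money#currency"}): 0.1,
}


def relatedness(sense_a, sense_b):
    """Symmetric lookup; sense pairs not listed score 0.0."""
    return RELATEDNESS.get(frozenset({sense_a, sense_b}), 0.0)


def disambiguate(words, window=2):
    """Pick, for each word, the sense most related to its neighbors.

    Words missing from SENSES get None and are ignored as neighbors.
    """
    result = []
    for i, word in enumerate(words):
        candidates = SENSES.get(word)
        if not candidates:
            result.append(None)
            continue
        # Known words within `window` positions on either side.
        neighbors = [
            w for j, w in enumerate(words)
            if j != i and abs(j - i) <= window and w in SENSES
        ]

        def score(sense):
            # Sum, over neighbors, of the best match among their senses.
            return sum(
                max(relatedness(sense, s) for s in SENSES[n])
                for n in neighbors
            )

        result.append(max(candidates, key=score))
    return result


if __name__ == "__main__":
    print(disambiguate(["deposit", "money", "bank"]))
```

    Note that nothing here needs POS tags or a seed term: the only inputs are
    the words themselves and a relatedness function over their senses.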

    I hope one or both of these are of interest to you. Let us know if you
    have any additional questions!

    Cordially,
    Ted

    On Thu, 20 Jan 2005, Jana Diesner wrote:

    > Dear list members,
    >
    > We are looking for strategies, algorithms or code to automatically find
    > single terms or multiple adjacent terms that are semantically similar within
    > and across documents. The approach must not require POS tagging or an
    > initial input of a reference term to the system. The resulting clusters of
    > semantically similar terms suggested by the system do not need to be
    > exclusive. We are familiar with secondstring, the software developed by
    > William Cohen, and semantic similarity based on string-edit distances.
    >
    >
    >
    > Thank you very much.
    >
    > Jana
    >
    >
    >
    > ____________________
    >
    > Jana Diesner
    > Carnegie Mellon University
    >
    > jdiesner@andrew.cmu.edu
    >
    >
    >

    --
    Ted Pedersen
    http://www.d.umn.edu/~tpederse
    


