Re: [Corpora-List] Developing and testing new similarity measures for word clustering

From: Dinoj Surendran (dinojs@gmail.com)
Date: Fri Oct 08 2004 - 23:19:43 MET DST

    One place where similarity evaluation metrics have come up is in, of
    all places, the music community...

    http://www.ee.columbia.edu/~dpwe/research/musicsim/metrics.html

    Also see the list of papers here:
    http://www.ee.columbia.edu/~dpwe/research/musicsim/

    I'd be really interested to see what you come up with, and hope you
    post a summary.

    Cheers,

    Dinoj Surendran
    PhD Student
    Computer Science Department
    University of Chicago
    http://people.cs.uchicago.edu/~dinoj

    On Fri, 08 Oct 2004 08:47:02 -0400, Normand Peladeau
    <peladeau@simstat.com> wrote:
    > I have been reviewing some of the similarity measures used to perform word
    > clustering (Jaccard, Dice, Simple Matching, correlation, etc.) and I came
    > to the conclusion that many of those measures have some metric problems that
    > probably make them non-optimal for word clustering.
    >
    > I am working now on some modified versions of those indices and I need some
    > ways to benchmark those new similarity measures. I would like to have a
    > series of benchmarks for several kinds of application (dimension reduction,
    > automatic identification of themes, automatic taxonomy development, etc.).
    >
    > I would like suggestions for ways to benchmark those new measures and
    > compare their performance with the more traditional ones. Any ideas,
    > references, or data sets would be welcome.
    >
    > I am also looking for existing articles where those measures have been
    > compared (either empirically or theoretically).
    >
    > Thanks,
    >
    > Normand Peladeau
    > Provalis Research
    >
    >
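
    For reference, three of the binary measures named in the quoted message
    (Jaccard, Dice, Simple Matching) can be sketched as below. This is a
    minimal illustration, not anything from either poster: it assumes each
    word is represented as a 0/1 vector of occurrence across documents, and
    the function names and the toy vectors are made up for the example. It
    only shows the standard textbook definitions, not the modified indices
    the original message mentions.

```python
def contingency(x, y):
    """Count the cells of the 2x2 table for two equal-length 0/1 vectors.

    a: both present, b: only x present, c: only y present, d: both absent.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi and yi:
            a += 1
        elif xi:
            b += 1
        elif yi:
            c += 1
        else:
            d += 1
    return a, b, c, d


def jaccard(x, y):
    """a / (a + b + c); joint absences are ignored."""
    a, b, c, _ = contingency(x, y)
    denom = a + b + c
    return a / denom if denom else 0.0  # convention chosen here for empty vectors


def dice(x, y):
    """2a / (2a + b + c); like Jaccard, but joint presences count double."""
    a, b, c, _ = contingency(x, y)
    denom = 2 * a + b + c
    return 2 * a / denom if denom else 0.0


def simple_matching(x, y):
    """(a + d) / (a + b + c + d); joint absences count as agreement."""
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    return (a + d) / n if n else 0.0


# Hypothetical occurrence vectors for two words over ten documents.
word_x = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
word_y = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(round(jaccard(word_x, word_y), 3))  # 0.333
print(dice(word_x, word_y))               # 0.5
print(simple_matching(word_x, word_y))    # 0.8
```

    On sparse data like this, Simple Matching comes out high mostly because
    of the seven documents where neither word occurs, while Jaccard and Dice
    ignore those. Whether that is one of the problems the original poster has
    in mind is not stated in the message.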


