Re: [Corpora-List] token clustering tool

From: Jose Maria Gomez Hidalgo (jmgomez@uem.es)
Date: Tue May 11 2004 - 10:19:33 MET DST


    At 09:24 11/05/2004, Murk Wuite wrote:
    >Dear all,
    >
    >Does anyone know of a tool (or algorithm), preferably available freely
    >for research purposes, that takes as its input a corpus only and
    >produces as its output clusters of tokens that occur close to each other
    >relatively often?

    The document clustering toolkit CLUTO may fit your needs, perhaps
    with some adaptation.
    http://www-users.cs.umn.edu/~karypis/cluto/
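
    As a rough illustration of what "some adaptation" could mean: a
    document clustering tool expects one feature vector per object, so
    each token would first need a vector describing the tokens that
    occur near it. The Python sketch below is only an assumption about
    one way to prepare such data; it is not part of CLUTO, and the
    function names and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(tokens, window=2):
    """Count how often each unordered pair of distinct tokens occurs
    within `window` positions of each other."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look only ahead, so each pair of positions is counted once.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts


def cooccurrence_vectors(tokens, window=2):
    """Turn the pair counts into one sparse neighbour vector per token.
    Vectors like these could then be handed to a clustering tool."""
    vectors = defaultdict(Counter)
    for (a, b), n in cooccurrence_counts(tokens, window).items():
        vectors[a][b] += n
        vectors[b][a] += n
    return vectors


if __name__ == "__main__":
    corpus = "the cat sat on the mat".split()
    for token, neighbours in sorted(cooccurrence_vectors(corpus, window=1).items()):
        print(token, dict(neighbours))
```

    Raw counts favour frequent tokens, so some normalisation would
    probably be needed before clustering if the aim is tokens that
    co-occur "relatively often".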

    >Best wishes,
    >
    >Murk Wuite
    >MA student at the Department of Language and Speech, Katholieke
    >Universiteit Nijmegen, The Netherlands

    Jose Maria Gomez Hidalgo
    Departamento de Inteligencia Artificial
    Universidad Europea de Madrid
    28670 - Villaviciosa de Odon - MADRID
    (+34) 912115670
    jmgomez@uem.es
    http://www.esi.uem.es/~jmgomez/

    Spanish law guarantees privacy in electronic communications. This
    electronic transmission is strictly confidential and intended solely for
    the addressee. If you are not the intended addressee, you are kindly
    requested not to disclose or copy this transmission and to notify us as
    soon as possible.


