[Corpora-List] Semantic Similarity summary

From: Daniel Midgley (dmidgley@arts.uwa.edu.au)
Date: Sun Nov 03 2002 - 06:45:51 MET

    Thanks to all who responded to my inquiry about semantic similarity using
    distributional techniques.
    Here are some of the results.

    Websites:

    * http://www.ilc.pi.cnr.it/EAGLES96/rep2/node37.html
    A quick rundown of methods and concepts involved in Word Clustering.

    * http://www.cs.ualberta.ca/~lindek/demos.htm
    A quite enjoyable set of demos. You can type in words and look for similar
    words in a newspaper corpus based on measures of dependency-based and
    proximity-based similarity. There's also a "Usage Checker", where language
    learners can get suggestions for the most common usages for pairs of
    keywords (so as to avoid anomalous usages).

    * Latent Semantic Analysis
    http://lsa.colorado.edu/whatis.html
    A corpus-based method of analysing the content of documents based on word
    distribution.
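
    For readers new to the idea, here is a minimal sketch of proximity-based
    distributional similarity: each word is represented by counts of the words
    that occur near it, and two words are compared by the cosine of their
    count vectors. This is only an illustration. It is not the method used by
    the demos above, and it is not LSA, which additionally applies a
    dimensionality reduction step. The function names, the window size and
    the toy corpus are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            # Neighbours on both sides, excluding the word itself.
            context = tokens[start:i] + tokens[i + 1:end]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus; a real experiment would use a large corpus.
toy_corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
]
vectors = context_vectors(toy_corpus)
print(cosine(vectors["cat"], vectors["dog"]))
```

    On a corpus this small the numbers mean very little; the point is only
    that words sharing contexts receive a higher score.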

    Articles:

    * Demetriou G and Atwell E. 2001. A domain-independent semantic tagger for
    the study of meaning associations in English text. In Harry Bunt, Ielka
    van der Sluis and Elias Thijsse (editors), Proceedings of the Fourth
    International Workshop on Computational Semantics (IWCS-4) pp.67-80.
    Tilburg, Netherlands. ISBN: 90-74029-16-7.
    http://www.comp.leeds.ac.uk/eric/iwcs.ps

    * Wilson, A. and Rayson, P. 1993. Automatic Content Analysis of Spoken
    Discourse: a report on work in progress. In: C. Souter and E. Atwell
    (eds), Corpus Based Computational Linguistics. Amsterdam: Rodopi. pp. 215-226.
    http://www.comp.lancs.ac.uk/computing/research/ucrel/papers/war93.txt

    * Wilson, A. and Thomas, J.A. 1997. Semantic annotation,
    in Garside, R., Leech, G., and McEnery, A. (eds.) Corpus Annotation:
    Linguistic Information from Computer Text Corpora. Longman, London, pp.53-65.

    * Natural Semantic Metalanguage
    This is an ongoing attempt by Cliff Goddard and Anna Wierzbicka (and
    others) to find the language primitives (or the "semantic core") that are
    present in all languages.
    http://www.une.edu.au/arts/LCL/disciplines/linguistics/nsmpage1.htm

    * Sahlgren, M. 2001. Vector-Based Semantic Analysis: Representing Word
    Meanings Based on Random Labels.
    Author's website:
    http://www.sics.se/~mange/

    * Pereira F., Tishby N., and Lee L. (1993) Distributional clustering of
    English words. In Proc. of the 31st Annual Meeting of the ACL, pp. 183-190.
    http://citeseer.nj.nec.com/pereira93distributional.html

    * Vasileios Hatzivassiloglou and Kathleen McKeown. 1993. Towards the
    automatic identification of adjectival scales: Clustering of adjectives
    according to meaning. In 31st Annual Meeting of the ACL, pages 172-182.
    http://citeseer.nj.nec.com/context/114108/0 (Not the article itself, but
    similar ones.)

    A link to a query for CiteSeer:
    http://citeseer.nj.nec.com/cs?q=Distributional+clustering&submit=Search+Documents&cs=1

    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Daniel Midgley
    dmidgley@arts.uwa.edu.au
    + (61 8) 9371 3730
    http://www.cs.uwa.edu.au/~fontor


