[Corpora-List] in need of a specialized lexicon (summary)

From: Joel Tetreault (tetreaul@cs.rochester.edu)
Date: Mon Sep 27 2004 - 18:12:04 MET DST


    Hi, I'd like to thank everyone who emailed me about my request for a
    comprehensive lexicon containing semantic (or quasi-semantic) noun
    features such as mass/count, abstract/concrete,
    object/measure/event/state/process, part/whole, and so on,
    in addition to verb frames with argument-type preferences.

    Here's a summary of the information provided by list members:

    1. Oxford Advanced Learner's Dictionary of Current English (text number
    0710 in the Oxford Text Archive, or at
    http://www.gtoal.com/wordgames/ota/710/ ) was prepared by Roger Mitton.
    It has noun features such as countable/uncountable/proper and an
    interesting but very non-standard verb frame structure. Note that the
    data was produced in 1986 and updated in 1992.
    (thanks to Jonathan Young <jonathan_young@comcast.net>)

    2. The Specialist Lexicon of the Unified Medical Language System (lexical
    needs for the medical community). "This
    lexicon contains over 220,000 terms and was developed to provide the
    lexical information needed for the SPECIALIST Natural Language
    Processing System. It is intended to be a general English lexicon that
    includes many biomedical terms. Coverage includes commonly occurring
    English words and biomedical vocabulary. The data elements in the
    lexicon describe syntactic characteristics of each entry, including
    inflection codes, case, gender, syntactic category, complements for
    verbs and nouns, modification types for adverbs, and more. This
    lexicon was developed as a free, publicly available resource, with only
    moderate restrictions (e.g., you can't claim it as your own)."

    3. http://www.clres.com/lexdata.html - links to lexicon data

    (thanks for the previous two to Ken Litkowski <ken@clres.com>)

    4. Longman Dictionaries:
    * Longman Dictionary of Contemporary English, Lisp version (LDOCE Lisp -
    1978):
    http://www.longman.com/dictionaries/research/reslisp.html
                                                                                    
    * Longman Dictionary of Contemporary English, NLP version (LDOCE NLP -
    2000):
    http://www.longman.com/dictionaries/research/resnlapp.html#4
                                                                                    
    (thanks to "Crowdy, Steve" <Steve.Crowdy@pearson.com>)

    5. Unitex: http://www-igm.univ-mlv.fr/~unitex/ has features such as
    animate, concrete, abstract, unit of measure, collective, etc., for
    English and French.

    (thanks to Sebastian Nagel <wastl@cis.uni-muenchen.de>)

    6. A comprehensive lexicon for Italian (7000 entries) and a smaller one for
    English (3300 entries). For a summary of his group's work, see Rodolfo
    Delmonte (1995), "Lexical Representations: Syntax-Semantics interface and
    World Knowledge," in Rivista dell'AI*IA (Associazione Italiana di
    Intelligenza Artificiale), Roma, pp. 11-16.

    Thanks to all who emailed me; it was a great help.

    Joel


