Re: [Corpora-List] Meaning of Semcor annotations

From: Rada Mihalcea (rada@cs.unt.edu)
Date: Fri May 23 2003 - 11:13:24 MET DST

    Hi Jose,

    > The word "said" has the part of speech VB (verb), its lemma is "say", and
    > the corresponding meaning in WordNet can be got by searching for "say" and
    > selecting the first sense (attribute wnsn). The attribute lexsn, according
    > to the documentation, and appended to the lemma, identifies the WordNet
    > synset for that meaning.
    The attribute lexsn, appended to the lemma, will uniquely identify the
    meaning of the word (what you obtain in this way is not, however, the
    synset; instead, it points to one unique synset).

    > However, the lexsn attribute value is not unique for the synset. Many other
    > words in SemCor have the same value:
    The lexsn by itself does not give you much useful information. As
    indicated by the WordNet manuals, the fields in the lexsn indicate: part of
    speech, number of the lexicographer file, and number within that file
    (adjectives would have some additional information). So there may be
    hundreds, or even thousands, of different words having an identical lexsn
    (most of the words in a lexicographer file will have an identical lexsn).
    You may want to check out the WordNet manuals for additional information:
    http://www.cogsci.princeton.edu/~wn/man1.7.1/senseidx.5WN.html
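
    To make the field layout concrete, here is a minimal Python sketch that
    splits a lexsn into its parts. The field names follow the senseidx manual
    page (ss_type, lex_filenum, lex_id, head_word, head_id); the function name
    parse_lexsn and the sample values are only illustrative assumptions, not
    taken from SemCor itself.

```python
from typing import NamedTuple


class LexSense(NamedTuple):
    ss_type: int      # synset type: 1 noun, 2 verb, 3 adjective, 4 adverb, 5 adjective satellite
    lex_filenum: int  # number of the lexicographer file
    lex_id: int       # number of the sense within that lexicographer file
    head_word: str    # only filled in for adjective satellites
    head_id: str      # only filled in for adjective satellites


def parse_lexsn(lexsn: str) -> LexSense:
    """Split a lexsn such as '2:32:00::' into its five colon-separated fields."""
    parts = lexsn.split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(parts)}: {lexsn!r}")
    ss_type, lex_filenum, lex_id, head_word, head_id = parts
    return LexSense(int(ss_type), int(lex_filenum), int(lex_id), head_word, head_id)


# Illustrative value in SemCor's format: a verb (2) in lexicographer file 32.
print(parse_lexsn("2:32:00::"))
```

    Nothing in these fields refers to the lemma, which is why the same lexsn
    shows up on many unrelated words.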

    > Those words or lemmata do not belong to the same synset. It is important to
    > know when word senses belong to the same synset, because this way synonym
    > words __in the SemCor collection__ can be identified. The only way to know
    > this, apart of consulting WordNet itself, is having unique synset
    > identifiers in SemCor. Is the information in Semcor annotations enough to
    > get that unique identification? How can we do it?
    The easiest (and perhaps the fastest) way to find the synset of a word,
    given its lemma and lexsn, is to construct the sense key of the word
    (lemma%lexsn) and look it up in the index.sense file provided with the
    WordNet data files.
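
    A rough Python sketch of that lookup follows. It assumes each line of
    index.sense has the layout described in the senseidx manual page
    (sense_key synset_offset sense_number tag_cnt). The helper names and the
    sample lines, including the offsets, are made up for illustration; real
    values come from your own index.sense.

```python
from typing import Dict, Iterable, Optional, Tuple


def load_sense_index(lines: Iterable[str]) -> Dict[str, str]:
    """Map each sense key to its synset offset (kept as a string)."""
    index: Dict[str, str] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        index[fields[0]] = fields[1]
    return index


def find_synset(index: Dict[str, str], lemma: str, lexsn: str) -> Optional[Tuple[str, str]]:
    """Return (ss_type, synset_offset) for lemma%lexsn, or None if not found."""
    sense_key = f"{lemma.lower()}%{lexsn}"
    offset = index.get(sense_key)
    if offset is None:
        return None
    # The first lexsn field is the synset type; offsets are only unique
    # within one data file, so keep the type together with the offset.
    return (lexsn.split(":")[0], offset)


# Made-up sample lines standing in for index.sense (offsets are NOT real).
sample = [
    "say%2:32:00:: 00000111 1 0",
    "state%2:32:00:: 00000111 1 0",
    "run%2:38:00:: 00000222 1 0",
]
index = load_sense_index(sample)
# With the real file (path is an assumption):
#   with open("index.sense") as f:
#       index = load_sense_index(f)

same = find_synset(index, "say", "2:32:00::") == find_synset(index, "state", "2:32:00::")
print(same)
```

    Two SemCor words are synonyms in the WordNet sense exactly when this
    lookup returns the same pair for both of them.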

    hope this helps,
    -Rada


