[Corpora-List] Meaning of Semcor annotations

From: Jose Maria Gomez Hidalgo (jmgomez@dinar.esi.uem.es)
Date: Thu May 22 2003 - 23:53:55 MET DST

    Dear all

    I am running some experiments with the semantic concordance SemCor, and
    I am having difficulty interpreting the available documentation. Each
    tagged word in SemCor is labelled with its meaning in WordNet:

    <wf cmd=done pos=VB lemma=say wnsn=1 lexsn=2:32:00::>said</wf>

    The word "said" has the part of speech VB (verb), its lemma is "say", and
    the corresponding meaning in WordNet can be found by looking up "say" and
    selecting the first sense (attribute wnsn). According to the
    documentation, the attribute lexsn, appended to the lemma, identifies the
    WordNet synset for that meaning.
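
    To make the attributes concrete, here is a minimal Python sketch that
    reads one <wf> element and joins lemma and lexsn with "%", which is the
    form WordNet uses for sense keys. The function names parse_wf and
    sense_key are my own, and the regular expressions assume the unquoted
    attribute style shown in the example above.

```python
import re

# One <wf ...>surface</wf> element with unquoted attributes.
WF_RE = re.compile(r"<wf\s+([^>]*)>(.*?)</wf>")
# A single name=value pair; values run up to the next whitespace.
ATTR_RE = re.compile(r"(\w+)=(\S+)")


def parse_wf(line):
    """Return (attributes, surface form) for the first <wf> element, or None."""
    match = WF_RE.search(line)
    if match is None:
        return None
    attrs = dict(ATTR_RE.findall(match.group(1)))
    return attrs, match.group(2)


def sense_key(attrs):
    """Join lemma and lexsn into a WordNet-style sense key."""
    return f"{attrs['lemma']}%{attrs['lexsn']}"


attrs, surface = parse_wf(
    "<wf cmd=done pos=VB lemma=say wnsn=1 lexsn=2:32:00::>said</wf>"
)
print(surface, attrs["pos"], attrs["wnsn"], sense_key(attrs))
# prints: said VB 1 say%2:32:00::
```

    Elements without a lemma or lexsn attribute (untagged words, for
    instance) would need a guard before calling sense_key.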

    However, the lexsn attribute value is not unique to a synset. Many other
    words in SemCor carry the same value:

    <wf cmd=done pos=VB lemma=consider wnsn=4 lexsn=2:32:00::>considering</wf>
    <wf cmd=done pos=VB lemma=revise wnsn=1 lexsn=2:32:00::>revised</wf>

    (all three extracted from brown1/tagfiles/br-a01)
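
    The collision can be reproduced with a short Python sketch over the three
    tokens quoted above. The helper group_by is hypothetical. Grouping by
    lexsn alone merges all three lemmata, while grouping by lemma plus lexsn
    separates them again but, by construction, can never place two different
    lemmata in the same group.

```python
from collections import defaultdict

# (lemma, lexsn) pairs copied from the three <wf> examples above.
tokens = [
    ("say", "2:32:00::"),
    ("consider", "2:32:00::"),
    ("revise", "2:32:00::"),
]


def group_by(pairs, key):
    """Collect the lemmata that share the same value of key(lemma, lexsn)."""
    groups = defaultdict(set)
    for lemma, lexsn in pairs:
        groups[key(lemma, lexsn)].add(lemma)
    return dict(groups)


by_lexsn = group_by(tokens, lambda lemma, lexsn: lexsn)
by_lemma_and_lexsn = group_by(tokens, lambda lemma, lexsn: f"{lemma}%{lexsn}")

print(len(by_lexsn), len(by_lemma_and_lexsn))
# prints: 1 3
```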

    Those words or lemmata do not belong to the same synset. It is important to
    know when word senses belong to the same synset, because this is how
    synonymous words __in the SemCor collection__ can be identified. The only
    way to know this, apart from consulting WordNet itself, is to have unique
    synset identifiers in SemCor. Is the information in the SemCor annotations
    enough to obtain that unique identification? How can we do it?
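
    One route that does consult WordNet, sketched here only as an assumption
    on my part: WordNet distributions include a sense index file
    (index.sense) whose lines start with a sense key followed by a synset
    offset. If lemma plus lexsn is a valid sense key, such a table would give
    a synset identifier per token. The function names below are hypothetical,
    and the sample lines use invented lemmata and placeholder offsets, not
    real WordNet data. Offsets are paired with the synset type because they
    are positions within a per-part-of-speech data file.

```python
from collections import defaultdict


def parse_index_sense(lines):
    """Map each sense key to a (ss_type, synset_offset) pair."""
    key_to_synset = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or "%" not in fields[0]:
            continue  # skip blank or malformed lines
        key, offset = fields[0], fields[1]
        # First field after "%" is the synset type (1 = noun, 2 = verb, ...).
        ss_type = key.split("%", 1)[1].split(":", 1)[0]
        key_to_synset[key] = (ss_type, offset)
    return key_to_synset


def synonym_groups(sense_keys, key_to_synset):
    """Group lemmata whose sense keys resolve to the same synset."""
    groups = defaultdict(set)
    for key in sense_keys:
        if key in key_to_synset:
            groups[key_to_synset[key]].add(key.split("%", 1)[0])
    # Keep only synsets shared by at least two different lemmata.
    return {syn: lemmas for syn, lemmas in groups.items() if len(lemmas) > 1}


# Placeholder lines in "sense_key offset sense_number tag_cnt" layout.
PLACEHOLDER_INDEX = [
    "alpha%2:32:00:: 00000001 1 0",
    "beta%2:32:01:: 00000001 2 0",
    "gamma%2:32:00:: 00000002 1 0",
]

index = parse_index_sense(PLACEHOLDER_INDEX)
print(synonym_groups(list(index), index))
# prints: {('2', '00000001'): {...}} with lemmata alpha and beta
```

    The WordNet version used to tag SemCor and the version of the sense
    index would have to match for the lookup to be reliable.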

    Thank you

    _______________________________________________________________________________

    Jose Maria Gomez Hidalgo
    Departamento de Inteligencia Artificial
    Universidad Europea de Madrid
    28670 - Villaviciosa de Odon - MADRID
    (+34) 912115670
    jmgomez@dinar.esi.uem.es
    http://www.esi.uem.es/~jmgomez/
    _______________________________________________________________________________

    Spanish law guarantees privacy in electronic communications. This
    electronic transmission is strictly confidential and intended solely for
    the addressee. If you are not the intended addressee, you are kindly
    requested not to disclose nor to copy this transmission and to notify us as
    soon as possible.


