Re: Corpora: testing association strength between elements of trigrams

From: Gabriel Pereira Lopes
Date: Wed Feb 16 2000 - 02:22:24 MET

    We have done this for n-grams, with better results, which contrast with the ones
    obtained by Dunning. See:

    J.F. Silva, G. Dias, S. Guilloré, J.G.P. Lopes. 1999. "Using LocalMaxs Algorithm
    for the Extraction of Contiguous and Non-contiguous Multiword Lexical Units". In:
    P. Barahona (ed.) Progress in Artificial Intelligence: 9th Portuguese Conference
    on AI, EPIA'99, Évora, Portugal, September 1999, Proceedings. Lecture Notes in
    Artificial Intelligence, Springer-Verlag, Vol. 1695, pp. 113-132.

    J.F. da Silva and J.G.P. Lopes. 1999. "Extracting Multiword Terms from Document
    Collections". In: Proceedings of the VExTAL: Venezia per il Trattamento Automatico
    delle Lingue, November 22-24, 1999.

    J.F. da Silva, J.G.P. Lopes, M.F. Xavier, and G. Vicente. 1999. "Relevant
    Expressions in Large Corpora". In: Anne Condamines, Cécile Fabre and Marie-Paule
    Péry-Woodley (eds.) Actes de l'atelier "Corpus et Traitement Automatique des
    Langues: Pour une réflexion méthodologique" (TALN'99), Institut d'Etudes
    Scientifiques, Cargèse, Corse (France), July 12-17, pp. 86-94. Published by ATALA.

    J.F. da Silva and J.G.P. Lopes. 1999. "A Local Maxima method and a Fair Dispersion
    Normalization for extracting multi-word units from corpora". In: Proceedings of
    the Sixth Meeting on Mathematics of Language (MOL6), Orlando, Florida, July 23-25,
    1999, pp. 369-381.

    Gaël Dias, Sylvie Guilloré, José Gabriel P. Lopes. 1999. "The Multilingual Aspects
    of Multiword Lexical Units". In: Spela Vintar (ed.) Proceedings of the Language
    Technologies Workshop, organized in the framework of the 32nd Annual Meeting of
    the Societas Linguistica Europaea (SLE99), Arts Faculty, University of Ljubljana,
    Ljubljana, Slovenia, July 8-11, 1999, pp. 11-21. ISBN 961-227-003-1

    DIAS, Gaël; Guilloré, Sylvie; Lopes, Gabriel (2000). Normalisation of
    Association Measures for Multiword Lexical Unit Extraction. In
    "International Conference on Artificial and Computational Intelligence
    for Decision, Control and Automation in Engineering and Industrial
    Applications", Monastir, Tunisia.

    DIAS, Gaël; Guilloré, Sylvie; Lopes, Gabriel (2000). Extraction
    Automatique d'Associations Textuelles à partir de Corpora non Traités.
    JADT 2000 : 5es Journées Internationales d'Analyse Statistique des
    Données Textuelles, Lausanne, Suisse.

    DIAS, Gaël; Guilloré, Sylvie; Lopes, Gabriel (1999): "Language
    Independent Automatic Acquisition of Rigid Multiword Units from
    Unrestricted Text corpora", Actes Traitement Automatique des Langues
    Naturelles. Institut d'Etudes Scientifiques, Cargèse, France.

    DIAS, Gaël; Guilloré, Sylvie; Lopes, Gabriel (1999): "Multilingual
    Aspects of Multiword Lexical Units", Actes Workshop on Language
    Technologies, Ljubljana, Slovenia.

    DIAS, Gaël; Vintar, Spela; Guilloré, Sylvie; Lopes, Gabriel (1999):
    "Identifying and Integrating Terminologically Relevant Multiword Units
    in the IJS-ELAN Slovene-English Parallel Corpus", Actes 10th CLIN,
    Utrecht Institute of Linguistics OTS.

    DIAS, Gaël; Guilloré, Sylvie; Lopes, Gabriel (1999): "Mutual
    Expectation: a Measure for Multiword Lexical Unit Extraction", Actes
    VExTAL Venezia per il Trattamento Automatico delle Lingue, Università Ca'
    Foscari, Venezia.

    DIAS, Gaël; Guilloré, Sylvie; Lopes, Gabriel (1999): "Multiword Lexical
    Units Extraction", Actes International Symposium on Machine Translation
    and Computer Language Information Processing, Beijing, China.

    Best regards,

    Gabriel Pereira Lopes

    John Colby wrote:

    > I would like to use likelihood ratios, as has been done in Dunning[1993]
    > for bigrams, to test the amount of association between the elements of
    > trigrams. Dunning did this for a bigram AB by determining if the distribution
    > of A given that B is present is the same as A given that B is not present.
    > To do something similar for trigrams, is it sufficient to determine for
    > a trigram ABC if the distribution of A given the presence of B and C is
    > the same as the distribution of A given that both B and C are not present?
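
For reference, the bigram test described in the quoted question can be sketched as below. This is only a minimal illustration, assuming the counts are arranged as a 2x2 contingency table (A and B together, A without B, B without A, neither) and using the standard G^2 log-likelihood ratio statistic. The function name and argument names are hypothetical and are not taken from Dunning's paper or from any of the papers listed above; the sketch covers only the bigram case and does not settle the trigram question.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table of bigram counts.

    Hypothetical argument layout (an assumption for this sketch):
      k11: A followed by B        k12: A followed by something other than B
      k21: non-A followed by B    k22: non-A followed by something other than B
    """
    total = k11 + k12 + k21 + k22
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing (limit of o * ln o)
                expected = row_sums[i] * col_sums[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(log_likelihood_ratio(10, 10, 10, 10))  # independent counts
    print(log_likelihood_ratio(30, 5, 5, 960))   # strongly associated counts
```

A value near zero is consistent with A and B occurring independently; larger values indicate a stronger departure from independence. Under the usual asymptotic assumption the statistic is compared against a chi-squared distribution with one degree of freedom.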
