Re: [Corpora-List] Synonyms

From: Anna-Maria De Cesare (decesare@duke.edu)
Date: Fri May 23 2003 - 16:35:01 MET DST


    Hello again,

    I would like to thank every person who took the time to answer my
    question about synonyms and the use of quantitative methods to analyze
    them. I received numerous helpful links and materials (see below for a
    summary). Thank you to Mark Turner, Viktor Pekar, Stefan Th. Gries, Rada
    Mihalcea, Guy Aston, Stefan Schneider, Antoinette Renouf and Ramesh
    Krishnamurthy.

    I forgot to translate the two words I am interested in: Italian 'perfino'
    and 'persino' = 'even' (as a focus particle). They are function words, so
    it would (maybe?) be even more surprising to find two identical forms
    here. Some preliminary tests, however, suggest they are very close
    indeed. What I am wondering is this: if I use a quantitative measure such
    as the z-score, how large a difference between two scores (the point of
    delicacy or resolution) would be enough to distinguish synonyms from
    near-synonyms?
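
    To make the z-score comparison concrete, here is a minimal sketch in
    Python. It assumes the common Berry-Rogghe-style collocation z-score,
    z = (O - E) / sqrt(E * (1 - p)), with p = F(collocate) / (N - F(node)) and
    E = p * F(node) * window size. The function names, the window of two words
    on either side, the cut-off of 0 and the toy sentence are all assumptions
    made up for illustration; nothing here comes from CORIS, and the sketch
    does not settle what cut-off would be appropriate.

```python
import math
from collections import Counter


def collocate_zscores(tokens, node, span=2):
    """Z-score for every word seen within `span` tokens of `node`.

    Hypothetical helper: assumes the Berry-Rogghe-style formula
    z = (O - E) / sqrt(E * (1 - p)).
    """
    n_tokens = len(tokens)
    freq = Counter(tokens)
    f_node = freq[node]

    # Observed co-occurrence counts inside the window around each node hit.
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - span), min(n_tokens, i + span + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] != node:
                observed[tokens[j]] += 1

    scores = {}
    for coll, obs in observed.items():
        p = freq[coll] / (n_tokens - f_node)   # chance of coll at any non-node slot
        if p >= 1.0:                           # degenerate corpus, z undefined
            continue
        expected = p * f_node * 2 * span       # 2 * span slots per node occurrence
        scores[coll] = (obs - expected) / math.sqrt(expected * (1 - p))
    return scores


def compare_collocates(tokens, word_a, word_b, span=2, threshold=0.0):
    """Return (shared, different) collocates whose z-score exceeds `threshold`.

    The threshold is an assumption; choosing it is exactly the open question.
    """
    za = collocate_zscores(tokens, word_a, span)
    zb = collocate_zscores(tokens, word_b, span)
    sig_a = {w for w, z in za.items() if z > threshold}
    sig_b = {w for w, z in zb.items() if z > threshold}
    return sig_a & sig_b, sig_a ^ sig_b


# Invented toy sentence, only to show the call; a real test needs a corpus.
toy = "lo sa perfino il bambino e lo sa persino il nonno".split()
shared, different = compare_collocates(toy, "perfino", "persino", span=2)
print("shared:", sorted(shared))
print("different:", sorted(different))
```

    On real data, the share of significant collocates that the two words have
    in common (and how far their z-scores diverge on the rest) would be one
    possible way to quantify how close they are.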

    Here is a list of articles and links I was referred to. Some of them are
    directly downloadable from the Internet.

    Christopher D. Manning & Hinrich Schuetze (1999) Foundations of
    Statistical Natural Language Processing, MIT Press, Massachusetts, US, Pp.
    680. (Chapter 8). With an implementation of the idea at:
    http://clg.wlv.ac.uk/demos/similarity/index.html .

    Church, Kenneth Ward and Patrick Hanks. 1990. Word Association Norms,
    Mutual Information, and Lexicography. Computational Linguistics 16:22-29.

    Church, Kenneth Ward, William Gale, Patrick Hanks and Donald Hindle.
    1991. Using Statistics in Lexical Analysis. In: Zernik, Uri (ed.).
    Lexical Acquisition: Exploiting On-line Resources to Build a Lexicon.
    Hillsdale, NJ: Lawrence Erlbaum, p. 115-164.

    Church, Kenneth Ward, William Gale, Patrick Hanks, Donald Hindle and
    Rosamund Moon. 1994. Lexical Substitutability. In: Atkins, Beryl T. Sue
    and Antonio Zampolli (eds.). Computational Approaches to the Lexicon.
    Oxford, New York: Oxford University Press, p. 153-177.

    Dekang Lin et al.: "Identifying Synonyms among Distributionally Similar
    Words" at: http://www.cs.ualberta.ca/~lindek/papers.htm.

    Diana Zaiu and Graeme Hirst's work on "near-synonymy" (and also Phil
    Edmonds). More material is available from their CL group web page
    http://www.cs.toronto.edu/compling/

    Gries, Stefan Th. 2001. A corpus-linguistic analysis of -ic and -ical
    adjectives. ICAME Journal 25:65-108.

    Gries, Stefan Th. 2003. Testing the sub-test: A collocational-overlap
    analysis of English -ic and -ical adjectives. International Journal of
    Corpus Linguistics 8(1):31-61. (which will come out in a few weeks or so).

    Krishnamurthy, R. 1996: Ethnic, Racial and Tribal: The Language of Racism?
    (in Texts and Practices, eds. Caldas-Coulthard & Coulthard, Routledge,
    London)

    Krishnamurthy, R. 2000: Collocation: from silly ass to lexical sets
    (in Heffer, C. and Sauntson, H. (eds) 'Words in Context: A Tribute to
    John Sinclair on his Retirement'. Birmingham 2000)

    Krishnamurthy, R. (forthcoming): Corpus, Collocation, and Lexical Sets, in
    Proceedings of HUSSE (Hungarian Society for the Study of English) Thematic
    Conference, "Empirically Based Approaches to Linguistic Description",
    University of Debrecen, Hungary. [about sad/unhappy]

    Many thanks again,

    Anna-Maria De Cesare

    On Fri, 16 May 2003, Anna-Maria De Cesare wrote:

    >
    > Hello!
    >
    > I am currently working on two Italian words ('perfino' and 'persino'),
    > which I suspect to be absolute synonyms. My goal is to demonstrate their
    > synonymy by using quantitative methods (I will use the Italian corpus
    > CORIS, not yet pos-tagged).
    >
    > I was wondering if anybody could refer me to similar studies or could give
    > me a hint of how to proceed. Any suggestion is welcome!
    >
    > Thank you very much in advance for your time,
    >
    > Anna-Maria De Cesare
    > -------
    > decesare@uchicago.edu
    > decesare@duke.edu
    > Visiting Scholar
    > Dept. of Romance Languages
    > and Literatures
    > University of Chicago
    >
    >


