Re: Corpora: Summary: automatic thesaurus generation

From: Mark Sanderson (m.sanderson@shef.ac.uk)
Date: Sun Jan 27 2002 - 21:20:05 MET

    I must have missed your query regarding this topic, but you might wish to
    look at a paper I wrote with Bruce Croft at SIGIR 1999 about concept
    hierarchies, essentially automatic generation of hierarchical thesauri.

    Also I did some work with a student of mine, Hideo Joho, who worked
    on Marti Hearst's ideas. That work was published at ACM CIKM 2000, and a
    somewhat related poster appeared at the HLT 2001 conference.

    Bruce Croft has a PhD student, Dawn Lawrie, who is working on aspects of
    concept hierarchies; I think she had a paper at SIGIR 2001.

    At 10:55 27/01/02 +0100, Adam Przepiórkowski wrote:
    >This is a brief summary of responses to my query regarding automatic
    >thesaurus generation from large corpora. I am very grateful to Bob
    >Krovetz, Johan Hagman, Sara Rydin and Bill Mann for helpful
    >suggestions.
    >
    >The following people worked or are working on automatic generation of
    >meaningful hierarchical thesauri:
    >
    >Sharon Caraballo (esp. her recent Ph.D. dissertation available from
    > her home page);
    >Marti Hearst (a 1992 paper available from Marti Hearst's home page);
    >Gregory Grefenstette (I found it more difficult to locate relevant
    > papers);
    >Johan Hagman (results will be presented at JADT
    > http://www.irisa.fr/manifestations/2002/JADT/programme.htm#programme);
    >Sara Rydin (started work on this for her Ph.D. thesis).
    >
    >Virtually all of the work I located concentrates on automatic
    >detection of hyponymy/hypernymy relations on the basis of textual
    >clues such as "X, including x, y and z" (this normally implies that x,
    >y and z are kinds of X).
    >
    >Bill Mann also mentions the Oingo search engine which, it is
    >claimed, actually takes advantage of such techniques.
    >
    >Best,
    >--
    > Adam P.
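
The textual clue mentioned in the summary above ("X, including x, y and z") can be illustrated with a small pattern matcher. This is only a minimal sketch of the general idea, not the method used in any of the cited work: the regular expression, the function name `extract_hyponyms`, and the sample sentence are all assumptions made for illustration. It treats the single word before "including" as the hypernym and will mis-handle multi-word terms or lists followed by a trailing clause.

```python
import re

# Assumed pattern: "<hypernym>, including <list of items>" up to the next
# sentence-level punctuation mark. The hypernym is limited to one word.
PATTERN = re.compile(r"(\w+),? including ([^.;:!?]+)")

# Items are separated by commas, "and", or ", and".
SEPARATOR = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs suggested by the 'including' clue."""
    pairs = []
    for match in PATTERN.finditer(text):
        hypernym = match.group(1)
        for item in SEPARATOR.split(match.group(2)):
            item = item.strip()
            if item:
                pairs.append((item, hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "He plays instruments, including guitar, piano and drums."
    for hyponym, hypernym in extract_hyponyms(sentence):
        print(f"{hyponym} is a kind of {hypernym}")
```

A realistic system would presumably work over part-of-speech-tagged noun phrases and several such patterns rather than a single regular expression over raw text.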

    ____________________________________________________________________
    Mark Sanderson, Room 303              Tel : +44 (0) 114 22 22648
    Department of Information Studies     Fax : +44 (0) 114 27 80300
    University of Sheffield,              mailto:m.sanderson@shef.ac.uk
    Western Bank, Sheffield, S10 2TN, UK  http://dis.shef.ac.uk/mark/
    ____________________________________________________________________
    Good judgement comes from experience but experience comes from bad judgement


