Re: A thesaurus with parts of speech

t-markl@microsoft.com
Fri, 6 Oct 95 20:39:14 PDT

Hi Ari,

The 1911 Roget's thesaurus is freely available and lists words of four
parts of speech (noun, verb, adjective and adverb) for each category.

|Does anyone of you know if there exists a thesaurus (for any language)
|that for each word lists some related words of different parts of speech,
|i.e. not just synonyms, quasi-synonyms, homonyms, etc.; e.g., for "eye" it
|would list "tear", "look", and "cry", among others.

For example, '#441 Vision' includes amongst the
81 verbs: "peer", "look" and "pry".

However, it does not make the jump from "eye" to "cry".
It might be possible to establish the link indirectly though.
Category '#839 Lamentation', which includes "tear" and "cry",
also contains the following:
    with moistened eyes
    cry one's eyes out
    with watery eyes
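
As a rough sketch of that indirect link: the idea is to treat a category as
relevant to a word whenever any of its entries, including multi-word phrases,
mentions the word. The dictionary below is a hand-typed excerpt holding only
the entries quoted in this message, and the names CATEGORIES, mentions and
related are hypothetical. The plural matching is a crude assumption (plain
"-s" only), not a real morphological analysis.

```python
import re

# Hand-typed excerpt of two categories, using only entries quoted in this message.
CATEGORIES = {
    "#441 Vision": ["peer", "look", "pry"],
    "#839 Lamentation": [
        "tear", "cry",
        "with moistened eyes", "cry one's eyes out", "with watery eyes",
    ],
}

def tokens(entry):
    """Split an entry into lowercase word tokens (apostrophes kept inside words)."""
    return re.findall(r"[a-z']+", entry.lower())

def mentions(entry, word):
    """Crude match: the word itself or its plain '-s' plural appears as a token."""
    return any(t in (word, word + "s") for t in tokens(entry))

def related(word, categories=CATEGORIES):
    """Map each category that mentions `word` to its other single-word entries."""
    found = {}
    for name, entries in categories.items():
        if any(mentions(e, word) for e in entries):
            found[name] = [e for e in entries
                           if " " not in e and not mentions(e, word)]
    return found

print(related("eye"))  # {'#839 Lamentation': ['tear', 'cry']}
```

With the full category lists this would return many more candidates, so some
filtering or ranking would probably be needed.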

You can get it from Project Gutenberg, courtesy of Patrick Cassidy
at Micra Inc: ftp://mrcnext.cso.uiuc.edu/etext/etext91/roget13a.txt

The file is in human-readable form and so requires a bit of massaging
to get a machine-tractable version. I have done this already for
the nouns (see Lauer, 1995; Resnik, 1995), and if anyone would
like to use my version, please email me and I will try to get
back to you as soon as I can.
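
To give a feel for the kind of massaging involved, here is a minimal sketch
that splits category text into per-part-of-speech word lists. It is not the
conversion described above. The layout it expects (a "#number. Name. --"
header followed by groups starting with "N.", "V.", "Adj." or "Adv.") is an
assumption about the file, and SAMPLE is a made-up imitation of that layout
rather than text copied from it. The names HEADER, POS and parse are
hypothetical.

```python
import re
from collections import defaultdict

# Made-up sample imitating the assumed layout; not copied from the real file.
SAMPLE = """\
#441. Vision. -- V. peer, look, pry.
#839. Lamentation. -- N. tear, cry.
V. cry one's eyes out.
Adv. with moistened eyes, with watery eyes.
"""

# Assumed header form: "#441. Vision. -- <first part-of-speech group>"
HEADER = re.compile(r"^#(\d+)\.\s*([^.]+)\.\s*--\s*(.*)$")
# Assumed group form: a part-of-speech tag, a period, then the entries.
POS = re.compile(r"^(N|V|Adj|Adv)\.\s+(.*)$")

def parse(text):
    """Return {(number, name): {pos: [entries]}} for the assumed layout."""
    table = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            current = defaultdict(list)
            table[(int(header.group(1)), header.group(2))] = current
            line = header.group(3)  # rest of the header line holds the first group
        pos = POS.match(line)
        if pos and current is not None:
            # Entries are separated by commas or semicolons; drop the final period.
            entries = re.split(r"[,;]", pos.group(2).rstrip("."))
            current[pos.group(1)].extend(e.strip() for e in entries if e.strip())
    return table

print(parse(SAMPLE)[(441, "Vision")]["V"])  # ['peer', 'look', 'pry']
```

Wrapped continuation lines, cross-references and annotations are ignored in
this sketch, and handling them is likely where most of the real effort goes.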

Another possibility is to use WordNet, a freely available lexical
taxonomy developed by George Miller (1990) and associates. It consists
of small synonym sets (about 4 words in each) linked by various semantic
relations (ISA, HAS_PART, etc), and contains around 167,000 word senses
covering the same four parts of speech given above. However, it sounds
like you want a broader notion of 'related' than it offers.
ftp://clarity.princeton.edu/pub/wordnet/wn1.5unix.tar.gz.a

Best wishes,
Mark Lauer
Microsoft Institute
Sydney, Australia

Miller, G. (1990) WordNet: An On-line Lexical Database.
In International Journal of Lexicography, Vol. 3(4).

Lauer, M. (1995) Corpus Statistics Meet
The Compound Noun: Some Empirical Results.
In Proceedings of the 33rd Annual Meeting
of the Association for Computational Linguistics,
Cambridge, MA.

Resnik, P. (1995) Disambiguating Noun Groupings
with Respect to WordNet Senses.
In Proceedings of the Third Workshop on Very Large Corpora,
Cambridge, MA.