Corpora: Summary of lemmatizers

Yair Even-zohar (evenzoha@uiuc.edu)
Fri, 20 Aug 1999 18:00:28 -0500

Hi
A few weeks ago I asked for a lemmatizer and got a few replies.
Thanks to everyone who replied :-)

Here is a short list of available lemmatizers:

1) We have adapted the WordNet lemmatizer to fit our system.
You can look at the WordNet tools at:
http://www.cogsci.princeton.edu/~wn/

2) You can see and test a number of Xerox morphological analyzers at

http://www.xrce.xerox.com/research/mltt/Tools/morph.html

Morphological analyzers return the base form (lemma name), and in
addition they separate and identify all the morphemes in the word.
A morphological analyzer can quite easily be "downgraded" into
a lemmatizer.
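
As a rough illustration of that "downgrading", here is a small Python
sketch. It assumes the analyzer prints each analysis as the base form
followed by "+"-separated tags (e.g. "walk+Verb+Past"); that format and
the function names are assumptions made for the example, not the
documented output of the Xerox tools.

```python
def lemma_from_analysis(analysis, sep="+"):
    """Return the base form from one analysis string.

    Assumes (hypothetically) the shape 'lemma+Tag+Tag...'; everything
    after the first separator is discarded.
    """
    return analysis.split(sep, 1)[0]


def lemmas(analyses):
    """Collect the distinct lemmas from several analyses of one word,
    keeping the order in which they first appear."""
    seen = []
    for analysis in analyses:
        lemma = lemma_from_analysis(analysis)
        if lemma not in seen:
            seen.append(lemma)
    return seen
```

An ambiguous word can still yield more than one lemma this way (e.g.
"saw" as a noun or as the past tense of "see"), so the caller has to
decide how to choose among them.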

3) Another lemmatizer is a version of WordNet's 'morph', as improved by
the Universities of Sheffield, Sussex and Edinburgh - email
johnca@cogs.sussex.ac.uk

4) WordSmith Tools can lemmatize word lists, but you'll need a file with
the lemmatization rules. One such file for English is available at
http://www.liv.ac.uk/~ms2928
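
For anyone who wants to use such a rules file outside WordSmith, a
lookup table is probably enough. The Python sketch below assumes one
rule per line in the form "lemma -> form1,form2"; that layout is an
assumption, so check it against the actual file. The function names are
made up for the example.

```python
def parse_lemma_rules(lines):
    """Build a {word form: lemma} table from 'lemma -> form1,form2' lines.

    The line format is assumed, not taken from the file itself.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or "->" not in line:
            continue  # skip blank or malformed lines
        lemma, forms = line.split("->", 1)
        lemma = lemma.strip().lower()
        for form in forms.split(","):
            form = form.strip().lower()
            if form:
                table[form] = lemma
    return table


def lemmatize(word, table):
    """Look up a word (case-insensitively); unknown words come back
    lowercased but otherwise unchanged."""
    key = word.lower()
    return table.get(key, key)
```

With a file on disk you could pass the open file object straight to
parse_lemma_rules, since it only iterates over lines.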

cheers
-Yair