RE: Corpora: lemma vs lexeme

Ji Donghong (dhji@krdl.org.sg)
Sat, 6 Nov 1999 13:55:42 +0800

Hi, all,

Related to this issue: is there any paper on automated lemmatisation
of English? Or is there a more complete list of English lemmas? By
"lemma" I mean a root. For example, "compute" is the lemma of "computer",
"computing", "computation", "computational", "computers", "computes",
"computed", "computations", etc. Thanks.

Ji Donghong

-----Original Message-----
From: Mcenery, Tony [SMTP:eiaamme@exchange.lancs.ac.uk]
Sent: Friday, November 05, 1999 7:51 PM
To: 'Paul Hays'; Przemyslaw Kaszubski
Cc: corpus list
Subject: RE: Corpora: lemma vs lexeme

Hi Paul

> I did a Ph.D. with Sinclair at Birmingham in the early 90's which
> revolved around this topic. At that time, there were no efficient POS
> taggers and so lemmatization could not be carried out on a POS tagged
> text.
[Mcenery, Tony]
Sorry for the Anglo-centric joke, but I must say "Shome mishtake
shurely"? There were, I have good reason to believe, efficient POS taggers
available for English from the mid-80s. Automated lemmatisation of English was
also being undertaken around that time (at least). Sorry to nitpick.

Tony