Re: algorithms to generate word forms

Alex Chengyu Fang (alex@phonetics.ucl.ac.uk)
Thu, 20 Feb 1997 23:05:20 +0000

At 03:58 PM 21/2/97, John Milton wrote:
>Can anyone tell me whether there is a publicly available set of algorithms that
>1) stems, i.e., produces real-word lemmas from the inflected and derived
>forms and
>2) generates the derivational & inflected forms from the base form.
>

The fully automatic text annotation system, AUTASYS 4.0, tags unrestricted
English text with the LOB, ICE, or SKELETON tagset and produces a lemma for
each tagged item. The output has one item per line, giving the tag, the word
form as it occurs in the text, and the lemma:

PRON(pers,sing) She she
AUX(prog,past) was be
ADV(add) too too
ADV(ge) late late
PUNC(com) , ,
V(intr,ingp) arriving arrive
PREP(ge) at at
ART(def) the the
N(com,sing) church church
ADV(excl) just just
PREP(ge) as as
ART(def) the the
N(com,sing) ceremony ceremony
AUX(pass,past) was be
V(montr,edp) completed complete
PUNC(per) . .
PREP(ge) With with
V(montr,ingp) intervening intervene
N(com,plu) circumstances circumstance
PUNC(com) , ,
ART(def) the the
ADV(phras) run-away run-away
N(com,sing) couple couple
AUX(pass,subjun) were be
ADV(ge,comp) later late
V(montr,edp) forgiven forgive
CONJUNC(coord) and and
NUM(card,sing) 6 6
N(com,plu) years year
ADV(ge,comp) later late
PUNC(com) , ,
PREP(ge) with with
NUM(card,sing) three three
ADJ(ge) young young
N(com,plu) children child
PUNC(com) , ,
N(prop,sing):1/2 John John
N(prop,sing):2/2 Bunyan Bunyan
PUNC(com) , ,
N(prop,sing) Margaret Margaret
CONJUNC(coord) and and
N(prop,sing) Jessie Jessie
PUNC(com) , ,
PRON(pers,sing) they they
V(intr,past) sailed sail
PREP(ge) to to
N(prop,sing):1/2 New New
N(prop,sing):2/2 Zealand Zealand
PREP(ge) on on
ART(def) the the
N(prop,sing):1/2 scow scow
N(prop,sing):2/2 Spray Spray
PUNC(com) , ,
V(intr,past) built build
CONJUNC(coord) and and
V(intr,past) owned own
PREP(ge) by by
N(prop,sing) Hugh Hugh
N(prop,sing) McKenzie McKenzie
GENM 's 's
N(com,plu) cousins cousin
PUNC(com) , ,
N(prop,sing) Duncan Duncan
CONJUNC(coord) and and
N(prop,sing):1/2 Angaus Angaus
N(prop,sing):2/2 Matheson Matheson
PUNC(per) . .
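
For anyone who wants to post-process output of this kind, here is a minimal
Python sketch that reads one line of the listing into its parts. It is not
part of AUTASYS; the names Item, TAG_RE and parse_line are made up for
illustration. It assumes the three fields are separated by whitespace and
that each word form is a single token, as in the sample above.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Item(NamedTuple):
    """One line of the listing (hypothetical structure, not an AUTASYS type)."""
    category: str                    # e.g. "V" in "V(intr,ingp)"
    features: Tuple[str, ...]        # e.g. ("intr", "ingp"); empty for "GENM"
    part: Optional[Tuple[int, int]]  # (1, 2) for a ":1/2" suffix, else None
    word: str                        # word form as it occurs in the text
    lemma: str                       # lemma given in the last column


# Tag shape inferred from the sample only: CATEGORY, optional "(f1,f2)",
# optional ":i/n" marking item i of an n-item unit such as "New Zealand".
TAG_RE = re.compile(r"^([A-Z]+)(?:\(([^)]*)\))?(?::(\d+)/(\d+))?$")


def parse_line(line: str) -> Item:
    """Split a 'TAG word lemma' line into an Item."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    tag, word, lemma = fields
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognised tag: {tag!r}")
    category, feats, index, total = match.groups()
    features = tuple(feats.split(",")) if feats else ()
    part = (int(index), int(total)) if index else None
    return Item(category, features, part, word, lemma)


if __name__ == "__main__":
    for sample in ("V(intr,ingp) arriving arrive",
                   "N(prop,sing):1/2 New New",
                   "GENM 's 's"):
        print(parse_line(sample))
```

Keeping the category and features apart from the lemma makes it easy to build
a lookup keyed on word form plus category, since the same word form can have
different lemmas under different tags.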

---------------------------------------
Alex Chengyu Fang
Research Fellow
Department of Phonetics and Linguistics
University College London
Gower Street, London WC1E 6BT
U.K.

E-Mail: alex@phonetics.ucl.ac.uk
Tel: 0171 388 4309
0171 387 7050 ext. 3169
Fax: 0171 383 4108
---------------------------------------