Corpora: Suggestor algorithms

Martin Kay (kay@parc.xerox.com)
Tue, 7 Oct 1997 11:27:08 PDT

It seems to me that what you are looking for is less a new algorithm than
an embodiment of something along the following lines. You can design a
finite-state transducer that will do a half-way decent job of mapping
between spellings and their possible pronunciations. Composing this
transducer with its own inverse gives something that carries spellings onto
other spellings that share some possible pronunciation with the
original---interestingly enough, without containing any explicit
representation of pronunciations at all. Now what you probably want is a
composition of two machines that are not exact inverses of one another.
You probably want one that carries spellings onto *probable* pronunciations
on the input side and one that maps pronunciations onto *possible* spellings
on the output side.
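
As a rough illustration of the composition described above, here is a minimal Python sketch. It stands in for the transducers with a small lookup table, so unlike a real finite-state transducer it cannot handle unseen spellings. The lexicon, the ad hoc pronunciation strings, the probabilities, and the names SPELL_TO_PRON, invert, and suggest are all invented for the example. The input side keeps only pronunciations above a probability threshold (the *probable* ones), while the output side accepts any spelling that can possibly carry that pronunciation.

```python
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only):
# spelling -> {pronunciation: probability of that pronunciation}.
SPELL_TO_PRON = {
    "night": {"nayt": 1.0},
    "knight": {"nayt": 1.0},
    "nite": {"nayt": 0.9, "niti": 0.1},
    "read": {"riid": 0.6, "red": 0.4},
    "reed": {"riid": 1.0},
    "red": {"red": 1.0},
}


def invert(relation):
    """Return the inverse relation: pronunciation -> set of possible spellings.

    Probabilities are dropped on purpose: the output side only asks whether
    a spelling is *possible* for a pronunciation, not how likely it is.
    """
    inverse = defaultdict(set)
    for spelling, prons in relation.items():
        for pron in prons:
            inverse[pron].add(spelling)
    return inverse


def suggest(spelling, threshold=0.0):
    """Compose spelling->pronunciation with pronunciation->spelling.

    Only pronunciations whose probability exceeds `threshold` survive on the
    input side. The pronunciations are intermediate values and never appear
    in the result.
    """
    pron_to_spell = invert(SPELL_TO_PRON)
    candidates = set()
    for pron, prob in SPELL_TO_PRON.get(spelling, {}).items():
        if prob > threshold:
            candidates |= pron_to_spell[pron]
    candidates.discard(spelling)  # do not suggest the input itself
    return sorted(candidates)
```

With threshold=0.0 the two sides are exact inverses, so suggest("read") offers both "red" and "reed". Raising the threshold to 0.5 makes them asymmetric: suggest("read", threshold=0.5) keeps only "reed", yet suggest("red", threshold=0.5) still offers "read", because "read" remains a possible spelling of that pronunciation on the output side.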

--Martin Kay