Corpora: Syllable prediction errors

Bruce L. Lambert (lambertb@uic.edu)
Fri, 19 Feb 1999 18:01:15 -0600

I tested my syllable prediction program against 2063 generic drug names
with known pronunciations. Using the code I posted yesterday, it predicted
the correct number of syllables for 93% of them. The errors are listed
below. They are mostly double-vowel problems, "qu" problems, and a few
assorted silent-"e" problems. I'll try to work around these and let you
know how I do. Others might want to try their techniques on the
"difficult" words below.
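
For anyone who wants a starting point, here is a minimal sketch of a
vowel-run counter that tries to sidestep two of the three error classes
named above ("qu" and final silent "e"). It is not the program posted
yesterday; the function name count_syllables and both rules are
assumptions made for illustration only.

```python
import re

def count_syllables(word):
    """Rough syllable count: one syllable per run of vowels, with two tweaks.

    Hypothetical baseline, not the program that produced the error list.
    """
    w = word.lower()
    # Treat "qu" as a consonant cluster so the "u" does not join a vowel run.
    w = w.replace("qu", "q")
    # Drop a final "e" (optionally followed by "s") that follows a consonant,
    # on the assumption that it is silent.
    w = re.sub(r"(?<=[^aeiouy])es?$", "", w)
    # Every remaining run of vowel letters counts as one syllable.
    groups = re.findall(r"[aeiouy]+", w)
    return max(1, len(groups))
```

On "Chloroquine", "Cocaine", "Quazepam" and "Dextrates" this gives 3, 2, 3
and 2, which match the listed pronunciations. It still undercounts vowel
pairs that are pronounced separately: "Amodiaquine" comes out as 4 rather
than 5 because "ia" is treated as a single run, so the double-vowel
problem remains open.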

Word    Pronunciation    Predicted Num Sylls
----    -------------    -------------------
"Acacia" "a kay(') sha" 4
"Acetylcysteine" "a se teel sis(') teen" 6
"Albutoin" "al byoo(') toyn" 4
"Aldesleukin" "al des loo(') kin" 5
"Allantoin" "a lan(') toyn" 4
"Amodiaquine" "am oh dye(') a kwin" 6
"Amquinate" "am kwin(') ate" 4
"Atovaquone" "a toe(') va kwone" 5
"Auranofin" "au rane(') oh fin" 5
"Aurothioglucose" "aur oh thye oh gloo(') kose" 7
"Bentoquatam" "ben(') toe kwa tam" 5
"Benzocaine" "ben(') zoe kane" 4
"Benzoin" "ben(') zoin" 3
"Benzoxiquine" "ben zox(') i kwin" 5
"Benzquinamide" "benz kwin(') a mide" 5
"Bromelains" "broe(') me lains" 4
"Buquinolate" "byoo kwin(') oh late" 5
"Caffeine" "kaf(') een" 3
"Capsaicin" "kap say(') sin" 4
"Carrageenan" "kar a gee(') nan" 5
"Chloroquine" "klor(') oh kwin" 4
"Chymopapain" "kye(') moe pa pane" 5
"Clioquinol" "klye oh kwin(') ole" 5
"Cloxyquin" "klox(') i kwin" 4
"Cocaine" "koe kane(')" 3
"Codeine" "koe(') deen" 3
"Coumermycin" "koo mer mye(') sin" 5
"Cyproquinate" "sye proe kwin(') ate" 5
"Decoquinate" "de koe kwin(') ate" 5
"Dexivacaine" "dex iv(') a kane" 5
"Dextrates" "dex(') trates" 3
"Dezaguanine" "de(') za gwan een" 5
"Diaziquone" "dye ay(') zi kwone" 5
"Dibucaine" "dye(') byoo kane" 4
"Elucaine" "e loo(') kane" 4
"Ethotoin" "eth(') oh toyn" 4
"Ethylenediamine" "eth i leen dye(') a meen" 7
"Etidocaine" "e ti(') doe kane" 5
"Eugenol" "yoo(') je nole" 4
"Fenleuton" "fen loo(') ton" 4
"Fenquizone" "fen(') kwi zone" 4
"Ferumoxides" "fer yoo mox(') ides" 5
"Flosequinan" "floe se(') kwi nan" 5
"Flumequine" "floo(') me kwin" 4
"Fluorescein" "flure(') e seen" 5
"Fluorometholone" "flure oh meth(') oh lone" 6
"Fluorosalan" "flure oh sa(') lan" 5
"Fluorouracil" "flure oh yoor(') a sil" 6
"Fluproquazone" "floo proe(') kwa zone" 5
"Fluquazone" "floo(') kwa zone" 4
"Fosquidone" "fos(') kwi done" 4
"Guaiapate" "gwye(') a pate" 5
"Guaifenesin" "gwye fen(') e sin" 6
"Guaithylline" "gwye(') thi lin" 5
"Guanabenz" "gwahn(') a benz" 4
"Guancydine" "gwahn(') si deen" 4
"Guanoxabenz" "gwahn ox(') a benz" 5
"Halquinols" "hal(') kwin oles" 4
"Hydroquinone" "hye(') droe kwin one" 5
"Imiquimod" "i mi kwi(') mod" 5
"Iodoquinol" "eye oh doe kwin(') ole" 6
"Isoleucine" "eye soe loo(') seen" 5
"Isotiquimide" "eye soe ti(') kwi mide" 6
"Isotretinoin" "eye soe tret(') i noyn" 6
"Laurocapram" "lau roe ka(') pram" 5
"Leniquinsin" "len i kwin(') sin" 5
"Leucine" "loo(') seen" 3
"Lidocaine" "lye(') doe kane" 4
"Mecloqualone" "me kloe kwah(') lone" 5
"Mefloquine" "me(') floe kwin" 4
"Mephenytoin" "me fen(') i toyn" 5
"Mequidox" "me(') kwi dox" 4
"Meteneprost" "me teen(') prost" 4
"Methaqualone" "meth a(') kwa lone" 5
"Methetoin" "meth(') e toyn" 4
"Metoquizine" "me toe(') kwi zeen" 5
"Modecainide" "moe de kane(') ide" 5
"Monoctanoin" "mon ok(') ta noyn" 5
"Nequinate" "ne kwin(') ate" 4
"Neutramycin" "nyoo tra mye(') sin" 5
"Nifurquinazol" "nye fyoor kwin(') a zole" 6
"Nitrofurantoin" "nye troe fyoor an(') toyn" 6
"Orgotein" "or(') goe teen" 4
"Oxamniquine" "ox am(') ni kwin" 5
"Oxethazaine" "ox eth(') a zane" 5
"Oxyquinoline" "ox i kwin(') oh leen" 6
"Pamaqueside" "pa ma(') kwe side" 5
"Parathyroid" "par a thye(') roid" 5
"Paulomycin" "pau loe mye(') sin" 5
"Pegorgotein" "peg or(') go tein" 5
"Phenolphthalein" "fee nole thay(') leen" 5
"Phenolsulfonphthalein" "fee nole sul fon thay(') leen" 7
"Phenprocoumon" "fen proe koo(') mon" 5
"Phenytoin" "fen(') i toyn" 4
"Pirquinozol" "per kwin(') oh zole" 5
"Plauracin" "plaw(') ra sin" 4
"Poligeenan" "pol i gee(') nan" 5
"Praziquantel" "pray zi kwon(') tel" 5
"Prilocaine" "pril(') oh kane" 4
"Proquazone" "proe(') kwa zone" 4
"Proquinolate" "proe kwin(') oh late" 5
"Pyrrocaine" "peer(') oh kane" 4
"Quazepam" "kway(') ze pam" 4
"Quazinone" "kway(') zi none" 4
"Quazodine" "kway(') zoe deen" 4
"Quazolast" "kway zole(') ast" 4
"Quilostigmine" "kwil oh stig(') meen" 5
"Quinaprilat" "kwin(') a pril at" 5
"Quinbolone" "kwin(') boe lone" 4
"Quinestrol" "kwin es(') trole" 4
"Quinethazone" "kwin eth(') a zone" 5
"Quinetolate" "kwin et(') oh late" 5
"Quinfamide" "kwin(') fa mide" 4
"Quingestrone" "kwin jes(') trone" 4
"Quinupristin" "kwi nyoo(') pris tin" 5
"Risocaine" "rize(') oh kane" 4
"Rodocaine" "roe(') doe kane" 4
"Roquinimex" "roe kwin(') i mex" 5
"Sennosides" "sen(') oh sides" 4
"Squalane" "skwah(') lane" 3
"Sulfaquinoxaline" "sul fa kwin ox(') a leen" 7
"Sutilains" "soo(') ti lains" 4
"Tebuquine" "te(') bu kwin" 4
"Teceleukin" "te see loo(') kin" 5
"Teicoplanin" "tye koe plan(') in" 5
"Tetracaine" "tet(') ra kane" 4
"Tetroquinone" "te troe kwi none(')" 5
"Thioguanine" "thye oh gwah(') neen" 5
"Thyroid" "thye(') roid" 3
"Tiqueside" "tye(') kwe side" 4
"Tocainide" "toe kay(') nide" 4
"Toquizine" "toe(') kwi zeen" 4
"Transcainide" "trans kane(') ide" 4
"Tretinoin" "tret(') i noyn" 4
"Trichloromonofluoromethane" "trye klor oh mon oh flure oh meth(') ane" 10
"Tricitrates" "trye si(') trates" 4
"Trikates" "trye(') kates" 3
"Trisulfapyrimidines" "trye sul fa peer i(') mi deens" 8
"Virginiamycin" "vir jin ya mye(') sin" 6
"Zileuton" "zye loo(') ton" 4
"Zinc-Eugenol" "yoo(') je nole" 5
"Zucapsaicin" "zoo kap say(') sin" 5
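
The rows above can be scored mechanically, since each pronunciation is
written as space-separated syllables with (') marking the stressed one.
The following is a small sketch of that check, assuming the three-column
layout shown in the table; the names reference_syllables and overcounts
are hypothetical, and only three rows are copied in as sample data.

```python
def reference_syllables(pronunciation):
    """Count syllables in a respelling such as "a kay(') sha"."""
    # Remove the stress marker, then count the space-separated syllables.
    return len(pronunciation.replace("(')", "").split())

# Sample rows copied from the listing: (word, pronunciation, predicted count).
rows = [
    ("Acacia", "a kay(') sha", 4),
    ("Caffeine", "kaf(') een", 3),
    ("Quazepam", "kway(') ze pam", 4),
]

def overcounts(rows):
    """Return (word, predicted minus reference) for each row."""
    return [(word, predicted - reference_syllables(pron))
            for word, pron, predicted in rows]

# Accuracy implied by the listing as archived here: 142 error rows
# out of the 2063 names tested.
accuracy = (2063 - 142) / 2063
```

For the three sample rows the prediction is one syllable too high in each
case, and the implied accuracy works out to about 93.1%, in line with the
93% reported.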

Bruce Lambert, PhD
Department of Pharmacy Administration
University of Illinois at Chicago
833 S. Wood St. (M/C 871)
Chicago, IL 60612-7231

phone: 312-996-2411
fax: 312-996-0868