Corpora: Capacity of a namespace

Bruce L. Lambert (lambertb@uic.edu)
Tue, 23 Feb 1999 17:46:17 -0600

Hi Folks,

I wonder if anyone out there could help shed light on the following question:

Given the 26-letter English alphabet and a fixed name length L, how
many phonologically legal, pronounceable names of that length can be
constructed?

This question, as usual, is motivated by my interest in minimizing
confusion between drug names. There are, for example, about 33,000
trademark drug names registered in the US. The modal name has 8 characters
in it. How many more names of a given length can 'fit' in the namespace
before it reaches capacity?

The number I'm looking for is the theoretical upper limit. Obviously, long
before the namespace was completely 'filled', the rate of confusion would
become unacceptably high.
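
To make the question concrete, here is a rough sketch of the kind of
count I have in mind. It is only an illustration, not an answer: the
unconstrained count 26**L is a loose ceiling, and the "pronounceability"
rule below (no run of more than two consecutive vowels or two consecutive
consonants, with 'y' treated as a vowel) is a crude stand-in I am assuming
for the sake of the example, not a real model of English phonotactics.
The function names are hypothetical.

```python
# Assumption: 'y' is counted as a vowel; this is a simplification.
VOWELS = set("aeiouy")
N_VOWELS = len(VOWELS)            # 6
N_CONSONANTS = 26 - N_VOWELS      # 20


def unconstrained_count(length):
    """All strings of the given length over 26 letters (loose ceiling)."""
    return 26 ** length


def toy_pronounceable_count(length, max_run=2):
    """Count strings with no run of more than max_run consecutive vowels
    and no run of more than max_run consecutive consonants.

    This run-length rule is a toy constraint, not real phonotactics.
    """
    if length == 0:
        return 1
    # v[k] / c[k]: number of strings of the current length that end in a
    # run of exactly k + 1 vowels / consonants.
    v = [0] * max_run
    c = [0] * max_run
    v[0] = N_VOWELS
    c[0] = N_CONSONANTS
    for _ in range(length - 1):
        total_v = sum(v)
        total_c = sum(c)
        # A vowel after any consonant-final string starts a new vowel run;
        # a vowel after a vowel run of length k + 1 extends it to k + 2.
        new_v = [total_c * N_VOWELS] + [v[k] * N_VOWELS for k in range(max_run - 1)]
        new_c = [total_v * N_CONSONANTS] + [c[k] * N_CONSONANTS for k in range(max_run - 1)]
        v, c = new_v, new_c
    return sum(v) + sum(c)


if __name__ == "__main__":
    L = 8  # the modal drug-name length mentioned above
    print("unconstrained:", unconstrained_count(L))
    print("toy constraint:", toy_pronounceable_count(L))
```

A realistic count would replace the run-length rule with actual
constraints on syllable onsets, nuclei, and codas, and would presumably
come out smaller than either figure.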

Thanks,

Bruce

Bruce Lambert, PhD
Department of Pharmacy Administration
University of Illinois at Chicago
833 S. Wood St. (M/C 871)
Chicago, IL 60612-7231

phone: 312-996-2411
fax: 312-996-0868