Re: TIMIT U.S. English speech corpus

Bill Fisher (billf@jaguar.ncsl.nist.gov)
Thu, 6 Mar 97 14:15:07 EST

On Mar 6, 2:06pm, James Salsman wrote:
> Subject: TIMIT U.S. English speech corpus
> I think I'm interested in purchasing a copy of the TIMIT CD-ROM
> speech corpus, but I'm worried about all the spelling errors
> (there appear to be hundreds at first glance.)
>
> What is the cost of that CD-ROM, and does anyone have a URL
> pointer to its files for inspection please?
>

Several versions of the TIMIT corpus are available from
the Linguistic Data Consortium (LDC); a pointer to the catalog
entry for the original and basic one is

http://www.ldc.upenn.edu/ldc/catalog/html/speech_html/arpa.html

Their latest price is $100 for nonmembers.

It's also known as NIST Speech Disc CD1-1.1, and it's also
available from the National Technical Information Service (NTIS).
A set of hardcopy documentation for TIMIT is also available
from NTIS, as NTIS# PB91-100354. I don't know any more about
how to get it from them or what they charge, but their URL is

http://www.ntis.gov/

I don't know of any site that's made the text available;
it's a part of what we sell via LDC and NTIS. I'd be surprised
if there are hundreds of spelling errors, since it's been
scrubbed for years now.

- Bill Fisher / NIST