[Corpora-List] Speech Corpus for Neural Network Training

From: Scott Drellishak (sfd@u.washington.edu)
Date: Sat Jun 26 2004 - 13:47:28 MET DST

    [I posted this recently to the Linguist List, and a colleague suggested I
    ought to try posting it here as well.]

    I am involved in a research project whose goal is to produce a software
    system for the control of electronic devices using continuous variables
    extracted from human speech. Part of this system will be a neural network
    that recognizes various vowels and produces tracks of pitch and formant
    frequencies. Training the neural network will require a large amount of
    data that we're hoping to get from an existing corpus, rather than creating
    it ourselves.

    We are looking for a corpus that contains samples of many speakers producing
    many vowels (preferably in a less reduced register) that also contains
    human-validated pitch and formant (F1, F2, and F3) tracks and, if possible,
    bandwidth information. A corpus that contains more than just vowels is
    fine, since we can discard sections of the samples that do not suit our
    needs.

    If anyone knows of a corpus like this, either freely distributed or
    requiring a fee, I would like to know how to obtain it.

    I will post a summary of the replies that I receive. Thanks in advance for
    your time.

    Scott Drellishak
    University of Washington
    Seattle, WA


