[Corpora-List] Speech Corpus for Neural Network Training
Scott Drellishak
sfd at u.washington.edu
Sat Jun 26 11:47:28 UTC 2004
[I posted this recently to the Linguist List, and a colleague suggested I
ought to try posting it here as well.]
I am involved in a research project whose goal is to produce a software
system for the control of electronic devices using continuous variables
extracted from human speech. Part of this system will be a neural network
that recognizes various vowels and produces tracks of pitch and formant
frequencies. Training the neural network will require a large amount of
data that we're hoping to get from an existing corpus, rather than creating
it ourselves.
We are looking for a corpus that contains samples of many speakers producing
many vowels (preferably in a less reduced register) and that also includes
human-validated pitch and formant (F1, F2, and F3) tracks and, if possible,
bandwidth information. A corpus that contains more than just vowels is
fine, since we can discard sections of the samples that do not suit our
needs.
If anyone knows of a corpus like this, whether freely distributed or
available for a fee, I would like to know how to obtain it.
I will post a summary of the replies that I receive. Thanks in advance for
your time.
Scott Drellishak
University of Washington
Seattle, WA