15.2363, Sum: Speech Corpus for Neural Network Training

LINGUIST List linguist at linguistlist.org
Tue Aug 24 14:06:12 UTC 2004


LINGUIST List:  Vol-15-2363. Tue Aug 24 2004. ISSN: 1068-4875.

Subject: 15.2363, Sum: Speech Corpus for Neural Network Training

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Sheila Collberg, U. of Arizona
	Terence Langendoen, U. of Arizona

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Megan Zdrojkowsky <megan at linguistlist.org>
==========================================================================
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.
=================================Directory=================================

1)
Date:  Mon, 23 Aug 2004 21:18:19 -0400 (EDT)
From:  Scott Drellishak <sfd at u.washington.edu>
Subject:  Speech Corpus for Neural Network Training

-------------------------------- Message 1 -------------------------------

A few weeks ago, I posted a request for information about speech
corpora of a particular kind to both the Linguist List (Linguist
15.1895) and the Corpora-List.  This is the (somewhat belated) summary.

I described the corpora we are seeking as follows:

"We are looking for a corpus that contains samples of many speakers
producing many vowels (preferably in a less reduced register) that
also contains human-validated pitch and formant (F1, F2, and F3)
tracks and, if possible, bandwidth information.  A corpus that
contains more than just vowels is fine, since we can discard sections
of the samples that do not suit our needs."

I received five replies:

1)  John Lawler suggested MICASE (Michigan Corpus of Academic
    Spoken English), which is available here:

    http://www.lsa.umich.edu/eli/micase/index.htm

2)  Lesley Carmichael suggested I post my request to the
    Corpora-List.

3)  Jane Edwards pointed me to the Switchboard Transcription
    Project:

    http://www.icsi.berkeley.edu/real/stp/index.html

4)  Susana Sotillo wrote, "At a recent conference (CALICO) I
    saw a demonstration of the Speechcalator (Allen Blackwell
    and associates).  Why don't you write him at
    Carnegie-Mellon."

5)  Linda Bawcom offered an hour and a half of taped
    conversation that she used in her MA research.

Many thanks to everyone who replied.

Scott Drellishak
University of Washington
Seattle, WA

---------------------------------------------------------------------------
LINGUIST List: Vol-15-2363


