7.898, FYI: New Release from the Linguistic Data Consortium

The Linguist List linguist at tam2000.tamu.edu
Sat Jun 15 02:22:40 UTC 1996


---------------------------------------------------------------------------
LINGUIST List:  Vol-7-898. Fri Jun 14 1996. ISSN: 1068-4875. Lines:  75
 
Subject: 7.898, FYI: New Release from the Linguistic Data Consortium
 
Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at tam2000.tamu.edu>
            Helen Dry: Eastern Michigan U. <hdry at emunix.emich.edu> (On Leave)
            T. Daniel Seely: Eastern Michigan U. <dseely at emunix.emich.edu>
 
Associate Editor:  Ljuba Veselinova <lveselin at emunix.emich.edu>
Assistant Editors: Ron Reck <rreck at emunix.emich.edu>
                   Ann Dizdar <dizdar at tam2000.tamu.edu>
                   Annemarie Valdez <avaldez at emunix.emich.edu>
 
Software development: John H. Remmers <remmers at emunix.emich.edu>
 
Editor for this issue: aristar at tam2000.tamu.edu (Anthony Rodrigues Aristar)
 
---------------------------------Directory-----------------------------------
1)
Date:  Fri, 14 Jun 1996 16:59:38 EDT
From:  ldc at unagi.cis.upenn.edu (LDC Office)
Subject:  New Release from the LDC
 
---------------------------------Messages------------------------------------
1)
Date:  Fri, 14 Jun 1996 16:59:38 EDT
From:  ldc at unagi.cis.upenn.edu (LDC Office)
Subject:  New Release from the LDC
 
               Announcing a NEW RELEASE from the
                   LINGUISTIC DATA CONSORTIUM
 
	 Continuous Speech Recognition Corpus-IV (Hub-3)
 
This set of CD-ROMs contains all of the speech data provided to sites
participating in the DARPA CSR November 1995 Hub-3 Multi-Microphone
tests.  The data consists of digitized waveforms collected with eight
different microphones simultaneously from 40 subjects reading
15-sentence articles drawn from various North American business news
publications.  The data is partitioned into development-test and
evaluation-test sets.  The test sets were collected with different
subjects, prompts, and microphones.  No training data was collected
for this corpus since a substantial amount of NAB acoustic training
data was already available.  Index files have been included that
specify the exact subset of the evaluation test recordings which were
used in the November 1995 tests.  The software NIST used to process
and score the output of the test systems is also included.
 
The data is organized as follows:
 
CD26-3 Development-Test Data-Location 1, Adaptation and NAB recordings,
			     Subjects:703-705, 707-70a, 70c, 70f, 70g
 
CD26-4 Development-Test Data-Location 2, NAB recordings,
			     Subjects:70k, 70m, 70o, 70q-70s, 70u-70w
 
CD26-5 Development-Test Data-Location 2, Adaptation recordings,
			     Subjects:70k, 70m-70o, 70q-70s, 70u-70w
 
CD26-3 Development-Test Data-NAB recordings,
			     Subjects:710-71j
 
Because of restrictions imposed by the copyright holders of much of
the NAB text, this corpus is available to 1996 LDC members only.
Members who wish to receive this corpus must sign the CSR-IV
agreement.  This agreement is available on the Linguistic Data
Consortium WWW Home Page at URL http://www.cis.upenn.edu/~ldc.
 
If you would like to order a copy of this corpus, please email your
request to ldc at unagi.cis.upenn.edu. If you need additional information
before placing your order, or would like to inquire about membership
in the LDC, please send email or call (215) 898-0464.
------------------------------------------------------------------------
LINGUIST List: Vol-7-898.


