Corpora: ELRA news

Magali Duclaux duclaux at elda.fr
Mon May 14 08:23:22 UTC 2001


[ We apologise for the duplicate posting of this announcement ]

***************************************************************************
ELRA
European Language Resources Association
ELRA News
****************************************************************************
We are happy to announce a new resource available via ELRA:

ELRA S0106      Dutch SpeechDat(II) MDB-250

A description of this database is given below.

The Dutch SpeechDat(II) MDB-250 comprises 250 Dutch speakers (125 males,
125 females) recorded over the Dutch mobile telephone network. The
recordings were made at SPEX, the Netherlands, and the recording
application was developed and run with Show 'N Tel. This database is
partitioned into 5 CDs. The speech databases made within the SpeechDat(II)
project were validated by SPEX to assess their compliance with the
SpeechDat format and content specifications.
Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted
utterance is stored in a separate file. Each signal file is accompanied by
an ASCII SAM label file which contains the relevant descriptive information.
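
As an illustration of the storage format described above, the following
minimal Python sketch expands 8-bit A-law codes to 16-bit linear samples
using the standard G.711 A-law expansion. It is not part of the database
or its tools. The helper names (alaw_to_linear, decode_alaw, read_signal)
are hypothetical, and read_signal assumes that a signal file holds raw
A-law bytes with no header, which this announcement does not state.

```python
from pathlib import Path

SAMPLE_RATE_HZ = 8000  # sampling rate stated in the announcement


def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit G.711 A-law code to a signed 16-bit linear sample."""
    code ^= 0x55                      # undo the even-bit inversion used by A-law
    magnitude = (code & 0x0F) << 4    # 4-bit mantissa
    segment = (code & 0x70) >> 4      # 3-bit segment (exponent)
    if segment == 0:
        magnitude += 8
    else:
        magnitude += 0x108
        if segment > 1:
            magnitude <<= segment - 1
    # In A-law, a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(data: bytes) -> list[int]:
    """Decode a sequence of A-law bytes into linear samples."""
    return [alaw_to_linear(b) for b in data]


def read_signal(path: str) -> tuple[list[int], float]:
    """Read one signal file and return (samples, duration in seconds).

    Assumption: the file is headerless raw A-law, one byte per sample.
    """
    samples = decode_alaw(Path(path).read_bytes())
    return samples, len(samples) / SAMPLE_RATE_HZ
```

With one byte per sample at 8 kHz, the duration of an utterance in seconds
is simply the file size in bytes divided by 8000.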

The following items were recorded:

8 application words (2 optional);
2 isolated digits;
1 sequence of 10 isolated digits;
3 connected digits: 1 telephone number (1-10 digits), 1 credit card
  number (1-16 digits), 1 digit PIN code (6 digits);
3 dates: 1 spontaneous date, 1 date, 1 relative date expression;
1 embedded application word;
3 spelled words: 1 forename (spontaneous), 1 city name, 1 word;
1 currency money amount;
1 natural number;
6 directory assistance names: 1 forename (spontaneous), 1 city of birth,
  1 most frequent city, 1 city name, 1 company name, 1 forename surname;
2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no"
  question;
9 phonetically rich sentences;
2 time phrases: 1 time of day (spontaneous), 1 time phrase;
4 phonetically rich words.

The following age distribution has been obtained: 5 speakers are under 16,
90 are between 16 and 30, 89 between 31 and 45, 56 between 46 and 60, and
10 are over 60. The lexicon was created following the guidelines in
SD1.3.1 v4.3.

=====================================
For further information, please contact:
ELRA/ELDA
55-57 rue Brillat-Savarin
F-75013 Paris, France
Tel +33 01 43 13 33 33
Fax +33 01 43 13 33 30
E-mail mapelli at elda.fr
or visit the online catalogue on our Web site:
http://www.icp.grenet.fr/ELRA/home.html
or http://www.elda.fr
=====================================


