New Corpus: "1997 Spanish Broadcast News"
Carlos Subirats Rüggeberg
Carlos.Subirats at uab.es
Wed Jul 15 17:46:58 UTC 1998
INFOLING: Moderated list on Spanish linguistics
http://listserv.rediris.es/archives/infoling.html
Submissions: INFOLING at listserv.rediris.es
Editor: Carlos Subirats Rüggeberg <Carlos.Subirats at uab.es>
Contributors:
Paola Bentivoglio <pbentivo at reacciun.ve>, UCV
Eulalia de Bobes <ebobes at seneca.uab.es>, UAB
Mar Cruz <mcruz at lingua.fil.ub.es>, UB
Emma Martinell <martinell at lingua.fil.ub.es>, UB
____________________________________________________________
New Corpus: "1997 Spanish Broadcast News"
Information provided by:
Observatorio Español de Industrias de la Lengua
<oeil at cervantes.es>
http://www.cervantes.es/oeil/Oeil0.htm
____________________________________________________________
NEW RELEASE from the Linguistic Data Consortium
1997 Spanish Broadcast News (HUB-4NE)
This corpus contains a portion of the acoustic data
designated as the training set for the 1997 DARPA HUB-4
Spanish Benchmark. It contains speech and transcripts of
30 hours of broadcast news from the following sources:
VOA
Univision
Televisa
All acoustic files are in NIST SPHERE format, without
compression. The sample data are 16-bit linear PCM, sampled
at 16 kHz, single channel. Most files contain 30 minutes of
recorded material, and some contain approximately 60 or 120
minutes. This sampling format requires roughly 2 megabytes
(MB) per minute of recording, so file sizes are typically
around 60 MB, with some files ranging up to 120 or 240 MB.
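The size figures above follow from the sampling parameters. A minimal Python sketch of the arithmetic, assuming decimal megabytes and ignoring the SPHERE header (the function names are illustrative, not part of any LDC tool):

```python
# Sampling parameters as stated in the announcement.
SAMPLE_RATE_HZ = 16_000   # 16 kHz
BYTES_PER_SAMPLE = 2      # 16-bit linear PCM
CHANNELS = 1              # single channel


def pcm_bytes(minutes: float) -> int:
    """Bytes of uncompressed sample data for a recording of this length."""
    return int(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * 60 * minutes)


def megabytes(n_bytes: int) -> float:
    """Convert bytes to decimal megabytes (assumption: 1 MB = 10**6 bytes)."""
    return n_bytes / 1_000_000


for minutes in (1, 30, 60, 120):
    print(f"{minutes:>3} min: {megabytes(pcm_bytes(minutes)):6.1f} MB")
```

One minute comes to 1,920,000 bytes, which is the "roughly 2 MB" quoted above; 30, 60 and 120 minutes give about 57.6, 115.2 and 230.4 MB, in line with the rounded 60, 120 and 240 MB figures.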
The transcripts are in SGML format, using the same
markup conventions that have been applied to the other 1997
Broadcast News speech corpora (in English and Mandarin).
They are distributed by FTP rather than on the CD-ROMs that
carry the speech data.
Because of restrictions imposed by the copyright
holders, this corpus is available to 1998 LDC members only.
If you would like to order a copy of this corpus,
please email your request to:
ldc at unagi.cis.upenn.edu
If you need additional information before placing your
order, or would like to inquire about membership in the
LDC, please send email or call (215)898-0464.
Further information about the LDC and its available corpora
can be accessed on the Linguistic Data Consortium WWW Home
Page at URL:
http://www.ldc.upenn.edu
Information is also available via FTP at ftp.cis.upenn.edu
under pub/ldc; for FTP access, please use "anonymous" as
your login name, and give your email address when asked for
a password.
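The anonymous login described above could be scripted along these lines. This is only a sketch using Python's standard ftplib; the helper name list_ldc_files is hypothetical, and the host and directory are taken from the announcement and may no longer be reachable:

```python
from ftplib import FTP

# Host and directory as given in the announcement (may be out of date).
LDC_FTP_HOST = "ftp.cis.upenn.edu"
LDC_FTP_DIR = "pub/ldc"


def list_ldc_files(ftp: FTP, email: str) -> list:
    """Log in anonymously, change to the LDC directory, and list its entries.

    `ftp` is an already-connected ftplib.FTP object (hypothetical helper;
    the email address is sent as the password, as the announcement asks).
    """
    ftp.login(user="anonymous", passwd=email)
    ftp.cwd(LDC_FTP_DIR)
    return ftp.nlst()


# Usage (needs network access, so it is left commented out):
# with FTP(LDC_FTP_HOST) as ftp:
#     for name in list_ldc_files(ftp, "you@example.org"):
#         print(name)
```

Passing the connection in as an argument keeps the login steps separate from the network setup, so they can be checked against a stand-in object.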
----------------------------------------------------
Formats for submitting information to INFOLING:
send the command INFO INFOLING
to LISTSERV at LISTSERV.REDIRIS.ES
----------------------------------------------------