[Corpora-List] New Release from the LDC

Wed Aug 21 20:28:28 UTC 2002

       *   West Point Arabic Speech Corpus   *

The Linguistic Data Consortium (LDC) is pleased to announce the
availability of the West Point Arabic Speech Corpus.

This corpus contains speech data that was collected and processed
by members of the Department of Foreign Languages at the United
States Military Academy at West Point and the Center For Technology
Enhanced Language Learning (CTELL), as part of an effort called
'Project Santiago'.  The original purpose of this corpus was to
train acoustic models for automatic speech recognition that could
be used as an aid in teaching Arabic to West Point cadets.

The West Point Arabic Speech Corpus consists of 8,516 speech files,
totaling 1.7 gigabytes (11.42 hours) of speech data. Each speech
file represents one person reciting one prompt from one of four
prompt scripts.

For further information, including online documentation, please visit:

http://www.ldc.upenn.edu/Catalog/LDC2002S02.html

Institutions that are members of the LDC for the 2002 Membership
Year may receive this corpus free of charge.
Nonmembers may purchase this publication for $600.

			   *

If you need additional information before placing your order, or
would like to inquire about membership in the LDC, please send email
to <ldc at ldc.upenn.edu> or call (215) 573-1275.

--------------------------------------------------------------------
Linguistic Data Consortium          Phone: (215) 573-1275
3615 Market Street                  Fax:   (215) 573-2175
Suite 200                           email: ldc at ldc.upenn.edu
Philadelphia, PA 19104-2608         www: http://www.ldc.upenn.edu