Arabic-L:LING:more LDC resources
Dilworth Parkinson
dilworth_parkinson at BYU.EDU
Wed Mar 21 18:25:42 UTC 2007
------------------------------------------------------------------------
Arabic-L: Wed 21 Mar 2007
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
unsubscribe arabic-l ]
-------------------------Directory------------------------------------
1) Subject:more LDC resources
-------------------------Messages-----------------------------------
1)
Date: 21 Mar 2007
From:ldc at ldc.upenn.edu
Subject:more LDC resources
(1) Fisher Levantine Arabic Conversational Telephone Speech contains
279 conversations totaling 45 hours of speech. Levantine Arabic is
spoken along the eastern Mediterranean coast from Anatolia to the
Sinai Peninsula and encompasses the local dialects of Lebanon, Syria,
Jordan and Palestine. There are two distinct varieties: Northern, centered
around Syria and Lebanon; and Southern, spoken in Jordan and
Palestine. The majority of speakers in Fisher Levantine Arabic
Conversational Telephone Speech are from Jordan, Lebanon, and Palestine.
The conversations in this corpus are a subset of the conversations in
Levantine Arabic QT Training Data Set 5, Speech, LDC2006S29. The
individual audio files are in NIST SPHERE format. The corresponding
transcripts may be found in Fisher Levantine Arabic Conversational
Telephone Speech, Transcripts, LDC2007T04. Fisher Levantine Arabic
Conversational Telephone Speech is distributed on one DVD-ROM.
2007 Subscription Members will automatically receive two copies of
this corpus. 2007 Standard Members may request a copy as part of
their 16 free membership corpora. Nonmembers may license this data
for US$1000.
*
(2) Fisher Levantine Arabic Conversational Telephone Speech,
Transcripts contains the transcripts for the 279 telephone
conversations in Fisher Levantine Arabic Conversational Telephone
Speech, LDC2007S02. The transcripts were created with "green" and
"yellow" layers using LDC's Arabic Multi-Dialectal Transcription Tool
(AMADAT). The green layer seeks to anchor dialectal forms to similar
or related Modern Standard Arabic orthography-based forms. The yellow
layer is a more careful and detailed transcription that adds
functionally necessary vowels and marks important sociolinguistic
variations and morphophonemic features.
The green layer transcripts in this corpus are a subset of the
transcripts contained in Levantine Arabic QT Training Data Set 5,
Transcripts, LDC2006T07. The yellow layer transcription was added in
this release. Fisher Levantine Arabic Conversational Telephone
Speech, Transcripts is distributed via web download.
2007 Subscription Members will automatically receive two copies of
this corpus on disc. 2007 Standard Members may request a copy as part
of their 16 free membership corpora. Nonmembers may license this data
for US$3000.
------------------------------------------------------------------------
--
End of Arabic-L: 21 Mar 2007