<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<div align="center">The Linguistic Data
Consortium (LDC) would like to
announce the availability of
three new publications.<br>
<br>
LDC2007S02<br>
<a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007S02"><b>Fisher
Levantine Arabic Conversational Telephone Speech</b></a><br>
<br>
LDC2007T04<br>
<a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007T04"><b>Fisher
Levantine Arabic Conversational Telephone Speech, Transcripts</b></a><br>
<br>
LDC2007V01<br>
<a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007V01"><b>TRECVID
2005 Keyframes &amp; Transcripts</b></a><br>
</div>
<br>
<hr size="2" width="100%">
<br>
(1) <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007S02">Fisher
Levantine Arabic Conversational Telephone Speech</a> contains 279
conversations totaling 45 hours of speech. Levantine Arabic is spoken
along the eastern Mediterranean coast from Anatolia to the Sinai
Peninsula and encompasses the local dialects of Lebanon, Syria, Jordan,
and Palestine. There are two distinct varieties: Northern, centered on
Syria and Lebanon; and Southern, spoken in Jordan and Palestine. The
majority of speakers in Fisher Levantine Arabic Conversational
Telephone Speech are from Jordan, Lebanon, and Palestine.<br>
<br>
The conversations in this corpus are a subset of the conversations in <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006S29">Levantine
Arabic QT Training Data Set 5, Speech</a>, LDC2006S29. The individual
audio files are in NIST SPHERE format. The corresponding transcripts
may
be found in <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007T04">Fisher
Levantine Arabic Conversational Telephone Speech, Transcripts</a>,
LDC2007T04. <br>
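<br>
For those working directly with the audio, NIST SPHERE files begin with
a plain-ASCII header ahead of the waveform data. The Python sketch below
shows one way to read such a header; it assumes the standard layout (a
NIST_1A magic line, a header-size line, then name/type/value fields
terminated by end_head), and the file name in the usage comment is
hypothetical, not a file from this corpus.<br>
<pre>
# Read the ASCII header of a NIST SPHERE file (illustrative sketch).
def read_sphere_header(path):
    with open(path, "rb") as f:
        magic = f.readline().strip()            # expected: b"NIST_1A"
        assert magic == b"NIST_1A", "not a SPHERE file"
        header_size = int(f.readline())         # total header bytes, e.g. 1024
        rest = f.read(header_size - f.tell()).decode("ascii", "replace")
        fields = {}
        for line in rest.splitlines():
            parts = line.split(None, 2)         # e.g. "sample_rate -i 8000"
            if parts and parts[0] == "end_head":
                break
            if len(parts) == 3:
                fields[parts[0]] = parts[2]
        return fields

# hdr = read_sphere_header("fla_0001.sph")      # hypothetical file name
# hdr.get("sample_rate"), hdr.get("channel_count")
</pre>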
<div align="center">*<br>
</div>
<br>
(2) <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007T04">Fisher
Levantine Arabic Conversational Telephone Speech, Transcripts</a>
contains the transcripts for the 279 telephone conversations in <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007S02">Fisher
Levantine Arabic Conversational Telephone Speech</a>, LDC2007S02.
The transcripts were created with "green" and "yellow" layers using
LDC's Multi-Dialectal Transcription Tool (AMADAT). The green layer
seeks to anchor dialectal forms to similar or related Modern Standard
Arabic orthography-based forms. The yellow layer is a more careful and
detailed transcription that adds functionally necessary vowels and
marks important sociolinguistic variations and morphophonemic features.
<br>
<br>
The green layer transcripts in this corpus are a subset of the
transcripts contained in <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T07">Levantine
Arabic QT Training Data Set 5, Transcripts</a>, LDC2006T07. The yellow
layer transcription was added in this release. <br>
<br>
<div align="center">*<br>
</div>
<br>
(3) TREC Video Retrieval Evaluation (TRECVID) is sponsored by the
National Institute of Standards and Technology (NIST) to promote
progress in content-based retrieval from digital video via open,
metrics-based evaluation. The keyframes in <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007V01">TRECVID
2005 Keyframes &amp; Transcripts</a> were extracted for use in the NIST
TRECVID 2005 Evaluation. The source data used were Arabic, Chinese
and English language broadcast programming collected in November 2004.<br>
<br>
TRECVID is a laboratory-style evaluation that attempts to model
real-world situations or significant component tasks involved in such
situations. In 2005 there were four main tasks with associated tests: <br>
<br>
<ul>
  <li>shot boundary determination</li>
  <li>low-level feature extraction</li>
  <li>high-level feature extraction</li>
  <li>search (interactive, manual, and automatic)</li>
</ul>
<br>
Shots are fundamental units of video, useful for higher-level
processing. To create the master list of shots, the video was first
segmented; the results of this pass are called subshots. Because the
master shot reference is designed for use in manual assessment, a
second pass over the segmentation was made to create master shots
of at least 2 seconds in length. These master shots are the ones used
in submitting results for the feature and search tasks in the
evaluation. In the second pass, starting at the beginning of each file,
the subshots were aggregated, if necessary, until the current shot was
at least 2 seconds in duration, at which point the aggregation began
anew with the next subshot. <br>
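<br>
For concreteness, here is a minimal sketch of that second-pass rule in
Python. It illustrates the description above rather than reproducing the
evaluation's actual tooling, and it assumes subshots are given as
(start, end) times in seconds, in file order.<br>
<pre>
# Second-pass aggregation: merge consecutive subshots until each
# master shot is at least 2 seconds long (illustrative sketch).
def aggregate_master_shots(subshots, min_dur=2.0):
    masters = []
    current = None
    for start, end in subshots:
        if current is None:
            current = [start, end]
        else:
            current[1] = end  # extend the current shot with this subshot
        if current[1] - current[0] >= min_dur:
            masters.append(tuple(current))
            current = None    # aggregation begins anew with the next subshot
    if current is not None:   # a trailing shot may end up shorter than min_dur
        masters.append(tuple(current))
    return masters

print(aggregate_master_shots([(0.0, 0.8), (0.8, 1.5), (1.5, 3.0), (3.0, 6.0)]))
# [(0.0, 3.0), (3.0, 6.0)]
</pre>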
<br>
The keyframes were selected by going to the middle frame of the shot,
then searching left and right of that frame to locate the
nearest I-frame. That frame became the keyframe and was extracted.
Keyframes have been provided at both the subshot (NRKF) and master shot
(RKF) levels. <br>
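<br>
A sketch of this selection rule follows, with the same caveat that it
illustrates the description rather than the actual extraction tools.
Here is_iframe is a hypothetical predicate standing in for the decoder's
knowledge of which frames are intra-coded (I-frames); searching outward
from the middle guarantees the returned frame is the I-frame nearest the
shot's temporal center.<br>
<pre>
# Pick the I-frame nearest the middle frame of a shot (illustrative sketch).
def select_keyframe(first_frame, last_frame, is_iframe):
    middle = (first_frame + last_frame) // 2
    shot = range(first_frame, last_frame + 1)
    for offset in range(len(shot)):
        # Search outward from the middle, alternating left and right.
        for candidate in (middle - offset, middle + offset):
            if candidate in shot and is_iframe(candidate):
                return candidate
    return None  # no I-frame inside the shot

# Example: pretend every 12th frame is an I-frame (a common GOP size).
print(select_keyframe(100, 160, lambda f: f % 12 == 0))  # prints 132
</pre>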
<br>
<hr size="2" width="100%"><br>
<div align="center"><small><font face="Courier New, Courier, monospace">Ilya
Ahtaridis<br>
Membership Coordinator</font></small><br>
--------------------------------------------------------------------
<font face="Courier New, Courier, monospace"><br>
</font></div>
<div align="center">
<pre class="moz-signature" cols="72"><b>
Linguistic Data Consortium                  Phone: (215) 573-1275
University of Pennsylvania                  Fax: (215) 573-2175
3600 Market St., Suite 810                  <a class="moz-txt-link-abbreviated" href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a>
Philadelphia, PA 19104 USA                  <a class="moz-txt-link-freetext" href="http://www.ldc.upenn.edu">http://www.ldc.upenn.edu</a></b></pre>
</div>
</body>
</html>