<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#ffffff" text="#000000">
<div class="moz-text-html" lang="x-western">
<div class="moz-text-html" lang="x-western">
<div align="center"> </div>
<p class="MsoNormal" align="center"><i style="">In this
newsletter:</i></p>
<div align="center"> </div>
<p class="MsoNormal" align="center"><b>- <a href="#scholar">Spring
2012 LDC Data Scholarship Program - deadline approaching!</a>
-</b></p>
<div align="center"> </div>
<p class="MsoNormal" align="center"><b style="">- </b><b><a
href="#lsa">LDC Exhibiting at LSA 2012 Annual Meeting</a></b><b
style=""> -<br>
</b></p>
<div align="center"> </div>
<p class="MsoNormal" align="center"><b style="">- </b><b><a
href="#workshop">LDC Hosts Satellite Workshop at LSA 2012</a>
-</b></p>
<div align="center"> </div>
<div align="center"> </div>
<p class="MsoNormal" align="center"><i style="">New
publications:</i></p>
<div align="center"> </div>
<p class="MsoNormal" align="center">LDC2011S10<br>
<b style="">- </b><b><a href="#2006sre">2006 NIST Speaker
Recognition Evaluation Test Set Part 1</a></b><b style="">
-<br>
</b></p>
<div align="center"> </div>
<p class="MsoNormal" align="center">LDC2011S11<br>
<b>- <a href="#2008sre">2008 NIST Speaker Recognition
Evaluation Supplemental Set</a> -</b></p>
<div class="MsoNormal" style="text-align: center;"
align="center">
<hr width="100%" align="center" size="2"></div>
<p class="MsoNormal" align="center"> <a name="scholar"></a><b
style="">Spring 2012 LDC Data Scholarship Program - deadline
fast approaching!</b></p>
<p class="MsoNormal">The deadline for the Spring 2012 LDC Data
Scholarship Program is less than a month away! Applications
are being accepted through January 15, 2012. The LDC Data
Scholarship program provides university students with access
to LDC data at no cost. This program is open to students
pursuing both undergraduate and graduate studies in an
accredited college or university. LDC Data Scholarships are
not restricted to any particular field of study; however,
students must demonstrate a well-developed research agenda and
a bona fide inability to pay. <br>
<br>
Students must submit an application consisting of a data use
proposal and a letter of support from their adviser. For
further information on application materials and
program rules, please visit the <a
href="http://www.ldc.upenn.edu/About/scholarships.html">LDC
Data Scholarship</a> page. </p>
<p class="MsoNormal">Students can email their applications to
the <a href="mailto:datascholarships@ldc.upenn.edu">LDC Data
Scholarship program</a>. Decisions will be sent by email
from the same address.<br>
<br>
</p>
<div align="center"><a name="lsa"></a><b style="">LDC Exhibiting
at LSA 2012 Annual Meeting</b><br style="">
</div>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;
line-height: normal;">LDC looks forward to mingling with
linguists and language specialists when we exhibit at the 86<sup>th</sup>
Annual Meeting of the Linguistic Society of America (LSA). The
main conference will be held January 5-8, 2012, at the <a
href="http://www.tourhiltonportland.com/">Portland, OR
Hilton and Executive Tower</a>, and the exhibit hall will be
open January 6-8 (limited hours on Sunday the 8<sup>th</sup>).
Please stop by our display for news on what 2012 will hold for
LDC and to receive some of our conference giveaways.</p>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;
line-height: normal;">LSA 2012 will feature the following
plenary talks:<br>
</p>
<blockquote>
<ul>
<li>Patrice Speeter Beddor (University of Michigan):
"The Dynamics of Speech Perception: Constancy, Variation,
and Change"</li>
<li>Dan Jurafsky (Stanford University): "Computing
Meaning: Learning and Extracting Meaning from Text"</li>
<li>Ted Supalla (University of Rochester):
"Rethinking the Emergence of Grammatical Structure in
Signed Languages: New Evidence from Variation and
Historical Change in American Sign Language"</li>
</ul>
</blockquote>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;
line-height: normal;">For further information, visit the <a
href="http://www.lsadc.org/info/meet-annual.cfm">LSA Annual
Meeting website</a>. To learn more about
LDC&rsquo;s conference preparations, please &lsquo;like&rsquo; our <a
href="http://www.facebook.com/ldc.upenn">Facebook</a> page.</p>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;
line-height: normal;">We hope to see you there!</p>
<p class="MsoNormal"><b style=""> </b><br>
</p>
<div align="center"><b style="">New Publications<br>
<br>
</b></div>
<p class="MsoNormal"><a name="2006sre"></a>(1) <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011S10">2006
NIST Speaker Recognition Evaluation Test Set Part 1</a> was
developed by LDC and the National Institute of Standards and
Technology (NIST). It contains 437 hours of conversational
telephone and microphone speech in English, Arabic, Bengali,
Chinese, Farsi, Hindi, Korean, Russian, Spanish, Thai and Urdu,
along with associated English transcripts, used as test data in the
NIST-sponsored <a
href="http://www.itl.nist.gov/iad/mig/tests/spk/2006/index.html">2006
Speaker Recognition Evaluation (SRE)</a>. </p>
<p class="MsoNormal">The ongoing series of yearly SRE
evaluations conducted by NIST is intended to be of interest
to researchers working on the general problem of
text-independent speaker recognition. The task of the 2006 SRE
was speaker detection, that is, determining
whether a specified speaker is speaking during a given segment
of conversational telephone speech. The task was divided into
15 distinct tests, each involving one of five training
conditions and one of four test conditions. Further
information about the test conditions and additional
documentation is available at the <a
href="http://www.itl.nist.gov/iad/mig/tests/spk/2006/index.html">NIST
web site for the 2006 SRE</a> and within the <a
href="https://secure.ldc.upenn.edu/intranet/docs/LDC2011S10/sre-06_evalplan-v9.pdf">2006
SRE
Evaluation Plan</a>.</p>
<p class="MsoNormal">The speech data in this release was
collected by LDC as part of the <a
href="http://projects.ldc.upenn.edu/Mixer/">Mixer</a>
project, in particular Mixer Phases 1, 2 and 3. The Mixer
project supports the development of robust speaker recognition
technology by providing carefully collected and audited speech
from a large pool of speakers recorded simultaneously across
numerous microphones and in different communicative situations
and/or in multiple languages. The data is mostly English
speech, but includes some speech in Arabic, Bengali, Chinese,
Farsi, Hindi, Korean, Russian, Spanish, Thai and Urdu.</p>
<p class="MsoNormal">The telephone speech segments are
multi-channel data collected simultaneously from a number of
auxiliary microphones. The files are organized into four
types: two-channel excerpts of approximately 10 seconds,
two-channel conversations of approximately 5 minutes,
summed-channel conversations also of approximately 5 minutes,
and two-channel conversations with the usual telephone speech
replaced by auxiliary microphone data in the putative target
speaker channel. The auxiliary microphone conversations are
also approximately 5 minutes long.</p>
<p class="MsoNormal">English language transcripts in .ctm format
were produced using an automatic speech recognition (ASR)
system.</p>
<p class="MsoNormal"><br>
<br>
</p>
<div align="center"><b style="">*</b></div>
<p class="MsoNormal"><a name="2008sre"></a>(2) <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011S11">2008
NIST Speaker Recognition Evaluation Supplemental Set</a> was
developed by LDC and the National Institute of Standards and
Technology (NIST) and contains additional data distributed
after the main 2008 Speaker Recognition Evaluation (SRE).
Specifically, the corpus consists of 770 hours of English
microphone speech along with transcripts and other materials
used as supplemental data in the <a
href="http://www.itl.nist.gov/iad/mig/tests/spk/2008/index.html">2008
NIST Speaker Recognition Evaluation (SRE)</a> and in a
follow-up evaluation to SRE08. </p>
<p class="MsoNormal">The 2008 evaluation was distinguished from
prior evaluations by including not only conversational
telephone speech data but also conversational speech data of
comparable duration recorded over a microphone channel
in an interview scenario. The follow-up evaluation
focused on speaker detection in conversational
interview-type speech and was designed to measure the
performance of SRE08 systems in previously unexposed test
segment channel conditions.</p>
<p class="MsoNormal">LDC previously released the main 2008 NIST
SRE data in three parts as <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011S05">2008
NIST
Speaker Recognition Evaluation Training Set Part 1
LDC2011S05</a>, <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011S07">2008
NIST
Speaker Recognition Evaluation Training Set Part 2
LDC2011S07</a> and <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011S08">2008
NIST
Speaker Recognition Evaluation Test Set LDC2011S08</a>.</p>
<p class="MsoNormal">The speech data in this release was
collected in 2007 by LDC at its <a
href="http://www.ldc.upenn.edu/About/facilities.shtml">Human
Subjects Data Collection Laboratories</a> in Philadelphia
and by the <a href="http://www.icsi.berkeley.edu/">International
Computer Science Institute</a> (ICSI) at the University of
California, Berkeley. This collection was part of the <a
href="http://projects.ldc.upenn.edu/Mixer/">Mixer 5</a>
project, which was designed to support the development of
robust speaker recognition technology by providing carefully
collected and audited speech from a large pool of speakers
recorded simultaneously across numerous microphones and in
different communicative situations and/or in multiple
languages. Mixer participants were native English and
bilingual English speakers. The microphone speech in this
corpus is in English and consists of interview excerpts of
approximately 3 minutes and 30 minutes. </p>
<p class="MsoNormal">This supplemental data is divided into four
parts, which provide:</p>
<ul>
<li>new training data distributed to
2008 SRE participants</li>
<li>additional data distributed to
participants in the 2008 SRE follow-up evaluation</li>
<li>interviewer channel files for the
2008 SRE main test (released after the evaluations)</li>
<li>supplemental training data
(released after the evaluations)</li>
</ul>
<p class="MsoNormal">English language transcripts in .ctm format
were produced using an automatic speech recognition (ASR)
system and are included for some, but not all, speech data.</p>
<p class="MsoNormal"><br>
</p>
<hr width="100%" size="2">
<pre class="moz-signature" cols="72">Ilya Ahtaridis
Membership Coordinator
--------------------------------------------------------------------
Linguistic Data Consortium Phone: 1 (215) 573-1275
University of Pennsylvania Fax: 1 (215) 573-2175
3600 Market St., Suite 810 <a class="moz-txt-link-abbreviated" href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a>
Philadelphia, PA 19104 USA <a class="moz-txt-link-freetext" href="http://www.ldc.upenn.edu">http://www.ldc.upenn.edu</a>
</pre>
</div>
</div>
</body>
</html>