<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#ffffff">
<p class="MsoNormal" align="center"><i>In this newsletter:</i><b
style=""><br>
</b></p>
<p class="MsoNormal" align="center"><a href="#scholar"><b style="">Spring
2012 LDC Data Scholarship Program</b></a></p>
<div align="center"> </div>
<p class="MsoNormal" align="center"><a href="#my"><b style="">Invitation
to Join for Membership Year (MY) 2012</b></a></p>
<div align="center"> </div>
<div align="center">
<blockquote><i>New publications:</i><br>
</blockquote>
</div>
<p class="MsoNormal" align="center"><a href="#sre"><b style="">2006
NIST Speaker Recognition Evaluation Training Set</b></a></p>
<div align="center"> </div>
<p class="MsoNormal" align="center"><a href="#vace"><b style="">2006
NIST/USF Evaluation Resources for the VACE Program - Meeting
Data Test Set Part 2</b></a></p>
<div align="center"> </div>
<p class="MsoNormal" align="center"><a href="#gig"><b style="">Chinese
Gigaword Fifth Edition</b></a></p>
<hr width="100%" size="2">
<p class="MsoNormal" align="center"> <a name="scholar"></a><b
style="">Spring 2012 LDC Data Scholarship Program</b></p>
<p class="MsoNormal" align="center"><br>
</p>
<div align="left"> Applications are now being accepted through
January 15, 2012 for the Spring 2012 LDC Data Scholarship
program! The LDC Data Scholarship program provides university
students with access to LDC data at no cost. This
program is open to students pursuing both undergraduate and
graduate studies in an accredited college or university. LDC Data
Scholarships are not restricted to any particular field of study;
however, students must demonstrate a well-developed research
agenda and a bona fide inability to pay. The selection process is
highly competitive. <br>
<br>
The application consists of two parts: <br>
<br>
(1) <b style="">Data Use Proposal</b>. Applicants must submit a
proposal describing their intended use of the data. The proposal
must contain the applicant's name, university, and field of study.
The proposal should state which data the student plans to use and
contain a description of their research project. <br>
<br>
Applicants should consult the <a
href="http://www.ldc.upenn.edu/Catalog/index.jsp">LDC Corpus
Catalog</a> for a complete list of data distributed by LDC. Due
to certain restrictions, a handful of LDC corpora are available
only to members of the Consortium. Applicants are advised to select
at most two data sets; students may apply for additional
data sets during the following cycle once they have finished
processing the initial data sets and have published or presented
work in some juried venue.<br>
<br>
(2) <b style="">Letter of Support</b>. Applicants must submit one
letter of support from their thesis adviser or department chair.
The letter must confirm that the department or university lacks
the funding to pay the full Non-member Fee for the data and verify
the student's need for data.<br>
<br>
For further information on application materials and program
rules, please visit the <a
href="http://www.ldc.upenn.edu/About/scholarships.html">LDC Data
Scholarship</a> page. <br>
<br>
Students can email their applications to the <a
href="mailto:datascholarships@ldc.upenn.edu">LDC Data
Scholarship program</a>. Decisions will be sent by email from
the same address.<br>
<br>
The deadline for the Spring 2012 program cycle is January 15,
2012.<br>
<br>
<br>
<div align="center"><a name="my"></a><b style="">Invitation to
Join for Membership Year (MY) 2012</b><br>
</div>
<p class="MsoNormal"> <br>
<a style="">Membership Year (MY) 2012, our 20th Anniversary
Year, is open for joining! We would like to invite all
current and previous members of LDC to renew their membership
as well as welcome new organizations to join the consortium.</a>
For MY2012, LDC is pleased to maintain membership fees at last
year’s rates. Additionally, LDC will extend discounts on
membership fees to members who keep their membership current and
to those who join early in the year.<br>
<br>
The details of our early renewal discounts for MY2012 are as
follows: </p>
<p class="MsoListParagraphCxSpFirst" style="margin-left: 1in;
text-indent: -0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style="font: 7pt 'Times New Roman';"> </span></span></span>Organizations
who joined for MY2011 will receive a 5% discount when renewing.
This discount will apply throughout 2012, regardless of time of
renewal. MY2011 members renewing before March 1, 2012 will
receive an additional 5% discount, for a total 10% discount off
the membership fee.</p>
<p class="MsoListParagraphCxSpLast" style="margin-left: 1in;
text-indent: -0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style="font: 7pt 'Times New Roman';"> </span></span></span>New members
as well as organizations who did not join for MY2011, but who
held membership in any of the previous MYs (1993-2010), will
also be eligible for a 5% discount provided that they join/renew
before March 1, 2012.</p>
<p class="MsoNormal">The following table provides exact pricing
information. </p>
<p class="MsoNormal"> </p>
<div align="center">
<table class="MsoNormalTable" style="border: 1.5pt double
windowtext;" width="585" border="1" cellpadding="0"
height="270">
<tbody>
<tr style="">
<td colspan="2" style="border: 1pt solid windowtext;
padding: 0in;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"><b style="">MY2012 Fee </b></p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"><b style="">MY2012 Fee<br>
with 5% Discount* </b></p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"><b style="">MY2012 Fee <br>
with 10% Discount** </b></p>
</td>
</tr>
<tr style="">
<td colspan="2" style="border: 1pt solid windowtext;
padding: 0in;">
<p class="MsoNormal"><b style="">Not-for-Profit/US
Government </b></p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
</tr>
<tr style="">
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">Standard </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$2400 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$2280 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$2160 </p>
</td>
</tr>
<tr style="">
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">Subscription </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$3850 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$3658 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$3465 </p>
</td>
</tr>
<tr style="">
<td colspan="2" style="border: 1pt solid windowtext;
padding: 0in;">
<p class="MsoNormal"><b style="">For-Profit </b></p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
</tr>
<tr style="">
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">Standard </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$24000 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$22800 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$21600 </p>
</td>
</tr>
<tr style="">
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal"> </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">Subscription </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$27500 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$26125 </p>
</td>
<td style="border: 1pt solid windowtext; padding: 0.75pt;">
<p class="MsoNormal">US$24750 </p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="MsoNormal"><br>
* For MY2011 Members renewing for MY2012 at any time, and for new
members and any previous-year Members who join or renew before March 1, 2012<br>
<br>
** For MY2011 Members renewing before March 1, 2012<br>
<br>
<br>
Publications for MY2012 are still being planned; here are the
working titles of data sets we intend to provide:</p>
<table class="MsoNormalTable" style="width: 397.5pt;" width="530"
border="1" cellpadding="0">
<tbody>
<tr style="height: 3.75pt;">
<td style="width: 300pt; padding: 1.5pt; height: 3.75pt;"
valign="top" width="400">
<p class="MsoListParagraphCxSpFirst" style="text-indent:
-0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style=""> </span></span></span>ARRAU
1.2 (Anaphor Resolution and Underspecification)</p>
</td>
<td style="width: 300pt; padding: 1.5pt; height: 3.75pt;"
valign="top" width="400">
<p class="MsoListParagraphCxSpLast" style="text-indent:
-0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style=""> </span></span></span>TORGO
Dysarthric Speech</p>
</td>
</tr>
<tr style="height: 3.75pt;">
<td style="width: 300pt; padding: 1.5pt; height: 3.75pt;"
valign="top" width="400">
<p class="MsoListParagraphCxSpFirst" style="text-indent:
-0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style=""> </span></span></span>Arabic
Treebank BN (broadcast news)</p>
</td>
<td style="width: 300pt; padding: 1.5pt; height: 3.75pt;"
valign="top" width="400">
<p class="MsoListParagraphCxSpLast" style="text-indent:
-0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style=""> </span></span></span>GALE
data – all phases and tasks</p>
</td>
</tr>
<tr style="height: 3.75pt;">
<td style="width: 300pt; padding: 1.5pt; height: 3.75pt;"
valign="top" width="400">
<p class="MsoListParagraphCxSpFirst" style="text-indent:
-0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style=""> </span></span></span>Digital
Archive of Southern Speech</p>
</td>
<td style="width: 300pt; padding: 1.5pt; height: 3.75pt;"
valign="top" width="400">
<p class="MsoListParagraphCxSpLast" style="text-indent:
-0.25in;"><span style="font-family: Symbol;"><span
style="">·<span style=""> </span></span></span>Chinese
Dependency Treebank</p>
</td>
</tr>
</tbody>
</table>
<br>
In addition to receiving new publications, current-year members of
LDC also enjoy the benefit of licensing older data at reduced
costs; current-year for-profit members may use most data for
commercial applications.<br>
<br>
</div>
<div align="center"> </div>
<p class="MsoNormal" align="center"><b style="">New Publications</b></p>
<div align="center"> </div>
<p class="MsoNormal" align="center"> </p>
<p class="MsoNormal"><a name="sre"></a>(1) <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011S09">2006
NIST Speaker Recognition Evaluation Training Set</a> was
developed by LDC and NIST (National Institute of Standards and
Technology). It contains 595 hours of conversational telephone
speech in English, Arabic, Bengali, Chinese, Hindi, Korean,
Russian, Thai and Urdu and associated English transcripts used as
training data in the NIST-sponsored <a
href="http://www.itl.nist.gov/iad/mig/tests/spk/2006/index.html">2006
Speaker Recognition Evaluation (SRE)</a>. The ongoing
series of yearly SREs conducted by NIST is intended to
be of interest to researchers working on the general problem of
text-independent speaker recognition. </p>
<p class="MsoNormal">The task of the 2006 SRE was speaker
detection, that is, determining whether a specified speaker is
speaking during a given segment of conversational telephone
speech. The task was divided into 15 distinct tests, each
involving one of five training conditions and one of four test
conditions. Further information about the test conditions and
additional documentation is available at the <a
href="http://www.itl.nist.gov/iad/mig/tests/spk/2006/index.html">NIST
web site for the 2006 SRE</a> and within the <a
href="https://secure.ldc.upenn.edu/intranet/docs/LDC2011S09/sre-06_evalplan-v9.pdf">2006
SRE
Evaluation Plan</a>.</p>
<p class="MsoNormal">The speech data in this release was collected
by LDC as part of the <a
href="http://projects.ldc.upenn.edu/Mixer/">Mixer</a> project,
in particular Mixer Phases 1, 2 and 3. The Mixer project supports
the development of robust speaker recognition technology by
providing carefully collected and audited speech from a large pool
of speakers recorded simultaneously across numerous microphones
and in different communicative situations and/or in multiple
languages. The data is mostly English speech but includes some
speech in the other languages listed above.</p>
<p class="MsoNormal">The telephone speech segments are multi-channel
data collected simultaneously from a number of auxiliary
microphones. The files are organized into three types: two-channel
excerpts of approximately 10 seconds, two-channel conversations of
approximately 5 minutes and summed-channel conversations also of
approximately 5 minutes.</p>
<p class="MsoNormal">English language transcripts in .ctm format
were produced using an automatic speech recognition (ASR) system.</p>
<br>
<p class="MsoNormal" align="center">*<br>
<br>
</p>
<p class="MsoNormal"><a name="vace"></a>(2) <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011V06">2006
NIST/USF Evaluation Resources for the VACE Program - Meeting
Data Test Set Part 2</a> was developed by researchers at the <a
href="http://www.cse.usf.edu/">Department of Computer Science
and Engineering</a>, University of South Florida (USF), Tampa,
Florida and the <a href="http://nist.gov/itl/iad/mig/">Multimodal
Information Group</a> at the National Institute of Standards and
Technology (NIST). It contains approximately twenty hours of
meeting room video data collected in 2005 and 2006 and annotated
for the VACE (Video Analysis and Content Extraction) 2006 face and
person tracking tasks.</p>
<p class="MsoNormal">The VACE program was established to develop
novel algorithms for automatic video content extraction,
multi-modal fusion, and event understanding. During VACE Phases I
and II, the program made significant progress in the automated
detection and tracking of moving objects including faces, hands,
people, vehicles and text in four primary video domains: broadcast
news, meetings, street surveillance, and unmanned aerial vehicle
motion imagery. Initial results were also obtained on automatic
analysis of human activities and understanding of video sequences.
</p>
<p class="MsoNormal">Three performance evaluations were conducted
under the auspices of the VACE program between 2004 and 2007. In
2006, the VACE program and the European Union's <a
href="http://gps-tsc.upc.es/imatge/_JosepRamon/CHIL/CHIL.html">Computers
in the Human Interaction Loop (CHIL)</a> collaborated to hold
the <a href="http://clear-evaluation.org/clear06/">Classification
of Events, Activities and Relationships (CLEAR) Evaluation</a>.
This was an international effort to evaluate systems designed to
analyze people, their identities, activities, interactions and
relationships in human-human interaction scenarios, as well as
related scenarios. The VACE program contributed the evaluation
infrastructure (e.g., data, scoring, tools) for a specific set of
tasks, and the CHIL consortium, coordinated by the <a
href="http://www.kit.edu/english/index.php">Karlsruhe Institute
of Technology</a>, contributed a separate set of evaluation
infrastructure. To the extent possible, the VACE and CHIL programs
harmonized their evaluation protocols and metrics.</p>
<p class="MsoNormal">The meeting room data used for the 2006 test
set was collected by the following sites in 2005 and 2006:
Carnegie Mellon University (USA), University of Edinburgh
(Scotland), IDIAP Research Institute (Switzerland), NIST (USA),
Netherlands Organization for Applied Scientific Research
(Netherlands) and Virginia Polytechnic Institute and State
University (USA). </p>
<br>
<p class="MsoNormal" align="center">*<br>
</p>
<p class="MsoNormal"><a name="gig"></a>(3) <a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011T13">Chinese
Gigaword Fifth Edition</a> was produced by LDC. It is a
comprehensive archive of newswire text data that has been acquired
from Chinese news sources by LDC at the University of
Pennsylvania. Chinese Gigaword Fifth Edition includes all of the
content of the fourth edition of Chinese Gigaword (<a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2009T27">LDC2009T27</a>)
plus new data covering the period from January 2009 through
December 2010.</p>
<p class="MsoNormal">Eight distinct sources of Chinese newswire are
represented here:</p>
<ul>
<li>Agence France Presse (afp_cmn)</li>
<li>Central News Agency, Taiwan (cna_cmn)</li>
<li>Central News Service (cns_cmn)</li>
<li>Guangming Daily (gmw_cmn)</li>
<li>People's Daily (pda_cmn)</li>
<li>People's Liberation Army Daily (pla_cmn)</li>
<li>Xinhua News Agency (xin_cmn)</li>
<li>Zaobao Newspaper (zbn_cmn)</li>
</ul>
<p class="MsoNormal">The seven-character codes in parentheses above
are used for the directory names and data files for each source.
Articles covering the period from January
2009 through December 2010 have been added to the Agence France
Presse, Central News Agency (CNA), Central News Service, Guangming
Daily, People's Liberation Army Daily and Xinhua News Agency data
sets. The data from People's Daily covers the period from late
June 2009 through December 2010. No new data from Zaobao has been
added. </p>
<br>
<hr width="100%" size="2">
<pre class="moz-signature" cols="72">Ilya Ahtaridis
Membership Coordinator
--------------------------------------------------------------------
Linguistic Data Consortium Phone: 1 (215) 573-1275
University of Pennsylvania Fax: 1 (215) 573-2175
3600 Market St., Suite 810 <a class="moz-txt-link-abbreviated" href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a>
Philadelphia, PA 19104 USA <a class="moz-txt-link-freetext" href="http://www.ldc.upenn.edu">http://www.ldc.upenn.edu</a>
</pre>
<pre class="moz-signature" cols="72">
</pre>
</body>
</html>