<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div align="center">
      <div align="left"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><a
            href="#scholar"><b>-  Spring 2014 LDC Data Scholarship
              Program</b></a></span>  -<br>
        <span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span><br>
        <span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><i>New





            publications:</i></span><b><a
            style="mso-comment-reference:dd_1;mso-comment-date:20131112T1735"><span
              style="font-size:12.0pt;mso-fareast-font-family:"Times

              New Roman";
              mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span></a></b><br>
        <b><a
            style="mso-comment-reference:dd_1;mso-comment-date:20131112T1735"><span
              style="font-size:12.0pt;mso-fareast-font-family:"Times

              New Roman";
              mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><b>
              </b></span></a></b><br>
        <b><a
            style="mso-comment-reference:dd_1;mso-comment-date:20131112T1735"><span
              style="font-size:12.0pt;mso-fareast-font-family:"Times

              New Roman";
              mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><b>
              </b></span></a><a href="#ctb"><b><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">- 


                Chinese Treebank 8.0  - </span></b></a></b><br>
        <b><a href="#ctb"><b><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">
              </span></b></a></b><br>
        <b><a href="#ctb"><b><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">
              </span></b></a><a href="#csc"><b><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">- 


                CSC Deceptive Speech  -</span></b></a></b><a href="#csc"><b><span
              style="font-size:12.0pt;mso-fareast-font-family:"Times

              New Roman";
              mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span></b></a></div>
      <a href="#csc"><b><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span></b></a></div>
    <a href="#csc"><b><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">
        </span></b></a><a
      style="mso-comment-reference:dd_1;mso-comment-date:20131112T1735"><b><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span></b></a><a
      style="mso-comment-reference:dd_1;mso-comment-date:20131112T1735"><b><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">
        </span></b></a>
    <hr size="2" width="100%"><a
      style="mso-comment-reference:dd_1;mso-comment-date:20131112T1735"><b><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span></b></a><span
style="font-size:12.0pt;mso-fareast-font-family:SimSun;mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><o:p></o:p></span>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"> </span><br>
      <a name="scholar"></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><b><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin;color:black">Spring
2014



            LDC Data Scholarship Program</span></b><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin"> <br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">Applications



            are now being accepted through Wednesday, January 15, 2014,
            11:59PM EST for the Spring 2014 LDC Data Scholarship
            program! The LDC Data Scholarship program provides
            university students with access to LDC data at no cost.
            During previous program cycles, LDC has awarded no-cost
            copies of LDC data to over 35 individual students and
            student research groups.</span><br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">This
            program is open to students pursuing both undergraduate and
            graduate studies in an accredited college or university. LDC
            Data Scholarships are not restricted to any particular field
            of study; however, students must demonstrate a
            well-developed research agenda and a bona fide inability to
            pay. The selection process is highly competitive. </span><br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">The
            application consists of two parts: </span><br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">(1) Data
            Use Proposal. Applicants must submit a proposal describing
            their intended use of the data. The proposal should state
            which data the student plans to use and how the data will
            benefit their research project, and should include information on the
            proposed methodology or algorithm.</span><br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">Applicants
            should consult the </span></span><a
          href="http://catalog.ldc.upenn.edu/" target="_blank"><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";mso-bidi-font-family:
Calibri;mso-bidi-theme-font:minor-latin;mso-bidi-font-weight:bold">LDC <span
              style="mso-spacerun:yes"> </span>Catalog</span></a><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin"> for a complete list of data
          distributed by LDC. Due to certain restrictions, a handful of
          LDC corpora are available only to members of the Consortium.
          Applicants are advised to select no more than two
          datasets; students may apply for additional datasets during
          the following cycle once they have finished processing the
          initial datasets and have published or presented work in a juried
          venue.<br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">(2) Letter
            of Support. Applicants must submit one letter of support
            from their thesis adviser or department chair. The letter
            must verify the student's need for data and confirm that the
            department or university lacks the funding to pay the full
            Non-member Fee for the data or to join the Consortium.</span>
          <br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">For
            further information on application materials and program
            rules, please visit the </span></span><a
href="https://www.ldc.upenn.edu/language-resources/data/data-scholarships"
          target="_blank"><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin;color:#0000CC;



            mso-bidi-font-weight:bold">LDC Data Scholarship</span></a><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin"> page. <br>
          <br>
          <span style="mso-bidi-font-weight:bold">Students can email
            their applications to the </span></span><a
          href="mailto:datascholarships@ldc.upenn.edu"><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";mso-bidi-font-family:
Calibri;mso-bidi-theme-font:minor-latin;mso-bidi-font-weight:bold">LDC
            Data Scholarship program</span></a><span
          style="font-size:12.0pt;mso-fareast-font-family: "Times
          New
Roman";mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin;color:black;mso-bidi-font-weight:bold">.
          Decisions will be sent by email from the same address.</span><span
          style="font-size:12.0pt;mso-fareast-font-family: "Times
          New
Roman";mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><br>
          <br>
          <span style="color:black;mso-bidi-font-weight:bold">The
            deadline for the Spring 2014 program cycle is January 15,
            2014, 11:59PM EST.<br>
          </span></span></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"> <b>New
          publications</b><br>
        <br style="mso-special-character:line-break">
      </span> <a name="ctb"></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">(1)



      </span><a href="http://catalog.ldc.upenn.edu/LDC2013T21"><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin">Chinese Treebank 8.0</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> consists of
        approximately 1.5 million words of annotated and parsed text
        from Chinese newswire, government documents, magazine articles,
        various broadcast news and broadcast conversation programs, web
        newsgroups and weblogs.<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">The



        Chinese Treebank project began at the University of Pennsylvania
        in 1998, continued at the University of Colorado and then moved
        to </span><a
        href="http://www.cs.brandeis.edu/%7Ellc/page2/page2.html"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin">Brandeis University</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">. The project’s goal is
        to provide a large, part-of-speech tagged and fully bracketed
        Chinese language corpus. The first delivery, Chinese Treebank
        1.0, contained 100,000 syntactically annotated words from Xinhua
        News Agency newswire. It was later corrected and released in
        2001 as </span><a
        href="http://catalog.ldc.upenn.edu/LDC2001T11"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin">Chinese Treebank 2.0
          (LDC2001T11)</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> and consisted of
        approximately 100,000 words. The LDC released </span><a
        href="http://catalog.ldc.upenn.edu/LDC2004T05"><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin">Chinese Treebank 4.0
          (LDC2004T05)</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">, an updated version
        containing roughly 400,000 words, in 2004. A year later, LDC
        published the 500,000 word </span><a
        href="http://catalog.ldc.upenn.edu/LDC2005T01"><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin">Chinese Treebank 5.0
          (LDC2005T01)</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">. </span><a
        href="http://catalog.ldc.upenn.edu/LDC2007T36"><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin">Chinese Treebank 6.0
          (LDC2007T36)</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">, released in 2007,
        consisted of 780,000 words. </span><a
        href="http://catalog.ldc.upenn.edu/LDC2010T07"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin">Chinese Treebank 7.0
          (LDC2010T07)</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">, released in 2010,
        added new annotated newswire data, broadcast material and web
        text, bringing the total to approximately one million words. Chinese
        Treebank 8.0 adds new annotated data from newswire, magazine
        articles and government documents.<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">There



        are 3,007 text files in this release, containing 71,369
        sentences, 1,620,561 words and 2,589,848 characters (hanzi or
        foreign). The data is provided in UTF-8 encoding, and the
        annotation has Penn Treebank-style labeled brackets. Details of
        the annotation standard can be found in the <span
          style="mso-spacerun:yes"> </span>segmentation, POS-tagging and
        bracketing guidelines included in the release. The data is
        provided in four formats: raw text, word-segmented,
        POS-tagged and syntactically bracketed. All files were
        automatically verified and manually checked.<o:p></o:p></span></p>
    <br>
    <p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;
      text-align:center;line-height:normal" align="center"><span
        style="font-size:12.0pt;mso-fareast-font-family: "Times New
Roman";mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">*<br
          style="mso-special-character:line-break">
        <br style="mso-special-character:line-break">
        <o:p></o:p></span></p>
    <p class="MsoNormal"
      style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
      normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><br>
      </span> <a name="csc"></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">(2)



      </span><a href="http://catalog.ldc.upenn.edu/LDC2013S09"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin">CSC Deceptive Speech</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> was developed by
        Columbia University, SRI International and the University of
        Colorado Boulder. It consists of 32 hours of audio interviews
        from 32 native speakers of Standard American English (16 male,
        16 female) recruited from the Columbia University student
        population and the community. The purpose of the study was to
        distinguish deceptive speech from non-deceptive speech using
        machine learning techniques on extracted features from the
        corpus. <o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">The



        participants were told that they were participating in a
        communication experiment which sought to identify people who fit
        the profile of the top entrepreneurs in America. To this end,
        the participants performed tasks and answered questions in six
        areas. They were later told that they had received low scores
        in some of those areas and did not fit the profile. The subjects
        then participated in an interview where they were told to
        convince the interviewer that they had actually achieved high
        scores in all areas and that they did indeed fit the profile.
        The task of the interviewer was to determine how he thought the
        subjects had actually performed, and he was allowed to ask them
        any questions other than those that were part of the performed
        tasks. For each question from the interviewer, subjects were
        asked to indicate whether the reply was true or contained any
        false information by pressing one of two pedals hidden from the
        interviewer under a table.<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Interviews



        were conducted in a double-walled sound booth and recorded to
        digital audio tape on two channels using Crown CM311A Differoid
        headworn close-talking microphones, then downsampled to 16 kHz
        before processing. <o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">The



        interviews were orthographically transcribed by hand using the
        NIST EARS transcription guidelines. Labels for local lies were
        obtained automatically from the pedal-press data and
        hand-corrected for alignment, and labels for global lies were
        annotated during transcription based on the known scores of the
        subjects versus their reported scores. The orthographic
        transcription was force-aligned using the SRI telephone speech
        recognizer adapted for full-bandwidth recordings. There are
        several segmentations associated with the corpus: the implicit
        segmentation of the pedal presses, semi-automatically derived
        sentence-like units (EARS SLASH-UNITS or SUs), which were
        hand-labeled, intonational phrase units and the units corresponding
        to each topic of the interview.<o:p></o:p></span></p>
    <p class="MsoNormal"><o:p> </o:p><br>
    </p>
    <hr size="2" width="100%">
    <pre class="moz-signature" cols="72">-- 
--

Ilya Ahtaridis
Membership Coordinator
--------------------------------------------------------------------
Linguistic Data Consortium                  Phone: 1 (215) 573-1275
University of Pennsylvania                    Fax: 1 (215) 573-2175
3600 Market St., Suite 810                        <a class="moz-txt-link-abbreviated" href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a>
Philadelphia, PA 19104 USA                 <a class="moz-txt-link-freetext" href="http://www.ldc.upenn.edu">http://www.ldc.upenn.edu</a>
</pre>
  </body>
</html>