<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p class="MsoNormal" style="margin-bottom:12.0pt;line-height:normal"
      align="center"><span style="font-size:12.0pt;
        mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:Calibri;
        mso-bidi-theme-font:minor-latin"><a href="#scholar"><b>- Spring
            2013 LDC Data Scholarship Program -</b></a></span><a
        href="#scholar"><br>
      </a> </p>
    <p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;
      text-align:center;line-height:normal" align="center"><a
        href="#pdtb"><b style="mso-bidi-font-weight:normal"><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";mso-bidi-font-family:
            Calibri;mso-bidi-theme-font:minor-latin">-  Penn Discourse
            Treebank Version 2.0 Update  -</span></b></a><b
        style="mso-bidi-font-weight:normal"><br>
      </b><i><span style="font-size:12.0pt;mso-fareast-font-family:
          "Times New
Roman";mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><br>
          New publications:</span></i></p>
    <p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;
      text-align:center;line-height:normal" align="center"><span
        style="font-size:12.0pt;mso-fareast-font-family: "Times New
Roman";mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span><a
        href="#gale"><b><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">- 

            GALE Chinese-English Word Alignment and Tagging Training
            Part 3 -- Web  -<br>
          </span></b></a></p>
    <p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;
      text-align:center;line-height:normal" align="center"><a
        href="#russian"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">- 
          <b>Russian-English Computer Security Parallel Text</b>  -</span></a></p>
    <hr size="2" width="100%">
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal" align="center"><span style="font-size:12.0pt;
        mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:Calibri;
        mso-bidi-theme-font:minor-latin"> </span><a name="scholar"></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><b>Spring
          2013 LDC Data Scholarship Program </b></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">The



        deadline for the Spring 2013 LDC Data Scholarship Program is one
        month away!   Student applications are being accepted now
        through January 15, 2013<span
          style="color:black;mso-bidi-font-weight:bold">, 11:59PM EST</span>. 



        The LDC Data Scholarship program provides university students
        with access to LDC data at no cost.  This program is open to
        students pursuing both undergraduate and graduate studies in an
        accredited college or university. LDC Data Scholarships are not
        restricted to any particular field of study; however, students
        must demonstrate a well-developed research agenda and a bona
        fide inability to pay.  <br>
        <br>
        Students will need to complete an application that consists of
        a data use proposal and a letter of support from their adviser. 
        For further information on application materials and program
        rules, please visit the </span><span
style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><a
          href="http://www.ldc.upenn.edu/About/scholarships.html"><span
            style="mso-fareast-font-family: "Times New Roman"">LDC


            Data Scholarship</span></a></span><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> page.  <o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Students



        can email their applications to the </span><span
        style="font-size:12.0pt;
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><a
          href="mailto:datascholarships@ldc.upenn.edu"><span
            style="mso-fareast-font-family: "Times New Roman"">LDC


            Data Scholarship program</span></a></span><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">. Decisions will be
        sent by email from the same address.</span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><br>
      <span style="font-size:12.0pt;mso-fareast-font-family:"Times
        New Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"><o:p></o:p></span></p>
    <p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;
      text-align:center;line-height:normal" align="center"><a
        name="pdtb"></a><b style="mso-bidi-font-weight:normal"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin">Penn Discourse
          Treebank Version 2.0 Update</span></b></p>
    <p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;
      text-align:center;line-height:normal" align="center"><br>
      <b style="mso-bidi-font-weight:normal"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin"><o:p></o:p></span></b></p>
    <small><span
style="font-family:"Calibri","sans-serif";mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-theme-font:minor-latin">The
developers



        of the Penn Discourse Treebank Version 2.0 <a
href="http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2008T05">LDC2008T05</a>
        (PDTB) have updated this release to add metadata to the Wall
        Street Journal (WSJ) news stories in the corpus. The metadata is
        intended to help users read PDTB files as texts and to
        distinguish texts of different genres within the WSJ. <o:p></o:p></span>
      <br>
      <span
style="font-family:"Calibri","sans-serif";mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-theme-font:minor-latin">The



        metadata includes the following fields: <o:p></o:p></span></small>
    <ul type="disc">
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">DD: the date the article appeared in the WSJ <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">AN: unique identifier for the article <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">HL: the column name (for regular features such
            as Who's News, Marketing &amp; Media, Technology), its
            headline and by-line <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">SO: the source of the article <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">IN: manually assigned codes or keywords for the
            article <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">CO: manually assigned codes for companies or
            other organizations <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">DATELINE: normally the location where the
            article was filed, though the field sometimes holds
            unexpected content <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">GV: branch of government or government agency
            mentioned in the article <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">SBREAKS: the byte position of section breaks
            present in the file <o:p></o:p></span></small></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><small><span
            style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:




            minor-latin">ARTICLEBREAK: separates files that contain more
            than one article <o:p></o:p></span></small></li>
    </ul>
    <small><span
style="font-family:"Calibri","sans-serif";mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-theme-font:minor-latin">Contact
        LDC to obtain the update.</span></small><br>
    <p class="MsoNormal"
      style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
      normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><o:p> </o:p></span></p>
    <br>
    <p class="MsoNormal" style="margin-bottom:12.0pt;text-align:center;
      line-height:normal" align="center"><big><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><br>
          <b style="mso-bidi-font-weight: normal">New publications</b><o:p></o:p></span></big></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><a name="gale"></a><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">(1)



        </span><span
style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2012T24"><span
              style="mso-fareast-font-family:"Times New
              Roman"">GALE Chinese-English Word Alignment and
              Tagging Training Part 3 -- Web</span></a></span><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin"> was developed by LDC
          and contains 154,541 tokens of word-aligned Chinese and
          English parallel text enriched with linguistic tags. This
          material was used as training data in the </span><span
style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><a
            href="http://projects.ldc.upenn.edu/gale/index.html"><span
              style="mso-fareast-font-family: "Times New
              Roman"">DARPA GALE</span></a></span><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin"> (Global Autonomous Language
          Exploitation) program.<o:p></o:p></span></big></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Some



          approaches to statistical machine translation incorporate
          linguistic knowledge into word-aligned text as a means to
          improve automatic word alignment and machine translation
          quality. This is accomplished with two annotation schemes:
          alignment and tagging. Alignment identifies minimum
          translation units and translation relations by using
          minimum-match and attachment annotation approaches. The
          tagging scheme defines a set of word tags and alignment link
          tags that describe these translation units and relations.
          Tagging adds contextual, syntactic and language-specific
          features to the alignment annotation. <br>
        </span></big></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">GALE

          Chinese-English Word Alignment and Tagging Training Part 1 --
          Newswire and Web (LDC2012T16) and GALE Chinese-English Word
          Alignment and Tagging Training Part 2 -- Newswire (LDC2012T20) are
          also available through LDC.<br>
          <br>
          This release consists of Chinese source web data (newsgroup,
          weblog) collected by LDC in 2008 and 2009. The distribution by
          words, character tokens and segments appears below: <o:p></o:p></span></big></p>
    <table class="MsoNormalTable" style="mso-cellspacing:1.5pt;
      mso-yfti-tbllook:1184" border="1" cellpadding="0">
      <tbody>
        <tr style="mso-yfti-irow:0;mso-yfti-firstrow:yes">
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Language<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Files<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Words<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">CharTokens<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Segments<o:p></o:p></span></big></p>
            <big> </big></td>
        </tr>
        <tr style="mso-yfti-irow:1;mso-yfti-lastrow:yes">
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Chinese<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">1249<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">103027<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">154541<o:p></o:p></span></big></p>
            <big> </big></td>
          <td style="padding:.75pt .75pt .75pt .75pt"><big> </big>
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><big><span
                  style="font-size:12.0pt;mso-fareast-font-family:"Times
                  New Roman";
                  mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">4842<o:p></o:p></span></big></p>
            <big> </big></td>
        </tr>
      </tbody>
    </table>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><br>
          Note that all token counts are based on the Chinese data only.
          One token is equivalent to one character and one word is
          equivalent to 1.5 characters.<o:p></o:p></span></big></p>
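    <p class="MsoNormal">The word count in the table can be reproduced
      from the character-token count. The following is a minimal Python
      sketch, assuming the stated ratio of 1.5 characters per word and
      assuming the result is truncated to a whole number (the rounding
      rule is not stated in this announcement). The function name is
      hypothetical.</p>
    <pre>
```python
# Character-token count reported in the table for the Chinese data.
CHAR_TOKENS = 154541
# Ratio stated in the note: one word is equivalent to 1.5 characters.
CHARS_PER_WORD = 1.5


def estimate_words(char_tokens, chars_per_word=CHARS_PER_WORD):
    """Estimate a word count from a character-token count.

    Truncation to an integer is an assumption made for this sketch.
    """
    return int(char_tokens / chars_per_word)


print(estimate_words(CHAR_TOKENS))
```
</pre>
    <p class="MsoNormal">Under these assumptions the estimate agrees
      with the Words column of the table.</p>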
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">The



          Chinese word alignment tasks consisted of the following
          components: <o:p></o:p></span></big></p>
    <ul type="disc">
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><big><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Identifying,




            aligning, and tagging 8 different types of links<o:p></o:p></span></big></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><big><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Identifying,




            attaching, and tagging local-level unmatched words<o:p></o:p></span></big></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><big><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Identifying




            and tagging sentence/discourse-level unmatched words<o:p></o:p></span></big></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><big><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Identifying




            and tagging all instances of Chinese </span><span
            style="font-size:12.0pt; font-family:"MS
            Mincho";mso-ascii-font-family:Calibri;mso-ascii-theme-font:
minor-latin;mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">的</span><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";
            mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">(DE)



            except when they were a part of a semantic link.<o:p></o:p></span></big></li>
    </ul>
    <br style="mso-special-character:line-break">
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">
          <br style="mso-special-character:line-break">
          <o:p></o:p></span></big></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:
      auto;text-align:center;line-height:normal" align="center"><big><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin">*<br
            style="mso-special-character:line-break">
          <br style="mso-special-character:line-break">
          <o:p></o:p></span></big></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><a name="russian"></a><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">(2)



        </span><span
style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2012T23"><span
              style="mso-fareast-font-family:"Times New
              Roman"">Russian-English Computer Security Parallel
              Text</span></a></span><span style="font-size:12.0pt;
          mso-fareast-font-family:"Times New
          Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin"> was developed by </span><span
style="font-size:12.0pt;mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><a
            href="http://www.mitre.org/"><span
              style="mso-fareast-font-family:"Times New
              Roman"">The MITRE Corporation</span></a></span><span
          style="font-size:12.0pt;mso-fareast-font-family: "Times
          New
Roman";mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">.
          It consists of parallel sentences from a set of computer
          security reports published in Russian and translated into
          English by translators with particular expertise in the
          technical area. Translators were instructed to err on the side
          of literal translation if required, but to maintain the
          technical writing style of the source and to make the
          resulting English as natural as possible. The translators
          followed specific guidelines for translation, and those are
          included in this distribution.<o:p></o:p></span></big></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><big><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">There



          are 6,276 lines of parallel Russian and English, with a total
          of 60,059 words of Russian and 76,437 words of English,
          presented in a separate UTF-8 plain text file for each
          language. The sentences were translated in sequential order
          but are presented in scrambled order. Both files use the same
          scrambled order, so sentences at identical line numbers are
          translations of each other. For example, the 31st line of the
          English file is a translation of the 31st line of the Russian
          file. The original line sequence is not provided. A separate
          file contains 1,694 untranslated lines (such as code
          snippets).<o:p></o:p></span></big></p>
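    <p class="MsoNormal">Because the two files are aligned by line
      number, sentence pairs can be recovered by reading both files in
      step. The following is a minimal Python sketch of that idea; the
      function names are hypothetical, and the actual file names in the
      distribution are not given in this announcement.</p>
    <pre>
```python
def pair_lines(russian_lines, english_lines):
    """Pair sentences that share a line number in the two files."""
    russian = [line.rstrip("\n") for line in russian_lines]
    english = [line.rstrip("\n") for line in english_lines]
    # Line-number alignment only makes sense if the counts match.
    if len(russian) != len(english):
        raise ValueError("files must have the same number of lines")
    return list(zip(russian, english))


def load_parallel(russian_path, english_path):
    """Read two UTF-8 plain text files and pair them line by line.

    Both paths are supplied by the caller; no file names are assumed.
    """
    with open(russian_path, encoding="utf-8") as ru_file:
        with open(english_path, encoding="utf-8") as en_file:
            return pair_lines(ru_file, en_file)
```
</pre>
    <p class="MsoNormal">With zero-based indexing, the element at index
      30 of the returned list holds the 31st line of each file.</p>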
    <br>
    <hr size="2" width="100%">
    <div class="moz-text-html" lang="x-western">
      <pre class="moz-signature" cols="72">-- 
--

Ilya Ahtaridis
Membership Coordinator
--------------------------------------------------------------------
Linguistic Data Consortium                  Phone: 1 (215) 573-1275
University of Pennsylvania                    Fax: 1 (215) 573-2175
3600 Market St., Suite 810                        <a class="moz-txt-link-abbreviated" href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a>
Philadelphia, PA 19104 USA                 <a class="moz-txt-link-freetext" href="http://www.ldc.upenn.edu">http://www.ldc.upenn.edu</a>




</pre>
    </div>
  </body>
</html>