<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal" align="center"><b><span
          style="mso-bidi-font-weight:normal"><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";mso-bidi-font-family:
            Calibri;mso-bidi-theme-font:minor-latin"><span
              style="mso-bidi-font-weight:normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";mso-bidi-font-family:
                Calibri;mso-bidi-theme-font:minor-latin">-  <a
                  href="#scholar">Fall 2013 Data Scholarship Program</a> 
                -<br>
              </span></span></span></span></b></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal" align="center"><i><span
          style="mso-bidi-font-weight:normal"><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";mso-bidi-font-family:
            Calibri;mso-bidi-theme-font:minor-latin"><span
              style="mso-bidi-font-weight:normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";mso-bidi-font-family:
                Calibri;mso-bidi-theme-font:minor-latin">New
                publications:</span></span></span></span></i><b><span
          style="mso-bidi-font-weight:normal"><span
            style="font-size:12.0pt;mso-fareast-font-family:"Times
            New Roman";mso-bidi-font-family:
            Calibri;mso-bidi-theme-font:minor-latin"><span
              style="mso-bidi-font-weight:normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";mso-bidi-font-family:
                Calibri;mso-bidi-theme-font:minor-latin"><br>
              </span></span></span></span></b></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal" align="center"><b><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">- 
          <a href="#prop">Chinese Proposition Bank 3.0</a>  -<br>
        </span></b></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal" align="center"><b><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">- 
          <a href="#gale">GALE Arabic-English Parallel Aligned Treebank
            -- Broadcast News Part 1</a> -</span></b></p>
    <hr size="2" width="100%">
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal" align="center"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal" align="center"><a name="scholar"></a><b
        style="mso-bidi-font-weight:normal"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin">Fall 2013 Data
          Scholarship Program<o:p></o:p></span></b></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Applications


        are now being accepted through September 16, 2013, 11:59PM EST
        for the Fall 2013 LDC Data Scholarship program! The LDC Data
        Scholarship program provides university students with access to
        LDC data at no cost.<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
      normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><br>
        This program is open to students pursuing both undergraduate and
        graduate studies in an accredited college or university. LDC
        Data Scholarships are not restricted to any particular field of
        study; however, students must demonstrate a well-developed
        research agenda and a bona fide inability to pay. The selection
        process is highly competitive. <br>
        <br>
        The application consists of two parts: <br>
        <br>
        (1) <i>Data Use Proposal</i>. Applicants must submit a proposal
        describing their intended use of the data. The proposal should
        state which data the student plans to use, explain how the data
        will benefit their research project and describe the proposed
        methodology or algorithm.<br>
        <br>
        Applicants should consult the </span><a
        href="http://www.ldc.upenn.edu/Catalog/index.jsp"
        target="_blank"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">LDC Corpus
          Catalog</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> for a complete list of
        data distributed by LDC. A handful of LDC corpora are available
        only to members of the Consortium. Applicants are advised to
        select no more than two databases.<br>
        <br>
        (2) <i>Letter of Support</i>. Applicants must submit one letter
        of support from their thesis adviser or department chair. The
        letter must confirm that the department or university lacks the
        funding to pay the full Non-member Fee for the data and verify
        the student's need for data. <br>
        <br>
        For further information on application materials and program
        rules, please visit the </span><a
        href="http://www.ldc.upenn.edu/About/scholarships.html"
        target="_blank"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin;color:blue">LDC
Data


          Scholarship</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family: "Times New
Roman";mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">
        page. <br>
        <br>
        Students can email their applications to the </span><a
        href="mailto:datascholarships@ldc.upenn.edu"><span
          style="font-size:12.0pt; mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:Calibri;
          mso-bidi-theme-font:minor-latin;color:blue">LDC Data
          Scholarship program</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">. Decisions will be
        sent by email from the same address.<br>
        <br>
        The deadline for the Fall 2013 program<span
          style="mso-spacerun:yes"> </span>is Monday, September 16,
        2013, 11:59PM EST.<br>
        <br>
      </span></p>
    <p class="MsoNormal"
      style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
      normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> <br>
      </span><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> <b
          style="mso-bidi-font-weight:normal">                       
                                                                     
              New publications</b></span><br>
      <span style="font-size:12.0pt;mso-fareast-font-family:"Times
        New Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"></span></p>
    <p class="MsoNormal"
      style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
      normal"><br>
      <span style="font-size:12.0pt;mso-fareast-font-family:"Times
        New Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"><o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><a name="prop"></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">(1)


      </span><a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2013T13"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">Chinese
          Proposition Bank 3.0</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">
        is a continuation of the </span><a
        href="http://www.cs.brandeis.edu/%7Eclp/ctb/cpb/"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">Chinese
          Proposition Bank</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> project which aims to
        create a corpus of text annotated with information about basic
        semantic propositions. Chinese Proposition Bank 3.0 adds
        predicate-argument annotation on 187,731 words from Chinese
        Treebank 7.0 (</span><a
href="http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2010T07"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">LDC2010T07</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">). The data sources are
        newswire, magazine articles, various broadcast news
        and broadcast conversation programming, web newsgroups and
        weblogs. <o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">LDC


        has also released Chinese Proposition Bank 1.0 (</span><a
href="http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2005T23"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">LDC2005T23</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">) and Chinese
        Proposition Bank 2.0 (</span><a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2008T07"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">LDC2008T07</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">).<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">This


        release contains the predicate-argument annotation of 173,206
        verb instances and 14,525 noun instances. The annotation of
        nouns is limited to nominalizations that have a corresponding
        verb. The general annotation guidelines and the lexical
        guidelines (called frame files) for each verbal and nominal
        predicate are also included in this release. Below are some
        statistics about the corpus.<o:p></o:p></span></p>
    <ul type="disc">
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Total



          propositions for verbs - 173,206<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Total



          propositions for nouns - 14,525<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Total


          verbs framed - 24,642<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Total



          framesets - 26,467<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Verbs


          with multiple framesets - 1,337<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Average



          framesets per verb - 1.07<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Total


          nouns framed - 1,421<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Total


          noun framesets - 1,528<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Nouns


          with multiple framesets - 48<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l1 level1 lfo1;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Average



          framesets per noun - 1.08<o:p></o:p></span></li>
    </ul>
    <span style="font-size:12.0pt;mso-fareast-font-family:"Times
      New Roman";
      mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin"><o:p></o:p></span>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:
      auto;text-align:center;line-height:normal" align="center"><span
        style="font-size:12.0pt; mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:Calibri;
        mso-bidi-theme-font:minor-latin">*<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><a name="gale"></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">(2)


      </span><a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2013T14"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">GALE
          Arabic-English Parallel Aligned Treebank -- Broadcast News
          Part 1</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin"> was developed by LDC
        and contains 115,826 tokens of word-aligned Arabic and English
        parallel text with treebank annotations. This material was used
        as training data in the DARPA GALE (Global Autonomous Language
        Exploitation) program.<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Parallel


        aligned treebanks are treebanks annotated with morphological and
        syntactic structures aligned at the sentence level and the
        sub-sentence level. Such data sets are useful for natural
        language processing and related fields, including automatic word
        alignment system training and evaluation, transfer-rule
        extraction, word sense disambiguation, translation lexicon
        extraction and cultural heritage and cross-linguistic studies.
        With respect to machine translation system development, parallel
        aligned treebanks may improve system performance with enhanced
        syntactic parsers, better rules and knowledge about language
        pairs and reduced word error rate.<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">In
        this release, the source Arabic data was translated into
        English. Arabic and English treebank annotations were performed
        independently. The parallel texts were then word aligned. The
        material in this corpus corresponds to a portion of the Arabic
        treebanked data in Arabic Treebank - Broadcast News v1.0 (</span><a
href="http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2012T07"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";mso-bidi-font-family:
          Calibri;mso-bidi-theme-font:minor-latin;color:blue">LDC2012T07</span></a><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";mso-bidi-font-family:
        Calibri;mso-bidi-theme-font:minor-latin">).<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">The


        source data consists of Arabic broadcast news programming
        collected by LDC in 2005 and 2006 from Alhurra, Aljazeera and
        Dubai TV. All data is encoded as UTF-8. A count of files, words,
        tokens and segments is below.<o:p></o:p></span></p>
    <table class="MsoNormalTable" style="mso-cellspacing:1.5pt;
      mso-yfti-tbllook:1184" border="1" cellpadding="0">
      <tbody>
        <tr style="mso-yfti-irow:0;mso-yfti-firstrow:yes">
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Language<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Files<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Words<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Tokens<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Segments<o:p></o:p></span></p>
          </td>
        </tr>
        <tr style="mso-yfti-irow:1;mso-yfti-lastrow:yes">
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Arabic<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">28<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">89,213<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">115,826<o:p></o:p></span></p>
          </td>
          <td style="padding:.75pt .75pt .75pt .75pt">
            <p class="MsoNormal"
              style="margin-bottom:0in;margin-bottom:.0001pt;line-height:
              normal"><span
                style="font-size:12.0pt;mso-fareast-font-family:"Times
                New Roman";
                mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">4,824<o:p></o:p></span></p>
          </td>
        </tr>
      </tbody>
    </table>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Note:


        Word count is based on the untokenized Arabic source. Token
        count is based on the ATB-tokenized Arabic source.<o:p></o:p></span></p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
      line-height:normal"><span
        style="font-size:12.0pt;mso-fareast-font-family:"Times New
        Roman";
        mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">The


        purpose of the GALE word alignment task was to find
        correspondences between words, phrases or groups of words in a
        set of parallel texts. Arabic-English word alignment annotation
        consisted of the following tasks:<o:p></o:p></span></p>
    <ul type="disc">
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Identifying



          different types of links: translated (correct or incorrect)
          and not translated (correct or incorrect)<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Identifying



          sentence segments not suitable for annotation, e.g., blank
          segments, incorrectly-segmented segments, segments with
          foreign languages<o:p></o:p></span></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
        line-height:normal;mso-list:l0 level1 lfo2;tab-stops:list .5in"><span
          style="font-size:12.0pt;mso-fareast-font-family:"Times
          New Roman";
          mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin">Tagging



          unmatched words attached to other words or phrases<o:p></o:p></span></li>
    </ul>
    <br>
    <hr size="2" width="100%"><br>
    <pre class="moz-signature" cols="72">-- 

Ilya Ahtaridis
Membership Coordinator
--------------------------------------------------------------------
Linguistic Data Consortium                  Phone: 1 (215) 573-1275
University of Pennsylvania                    Fax: 1 (215) 573-2175
3600 Market St., Suite 810                        <a class="moz-txt-link-abbreviated" href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a>
Philadelphia, PA 19104 USA                 <a class="moz-txt-link-freetext" href="http://www.ldc.upenn.edu">http://www.ldc.upenn.edu</a></pre>
  </body>
</html>