<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    Our apologies if you have received multiple copies of this

    announcement. <br>

    <br>

    ***************************************************************** <br>

    ELRA - Language Resources Catalogue - Update <br>

    ***************************************************************** <br>

    <br>

    ELRA


    is happy to announce that 12 new Written Corpora from the PANACEA

    Project are now available for free in its catalogue.<br>

    <br>

    PANACEA, a STREP Project under EU-FP7, has developed a factory of

    Language Resources (LRs) in the form of a production line that

    automates all steps involved in the acquisition, production,

    maintenance and updating of the LRs required by Machine Translation

    and other Language Technologies.<br>

    The factory is a Web Service-based platform that integrates advanced

    technological components for:<br>

    - Monolingual and Parallel Text Acquisition and Pre-Processing<br>

    - Parallel corpora Alignment<br>

    - Bilingual Dictionary Production<br>

    - Monolingual Rich Information Lexica Production<br>

    <br>

    The project produced a set of resources:<br>

    - Monolingual Corpora (raw text, n-grams, parsed, for Greek,

    English, Spanish, French and Italian)<br>

    - Parallel Corpora (sentence aligned, English-Greek and

    English-French)<br>

    - Monolingual Lexica (Verbal Subcategorization for English, Spanish

    and Italian; Noun Lexical-Semantic Classes for English and Spanish;

    MultiWords for Italian)<br>

    - Bilingual Glossaries (Greek-English, French-English,

    German-English)<br>

    <br>

    For more information on the available corpora, please refer to:<br>

    <br>

    <b>ELRA-W0057 PANACEA English-French and English-Greek parallel

      corpus acquired for Environment domain</b><br>

    This package consists of an English-French and English-Greek

    sentence-aligned parallel corpus from the Environment domain

    automatically acquired from the web during 2010 and 2011. It was

    acquired in the framework of the PANACEA project. Data and language

    pairs are split into training, test and development test sets. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1182&language=en">http://catalog.elra.info/product_info.php?products_id=1182&language=en</a><br>

    <br>

    <b>ELRA-W0058 PANACEA English-French and English-Greek parallel

      corpus acquired for Labour Legislation domain</b><br>

    This package consists of an English-French and English-Greek

    sentence-aligned parallel corpus from the Labour Legislation domain

    automatically acquired from the web during 2010 and 2011. It was

    acquired in the framework of the PANACEA project. Data and language

    pairs are split into training, test and development test sets. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1183&language=en">http://catalog.elra.info/product_info.php?products_id=1183&language=en</a><br>

    <br>

    <b>ELRA-W0063 PANACEA Environment English monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the English language and were

    automatically classified as relevant to the "Environment" domain. It

    was constructed in the summer of 2011. It contains 50,541,538

    tokens, divided into a total of 28,071 documents that were crawled

    from 3,121 web sites. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1184&language=en">http://catalog.elra.info/product_info.php?products_id=1184&language=en</a><br>

    <br>

    <b>ELRA-W0064 PANACEA Labour English monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the English language and were

    automatically classified as relevant to the "Labour Legislation"

    domain. It was constructed in the summer of 2011. It contains

    46,431,351 tokens, divided into a total of 15,197 documents that

    were crawled from 1,558 web sites.<br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1185&language=en">http://catalog.elra.info/product_info.php?products_id=1185&language=en</a><br>

    <br>

    <b>ELRA-W0065 PANACEA Environment French monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the French language and were

    automatically classified as relevant to the "Environment" domain. It

    was constructed in the summer of 2011. It contains 47,364,125

    tokens, divided into a total of 23,514 documents that were crawled

    from 1,969 web sites.<br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1186&language=en">http://catalog.elra.info/product_info.php?products_id=1186&language=en</a><br>

    <br>

    <b>ELRA-W0066 PANACEA Labour French monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the French language and were

    automatically classified as relevant to the "Labour Legislation"

    domain. It was constructed in the summer of 2011. It contains

    56,440,425 tokens, divided into a total of 26,675 documents that

    were crawled from 1,391 web sites. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1187&language=en">http://catalog.elra.info/product_info.php?products_id=1187&language=en</a><br>

    <br>

    <b>ELRA-W0067 PANACEA Environment Greek monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the Greek language and were

    automatically classified as relevant to the "Environment" domain. It

    was constructed in the summer of 2011. It contains 27,958,530

    tokens, divided into a total of 16,073 documents that were crawled

    from 1,063 web sites. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1188&language=en">http://catalog.elra.info/product_info.php?products_id=1188&language=en</a><br>

    <br>

    <b>ELRA-W0068 PANACEA Labour Greek monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the Greek language and were

    automatically classified as relevant to the "Labour Legislation"

    domain. It was constructed in the summer of 2011. It contains

    21,077,196 tokens, divided into a total of 7,124 documents that were

    crawled from 598 web sites.<br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1189&language=en">http://catalog.elra.info/product_info.php?products_id=1189&language=en</a><br>

    <br>

    <b>ELRA-W0069 PANACEA Environment Italian monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the Italian language and were

    automatically classified as relevant to the "Environment" domain. It

    was constructed in the summer of 2011. It contains 40,044,852

    tokens, divided into a total of 16,159 documents that were crawled

    from 1,211 web sites. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1190&language=en">http://catalog.elra.info/product_info.php?products_id=1190&language=en</a><br>

    <br>

    <b>ELRA-W0070 PANACEA Labour Italian monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the Italian language and were

    automatically classified as relevant to the "Labour Legislation"

    domain. It was constructed in the summer of 2011. It contains

    70,563,320 tokens, divided into a total of 12,706 documents that

    were crawled from 864 web sites.<br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1191&language=en">http://catalog.elra.info/product_info.php?products_id=1191&language=en</a><br>

    <br>

    <b>ELRA-W0071 PANACEA Environment Spanish monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the Spanish language and were

    automatically classified as relevant to the "Environment" domain. It

    was constructed in the summer of 2011. It contains 46,225,624

    tokens, divided into a total of 26,009 documents that were crawled

    from 2,053 web sites. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1192&language=en">http://catalog.elra.info/product_info.php?products_id=1192&language=en</a><br>

    <br>

    <b>ELRA-W0072 PANACEA Labour Spanish monolingual corpus</b><br>

    This corpus consists of documents that were acquired from the web,

    were automatically detected to be in the Spanish language and were

    automatically classified as relevant to the "Labour Legislation"

    domain. It was constructed in the summer of 2011. It contains

    53,922,118 tokens, divided into a total of 13,188 documents that

    were crawled from 1,015 web sites. <br>

    For more information, see: <a

href="http://catalog.elra.info/product_info.php?products_id=1193&language=en">http://catalog.elra.info/product_info.php?products_id=1193&language=en</a><br>

    <br>

    To find out more about PANACEA, please visit the following website:

    <a moz-do-not-send="true" href="http://www.panacea-lr.eu">http://www.panacea-lr.eu</a><br>

    <br>

    For more information on the catalogue, please contact Valérie

    Mapelli: <a moz-do-not-send="true" class="moz-txt-link-freetext"

      href="mailto:mapelli@elda.org">mapelli@elda.org</a> <br>

    <br>

    Visit our On-line Catalogue: <a moz-do-not-send="true"

      class="moz-txt-link-freetext" href="http://catalog.elra.info">http://catalog.elra.info</a><br>

    Visit the Universal Catalogue: <a moz-do-not-send="true"

      href="http://universal.elra.info">http://universal.elra.info</a> <br>

    Archives of ELRA Language Resources Catalogue Updates: <a

      moz-do-not-send="true" class="moz-txt-link-freetext"

      href="http://www.elra.info/LRs-Announcements.html">http://www.elra.info/LRs-Announcements.html</a>

  </body>

</html>