<html>
<head>
<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Our apologies if you have received multiple copies of this
announcement. <br>
<br>
***************************************************************** <br>
ELRA - Language Resources Catalogue - Update <br>
***************************************************************** <br>
<br>
ELRA is happy to announce that 4 new Speech
Resources from the GlobalPhone corpus are now available in its
catalogue.<br>
An updated version of the Venice Italian Treebank (VIT)
has also been released. <br>
<b><br>
1) New Language Resources:<br>
<br>
The GlobalPhone Corpus: </b>The GlobalPhone corpus was designed
to provide read speech data for the development and evaluation of
large continuous speech recognition systems in the most widespread
languages of the world, and to provide a uniform, multilingual
speech and text database for language independent and language
adaptive speech recognition as well as for language identification
tasks. The entire GlobalPhone corpus enables the acquisition of
acoustic-phonetic knowledge of the following 19 spoken languages:
Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Mandarin
(ELRA-S0193), Chinese-Shanghai (ELRA-S0194), Croatian (ELRA-S0195),
Czech (ELRA-S0196), French (ELRA-S0197), German (ELRA-S0198),
Japanese (ELRA-S0199), Korean (ELRA-S0200), Polish (ELRA-S0320),
Portuguese (Brazilian) (ELRA-S0201), Russian (ELRA-S0202), Spanish
(Latin America) (ELRA-S0203), Swedish (ELRA-S0204), Tamil
(ELRA-S0205), Thai (ELRA-S0321), Turkish (ELRA-S0206), Vietnamese
(ELRA-S0322). In each language, about 100 sentences were read by
each of the 100 speakers. The texts were selected from national
newspapers available on the Internet to provide a large vocabulary (up
to 65,000 words). The read articles cover national and international
political news as well as economic news. <br>
<br>
Special prices are offered for a combined purchase of several
GlobalPhone languages (5 languages, 10 languages, 15 languages or 19
languages).<b><br>
<br>
</b>Four new languages are now available from the GlobalPhone corpus<b>:<br>
</b><b>ELRA-S0319 GlobalPhone Bulgarian</b><br>
For
more information, see: <a
href="http://catalog.elra.info/product_info.php?products_id=1141">http://catalog.elra.info/product_info.php?products_id=1141</a><br>
<b>ELRA-S0320 GlobalPhone Polish</b><br>
For
more information, see: <a
href="http://catalog.elra.info/product_info.php?products_id=1142">http://catalog.elra.info/product_info.php?products_id=1142</a><br>
<b>ELRA-S0321 GlobalPhone Thai</b><br>
For
more information, see: <a
href="http://catalog.elra.info/product_info.php?products_id=1143">http://catalog.elra.info/product_info.php?products_id=1143</a><br>
<b>ELRA-S0322 GlobalPhone Vietnamese</b><br>
For
more information, see: <a
href="http://catalog.elra.info/product_info.php?products_id=1144">http://catalog.elra.info/product_info.php?products_id=1144</a><br>
<br>
<br>
<b>2) Update of ELRA-W0040 Venice Italian Treebank (VIT):</b><br>
The new version of VIT has a fully revised constituent-based
representation and a completely new dependency-based representation,
which was produced by semi-automatic procedures.<b><br>
<br>
</b>The Venice Italian Treebank (VIT) contains about 272,000 words
distributed over six different domains: bureaucratic, political,
economic and financial, literary, scientific, and news. In addition,
some 60,000 tokens of spoken dialogues in different Italian
varieties were annotated.<br>
The annotation follows general X-bar criteria with 29 constituency
labels and 102 PoS tags. VIT is also made available in a broad
annotation version with 10 constituency labels and 22 PoS tags for
machine learning purposes. The format is plain text with square
bracketing. A UPenn-style version, readable by the open-source
search tool CorpusSearch, is also provided.<br>
<b><br>
</b>For more information, see: <a
href="http://catalog.elra.info/product_info.php?products_id=831">http://catalog.elra.info/product_info.php?products_id=831</a><br>
<br>
</body>
</html>