Resources: ELRA - Language Resources Catalogue - Update

Thierry Hamon thierry.hamon at LIPN.UNIV-PARIS13.FR
Mon May 5 11:59:52 UTC 2008


Date: Tue, 22 Apr 2008 16:28:01 +0200
From: Valerie Mapelli <mapelli at elda.org>
Message-Id: <6.0.3.0.2.20080422161847.03da6ff0 at pop.easynet.fr>
X-url: http://catalog.elra.info/product_info.php?products_id=1062
X-url: http://catalog.elra.info/product_info.php?products_id=1063
X-url: http://catalog.elra.info/product_info.php?products_id=1064
X-url: http://catalog.elra.info/product_info.php?products_id=1058
X-url: http://catalog.elra.info/product_info.php?products_id=1060
X-url: http://catalog.elra.info/product_info.php?products_id=1059
X-url: http://catalog.elra.info/product_info.php?products_id=1061


Our apologies if you have received multiple copies of this
announcement.  Please note that you are receiving this email because
you are or have been a customer or a provider of ELRA Language Resources.


*******************************************************************
ELRA - Language Resources Catalogue - Update
*******************************************************************


ELRA is happy to announce that 2 new Speech Resources from the LC-STAR
project, 1 new Broadcast News Resource, and 4 new Bilingual Lexicons,
are now available in its catalogue.

ELRA-S0273 LC-STAR Slovenian Phonetic lexicon
The LC-STAR Slovenian Phonetic lexicon comprises 110,900 entries,
including a set of 64,521 common words, a set of 45,012 proper names
(including person names, family names, cities, streets, companies and
brand names) and a list of 5,491 special application words. The
lexicon is provided in XML format and includes phonetic transcriptions
in SAMPA.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1062

ELRA-S0274 LC-STAR English-Slovenian Bilingual Aligned Phrasal lexicon
The LC-STAR English-Slovenian Bilingual Aligned Phrasal lexicon
comprises 12,722 phrases from the tourist domain. It is based on a
list of short sentences obtained by translation from a 10,522-phrase
US-English corpus. The lexicon is provided in XML format.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1063

ELRA-S0275 Slovenian BNSI Broadcast News Speech Corpus
This speech database consists of TV news shows (both the evening news,
"TV Dnevnik", and the late-night news, "Odmevi") from the archive of
the Slovenian national broadcaster RTV Slovenia. The recordings took
place between June 1999 and May 2003. The database comprises a total of
36 hours of recordings, transcribed and manually checked using the
Transcriber tool. A total of 1,565 speakers were recorded (1,069 males,
477 females, 19 unspecified).
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1064

ELRA-M0043 Russian => English MT optimized lexicon in OLIF XML
This lexicon is provided as structured XML in OLIF (Open Lexicon
Interchange Format). It comprises 99,211 entries in its source
language (Russian) and 134,828 entries in its target language
(English). The source entries are distributed as follows: 64,487
nouns, 11,470 adjectives, 19,724 verbs, 1,762 adverbs, and 1,768
closed-class elements (interjections, special prefixes, suffixes,
etc.). Noun entries include gender and number information, and verb
entries include details on aspect and reflexivity. The entries contain
semantic information in terms of domain specification or style
information (e.g., colloquial or regional use). Moreover, definitions
are available for 59,775 entries, as well as collocational information
for 39,148 entries.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1058

ELRA-M0044 English => Swahili Bilingual Lexicon
This lexicon is provided as structured XML in OLIF (Open Lexicon
Interchange Format). It comprises 58,247 entries in English and
58,300 in Swahili. The source entries are distributed as follows:
36,046 nouns, 3,013 adjectives, 18,308 verbs and 880 closed-class
entries. The entries contain semantic information in terms of domain
specification or style information (e.g., colloquial or regional
use). Collocational information is also available for 17,570 entries.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1060

ELRA-M0045 Cebuano => English Bilingual Lexicon
This lexicon is provided as structured XML in OLIF (Open Lexicon
Interchange Format). It comprises 1,988 entries in Cebuano and
1,990 in English. The source entries are distributed as follows: 1,052
nouns, 462 adjectives, 405 verbs and 69 closed-class entries. The
entries contain semantic information in terms of domain specification
or style information (e.g., colloquial or regional use).
Collocational information is also available for 500 entries.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1059

ELRA-M0046 English => Czech Bilingual Lexicon
This lexicon is provided as structured XML in OLIF (Open Lexicon
Interchange Format). It comprises 31,718 entries in English and
32,125 in Czech. The source entries are distributed as follows: 17,797
nouns, 7,748 adjectives, 6,039 verbs and 134 closed-class entries. The
entries contain semantic information in terms of domain specification
or style information (e.g., colloquial or regional use).
Collocational information is also available for 3,065 entries.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1061


For more information on the catalogue, please contact Valérie Mapelli
mailto:mapelli at elda.org

Visit our on-line catalogue: 
http://catalog.elra.info


-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription : http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership : http://www.atala.org/
-------------------------------------------------------------------------


