Info: ELRA - Language Resources Catalogue - Update

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Wed Sep 29 07:54:35 UTC 2010


Date: Mon, 27 Sep 2010 18:12:43 +0200
From: info at elda.org
Message-ID: <4CA0C27B.1090203 at elda.org>
X-url: http://catalog.elra.info/product_info.php?products_id=1124
X-url: http://catalog.elra.info/product_info.php?products_id=1125
X-url: http://catalog.elra.info/product_info.php?products_id=1126

Our apologies if you have received multiple copies of this announcement.

*****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************

ELRA is happy to announce that 1 new Written Corpus and 2 new Monolingual
Lexicons are now available in its catalogue:

ELRA-W0054 Persian 1984 corpus (Multext-East framework)

This corpus contains the Persian (Farsi) translation of part of the
novel "1984" (G. Orwell), annotated in the Multext-East framework
(Multilingual Text Tools and Corpora for Eastern and Central European
Languages). The corpus contains approximately 100,000 words (6,604
sentences, 13,247 lemmas), with extensive headers and markup for
document structure, sentences, and various sub-sentence annotations, in
XML format following the TEI guidelines.
Annotation includes part-of-speech (POS) tags and lemmas.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1124

ELRA-L0086 Persian Multext-East framework lexicon

This is a Persian (Farsi) morphosyntactic lexicon derived from the
Persian 1984 corpus (Multext-East framework) (see ELRA-W0054). It
contains the full inflectional paradigms of a superset of lemmas that
appear in the Persian 1984 corpus. Each entry gives the word-form, its
lemma and morphosyntactic description. The lexicon contains 13,247
entries.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1125

ELRA-L0087 Persian lexicon

This is a Persian (Farsi) lexicon of more than 40,000 entries of
non-inflected word forms. Each word is transliterated according to the
framework proposed by MBROLA (a text-to-speech synthesizer). The
database includes a large variety of descriptors for each entry
(plural, homograph, ...). The lexicon is provided as an MS Access
database.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1126


For more information on the catalogue, please contact Valérie Mapelli
mapelli at elda.org

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates:
http://www.elra.info/LRs-Announcements.html

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription : http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership : http://www.atala.org/
-------------------------------------------------------------------------


