Info: ELRA - Language Resources Catalogue - Update
Thierry Hamon
thierry.hamon at UNIV-PARIS13.FR
Wed Sep 29 07:54:35 UTC 2010
Date: Mon, 27 Sep 2010 18:12:43 +0200
From: info at elda.org
Message-ID: <4CA0C27B.1090203 at elda.org>
Our apologies if you have received multiple copies of this announcement.
*****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************
ELRA is happy to announce that 1 new Written Corpus and 2 new Monolingual
Lexicons are now available in its catalogue:
ELRA-W0054 Persian 1984 corpus (Multext-East framework)
This corpus contains the Persian (Farsi) translation of part of the
novel "1984" (G. Orwell), annotated in the Multext-East framework
(Multilingual Text Tools and Corpora for Eastern and Central European
Languages). The corpus contains approximately 100,000 words (6,604
sentences, 13,247 lemmas), with extensive headers and markup for
document structure, sentences, and various sub-sentence annotations, in
XML format following the TEI guidelines.
Annotation includes POS (part-of-speech) tags and lemmas.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1124
ELRA-L0086 Persian Multext-East framework lexicon
This is a Persian (Farsi) morphosyntactic lexicon derived from the
Persian 1984 corpus (Multext-East framework) (see ELRA-W0054). It
contains the full inflectional paradigms of a superset of the lemmas
that appear in the Persian 1984 corpus. Each entry gives the word-form,
its lemma, and its morphosyntactic description. The lexicon contains
13,247 entries.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1125
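The entry layout described for ELRA-L0086 (word-form, lemma,
morphosyntactic description) could be read along the following lines.
This is only a minimal sketch: the announcement does not specify the
file format, so the tab-separated, one-entry-per-line layout is an
assumption, and the names LexiconEntry, parse_entry and group_by_lemma
as well as the sample rows are hypothetical placeholders, not data from
the lexicon.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class LexiconEntry(NamedTuple):
    """One lexicon entry: word-form, lemma, morphosyntactic description."""
    word_form: str
    lemma: str
    msd: str


def parse_entry(line: str) -> LexiconEntry:
    # ASSUMPTION: one entry per line, three tab-separated fields.
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}")
    return LexiconEntry(*fields)


def group_by_lemma(lines: Iterable[str]) -> Dict[str, List[LexiconEntry]]:
    """Collect the entries of each lemma, i.e. its inflectional paradigm."""
    paradigms: Dict[str, List[LexiconEntry]] = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse_entry(line)
        paradigms[entry.lemma].append(entry)
    return dict(paradigms)


# Made-up placeholder rows, NOT taken from the actual lexicon.
sample = [
    "form1\tlemma1\tMSD-A",
    "form2\tlemma1\tMSD-B",
    "form3\tlemma2\tMSD-A",
]
paradigms = group_by_lemma(sample)
```

Grouping by lemma mirrors the "full inflectional paradigms" wording
above: each key is a lemma and each value lists its word-forms together
with their morphosyntactic descriptions.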
ELRA-L0087 Persian lexicon
This is a Persian (Farsi) lexicon of more than 40,000 entries, each a
non-inflected word form. Each word is transliterated following the
framework proposed by MBROLA (a text-to-speech synthesizer). The
database includes a large variety of descriptors for each entry
(plural, homograph, ...). The lexicon is provided as an MS Access
database.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1126
For more information on the catalogue, please contact Valérie Mapelli
mapelli at elda.org
Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates:
http://www.elra.info/LRs-Announcements.html
-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version:
Archives: http://listserv.linguistlist.org/archives/ln.html
http://liste.cines.fr/info/ln
The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------