[Corpora-List] ELRA - Language Resources Catalogue - Update
info at elda.org
Wed Jan 15 15:17:30 UTC 2014
[Apologies for cross-postings]
We are happy to announce that two new written corpora are now available in our catalogue.
These corpora are part of the Nepali National Corpus, which was produced in 2006 within the framework of the Bhasha Sanchar (“language communication”) project, also known as Nelralec (Nepali Language Resources and Localization for Education and Communication) and funded by the EU Asia IT&C programme, reference number ASIE/2004/091-777.
ELRA-W0076 Nepali Monolingual written corpus
The Nepali Monolingual written corpus comprises the core corpus (core sample) and the general corpus. The core sample (CS) represents the collection of Nepali written texts from 15 different genres, with 2000 words each, published between 1990 and 1992. It is based on the FLOB/FROWN corpora and contains 802,000 words. The general corpus (GC) consists of written texts collected opportunistically from a wide range of sources such as websites, newspapers, books, publishers and authors. It contains 1,400,000 words.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1216
ELRA-W0077 English-Nepali Parallel Corpus
This corpus consists of a collection of national development texts in English and Nepali. A small set of data is aligned at the sentence level (27,060 English words; 21,756 Nepali words), and a larger set of texts at the document level (617,340 English words; 596,571 Nepali words). An additional set of monolingual data in Nepali is also provided (386,879 words in Nepali).
For more information, see: http://catalog.elra.info/product_info.php?products_id=1217
For more information on the catalogue, please contact Valérie Mapelli: mapelli at elda.org
Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/LRs-Announcements.html