Corpora: ELRA news 1/2

Magali Duclaux duclaux at elda.fr
Wed Aug 1 08:50:46 UTC 2001


[ We apologise for the duplicate posting of this announcement ]

*************************************************************
ELRA
European Language Resources Association
ELRA News
*************************************************************

We are happy to announce new resources available via ELRA:

ELRA S0111      Logotypografia database - Eleftherotypia Journal speech database
ELRA S0112      Persian speech database - Farsdat
ELRA W0027      An-Nahar Newspaper text corpus

A description of each database is given below:

ELRA S0111      Logotypografia database - Eleftherotypia Journal speech database

The Eleftherotypia Journal speech database consists of read Greek
material. It includes recordings of 120 speakers, male and female,
totalling about 72 hours of speech material.

ELRA S0112      Persian speech database - Farsdat

The Persian Speech Database comprises the recordings of 300
native speakers, from 10 different dialect regions of Iran. 6000
utterances were segmented and labelled, including 386 phonetically
balanced sentences.

ELRA W0027      An-Nahar Newspaper text corpus

The An-Nahar Newspaper Text Corpus comprises articles in Arabic
(Lebanon) from 1995 to 2000 (6 years), stored as HTML files on
CD-ROM. Each year contains 45 000 articles and 24 million
words.

=====================================
For further information, please contact:

ELRA/ELDA
55-57 rue Brillat-Savarin
F-75013 Paris, France

Tel.: +33 01 43 13 33 33
Fax: +33 01 43 13 33 30

Email: mapelli at elda.fr

or consult our catalogue at the following address:
http://www.icp.grenet.fr/ELRA/home.html
or http://www.elda.fr
=====================================


