Arabic-L:LING:Hayat Corpus
Dilworth B. Parkinson
Dilworth_Parkinson at byu.edu
Tue Jan 15 19:01:16 UTC 2002
----------------------------------------------------------------------
Arabic-L: Tue 15 Jan 2002
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send a message to listserv at byu.edu with first line reading:
unsubscribe arabic-l ]
-------------------------Directory-------------------------------------
1) Subject: Hayat Corpus
-------------------------Messages--------------------------------------
1)
Date: 15 Jan 2002
From: Magali Duclaux <duclaux at elda.fr>
Subject: Hayat Corpus
************************************************************
ELRA - European Language Resources Association
************************************************************
We are pleased to announce two new resources now
available in our catalogue of language resources:
ELRA W0030 Arabic Data Set
ELRA W0031 GeFRePaC - German French Reciprocal
Parallel Corpus
A short description of these two new resources is given
below.
Please visit the online catalogue to get further details:
http://www.elda.fr/catalog.html
ELRA W0030 Arabic Data Set:
The corpus contains Al-Hayat newspaper articles with
added value for the development of Language Engineering
and Information Retrieval applications. The data has
been organised into 7 subject-specific databases according
to the Al-Hayat subject tags. Mark-up, numbers, special
characters and punctuation have been removed. The total
file size is 268 MB. The dataset contains 18,639,264
distinct tokens in 42,591 articles, organised in 7 domains.
ELRA W0031 GeFRePaC - German French Reciprocal
Parallel Corpus:
GeFRePaC was produced in the framework of the LRsP&P
project. It contains 30 million words: 15 million in
German and 15 million in French.
It covers natural general language as used in
public socio-political discourse, with a focus on
multilingual administration and commercial and legal
documentation. It was created for the purpose of
developing, enhancing and improving translation aids.
=====================================
For further information, please contact:
ELRA/ELDA
55-57 rue Brillat-Savarin
F-75013 Paris, France
Tel: +33 01 43 13 33 33
Fax: +33 01 43 13 33 30
E-mail: mapelli at elda.fr
or visit our Web site:
http://www.icp.grenet.fr/ELRA/home.html
or http://www.elda.fr
=====================================
--------------------------------------------------------------------------
End of Arabic-L: 15 Jan 2002