Arabic-L:LING:Hayat Corpus

Dilworth B. Parkinson Dilworth_Parkinson at byu.edu
Tue Jan 15 19:01:16 UTC 2002


----------------------------------------------------------------------
Arabic-L: Tue 15 Jan 2002
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message to listserv at byu.edu with first line reading:
           unsubscribe arabic-l                                      ]

-------------------------Directory-------------------------------------

1) Subject: Hayat Corpus

-------------------------Messages--------------------------------------
1)
Date:  15 Jan 2002
From: Magali Duclaux <duclaux at elda.fr>
Subject: Hayat Corpus

************************************************************
ELRA - European Language Resources Association
************************************************************

We are pleased to announce two new resources
available in our catalogue of language resources:

ELRA W0030 Arabic Data Set
ELRA W0031 GeFRePaC - German French Reciprocal
Parallel Corpus

A short description of these two new resources is given
below.
Please visit the online catalogue to get further details:
http://www.elda.fr/catalog.html

ELRA W0030 Arabic Data Set:
The corpus contains Al-Hayat newspaper articles with
value added for the development of Language Engineering
and Information Retrieval applications. Data has
been organised in 7 subject specific databases according
to the Al-Hayat subject tags. Mark-up, numbers, special
characters and punctuation have been removed. The size
of the total file is 268 MB. The dataset contains 18,639,264
distinct tokens in 42,591 articles, organised in 7 domains.
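
The clean-up described above (removal of mark-up, numbers, special
characters and punctuation) can be illustrated with a minimal sketch.
This is not ELRA's actual processing, which the announcement does not
describe; the names normalise and TAG_RE are hypothetical, and the
sketch assumes the mark-up is angle-bracket (SGML/HTML-style) tags.

```python
import re
import unicodedata

# Assumption: mark-up consists of angle-bracket tags such as <p> or </title>.
TAG_RE = re.compile(r"<[^>]+>")


def normalise(text: str) -> str:
    """Drop tags, digits, punctuation and symbols; collapse whitespace."""
    text = TAG_RE.sub(" ", text)
    kept = []
    for ch in text:
        # Unicode general category: L* = letters, M* = combining marks
        # (this keeps Arabic letters and diacritics). Everything else,
        # including digits and punctuation, becomes a space.
        major = unicodedata.category(ch)[0]
        kept.append(ch if major in ("L", "M") else " ")
    # Split on any whitespace and re-join with single spaces.
    return " ".join("".join(kept).split())


if __name__ == "__main__":
    print(normalise("<p>القاهرة، 15 يناير 2002!</p>"))
```

Because the filter relies on Unicode categories rather than a fixed
character list, the same function handles Latin and Arabic text and
both Western and Arabic-Indic digits.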

ELRA W0031 GeFRePaC - German French Reciprocal
Parallel Corpus:
GeFRePaC was produced in the framework of the LRsP&P
project. It contains 30 million words: 15 million for the
German language, 15 million for the French language.
It covers natural general language as used in
public socio-political discourse and it has a focus on
multilingual administration and commercial and legal
documentation. It was created for the purpose of
developing, enhancing and improving translation aids.

=====================================
For further information, please contact:

ELRA/ELDA
55-57 rue Brillat-Savarin
F-75013 Paris, France

Tel:	+33 01 43 13 33 33
Fax:	+33 01 43 13 33 30

E-mail:	mapelli at elda.fr

or visit our Web site:
http://www.icp.grenet.fr/ELRA/home.html
or http://www.elda.fr
=====================================

--------------------------------------------------------------------------
End of Arabic-L:  15 Jan 2002