Corpora: New Release from the LDC
LDC Office
ldc at ldc.upenn.edu
Thu Dec 6 18:40:53 UTC 2001
** CETEMPúblico Version 1.7 **
The Linguistic Data Consortium (LDC) is pleased to announce the
availability of CETEMPúblico Version 1.7.
http://www.ldc.upenn.edu/Catalog/LDC2001T62.html
CETEMPúblico (Corpus de Extractos de Textos Electrónicos MCT/Público), a
single CD-ROM publication, contains text from the Portuguese daily
newspaper PÚBLICO. It was created by the Computational
Processing of Portuguese project through an agreement between PÚBLICO
and the Portuguese Ministry of Science and Technology (MCT).
The material includes roughly 2,600 editions of PÚBLICO, dating from
1991 to 1998 and amounting to approximately 180 million words.
CETEMPúblico is intended for research and development in natural
language processing (NLP) and is also suitable for other research on
the Portuguese language.
For more detailed information, please visit:
in Portuguese:
http://cgi.portugues.mct.pt/cetempublico/
in English:
http://cgi.portugues.mct.pt/cetempublico/whatisCETEMP.html
Institutions that have membership in the LDC during the 2001
Membership Year will be able to receive this corpus free of charge.
Nonmembers may purchase this publication for $200.
** Please note that a signed user agreement is required for both member
and nonmember requests. **
If you need additional information before placing your order, or
would like to inquire about membership in the LDC, please send email to
<ldc at ldc.upenn.edu> or call (215) 573-1275.
---------------------------------------------------------------------
Linguistic Data Consortium      Phone: (215) 573-1275
3615 Market Street              Fax:   (215) 573-2175
Suite 200                       email: ldc at unagi.cis.upenn.edu
Philadelphia, PA 19104-2608     www:   http://www.ldc.upenn.edu