Corpora: Announcing a large Portuguese corpus
Diana Maria de Sousa Marques Pinto dos Santos
Diana.Santos at informatics.sintef.no
Tue Sep 5 10:48:27 UTC 2000
Dear members of the corpora list,
We would like to announce the release of CETEMPúblico, a large corpus
(approx. 180 million words) of newspaper text from the Portuguese daily
Público, created by our project as a further initiative to foster R&D in
the processing of the Portuguese language.
Please see the corpus page for further details on distribution and
availability:
http://cgi.portugues.mct.pt/cetempublico/
Diana Santos & Paulo Rocha
Computational processing of Portuguese
http://www.portugues.mct.pt/
SINTEF Telecom and Informatics
Box 124 Blindern, N-0314 Oslo, Norway
projecto at informatics.sintef.no