Corpora: Announcing CQP demo of CETEMPublico
Santos Diana
Diana.Santos at informatics.sintef.no
Fri Mar 16 17:22:48 UTC 2001
Dear corpora list members,
We would like to announce that the CQP demo for use with the CETEMPúblico
corpus is ready.
A CQP demo is basically an instance of a powerful corpus environment system,
the IMS Corpus Workbench, restricted to a single corpus (in this case
version 1.4 of CETEMPúblico).
CETEMPúblico is a large (180 million words) corpus of Portuguese newspaper
language from the daily newspaper PÚBLICO, created by the Computational
Processing of Portuguese project. The corpus is free for all purposes except
resale. See http://cgi.portugues.mct.pt/cetempublico/ for more information
about the corpus and how to get the CQP demo.
The IMS CWB is a corpus environment system (CES) running on Unix/Linux that
combines significant linguistically motivated query features with a
computationally efficient implementation, which handles corpora of fewer
than 300 million words. (Most other CES I know of do not handle corpora as
big as CETEMPúblico at all.) See
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench for more
information.
Even though CQP requires a research license, the CQP demo does not. In
addition, for people who are mainly interested in this corpus and not in
using CQP in general, the demo is the most practical and friendly way to get
good search capabilities without having to write a single line of code.
I should note that this CQP demo is based on the latest beta version of the
IMS CWB, which features significant improvements (and I'm told that a new
Corpus Workbench release will be announced very soon on this list).
Diana Santos
************************************************************************
Diana Santos                    Computational processing of Portuguese
SINTEF Telecom & Informatics    Tel. (direct line) +47 22 06 73 12
Forskningsveien 1               Tel. +47 22 06 73 00
Box 124 Blindern                Fax. +47 22 06 73 50
N-0314 Oslo                     Email: Diana.Santos at informatics.sintef.no
Norway                          http://www.portugues.mct.pt/
************************************************************************