[Corpora-List] corpus software
Stefan Evert
stefanML at collocations.de
Fri Apr 23 19:30:31 UTC 2010
Dear corpora subscribers,
I'd like to use this opportunity to promote our public beta testing
programme for the Open Corpus Workbench (CWB).
> In Menota (as in all corpora I have been involved in the development
> of or,) the Corpus Linguist Workbench (CLW/CQP) from Univ. of
> Stuttgart is the standard choice of corpus search system. However,
> CLW/CQP is old and has only been maintained and not developed the
> last 10 years (I know about the open corpus workbench initiative)
That's not quite true, even though progress has admittedly been slow
and sporadic, and the official release of version 3.0 is more than 10
years late by now ... :-}
However, many bug fixes and new features have been added to the CWB
during this time, and since 2008 there have been 64-bit versions for Linux
and Mac OS X that can handle corpora of up to 2 billion tokens.
> For example the unicode support is meager.
We[1] are currently working on two new versions of the CWB, even
though 3.0 has not _quite_ been released yet:
v3.1 -- native Windows port based on work by the Textometrie project
v3.2 -- full Unicode (UTF-8) support
Version 3.1 is ready for public beta testing, so we would like to ask
any CWB users who are interested in the Windows platform (or have some
time to spare and access to a Windows machine) to play around with it
and discover all the bugs we haven't found yet. Version 3.2 will
follow soon (possibly in a less mature alpha release, so that we can
test each new feature as it's added).
If you're interested in becoming a beta tester for the CWB, follow the
instructions on this page:
http://cwb.sourceforge.net/beta.php
Best regards, and thanks in advance for helping us!
Stefan Evert & Andrew Hardie
[1] That is, Andrew Hardie is doing all the hard work, while I'm
playing supervisor and giving instructions. :-)
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora