[Corpora-List] Announcement: JBootCat v0.2 released

Andy Roberts andyr at comp.leeds.ac.uk
Tue Aug 15 03:10:18 UTC 2006


Dear Corpora readers,

I'm pleased to announce that JBootCat v0.2 has been released - the first
public release from the project.

JBootCat is a Java implementation of the BootCat tool-chain written by
Marco Baroni et al. for generating corpora from the Internet. The main
goal is to encapsulate the BootCat functionality within a user-friendly
desktop application. The advantage of using the Java platform is that
JBootCat can be run easily on most major operating systems.

JBootCat is free and open source. It is released under the LGPL.

As you may guess from the version number, there's still a lot of work to
do and the interface is a little rough around the edges. However, there
is sufficient functionality to acquire a corpus via Google and to
download, clean and tokenise it.

All the information, including screenshots, can be found on the project
home page:

http://www.andy-roberts.net/software/jbootcat/

Feedback is gratefully accepted :)

Regards,
Andy Roberts


