Corpora: Available for download: Gsearch Corpus Query System

Frank Keller keller at CoLi.Uni-SB.DE
Fri Sep 14 10:15:42 UTC 2001


We are pleased to announce the immediate availability of Gsearch 2.06,
free of charge for research purposes.

The Gsearch corpus query system allows the selection of sentences by
syntactic criteria from text corpora, even when these corpora contain
no prior syntactic markup. This is achieved by means of a fast chart
parser, which takes as input a grammar and a search expression
specified by the user.
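
As a rough illustration of the idea (not Gsearch's actual implementation,
grammar format, or query syntax), the sketch below runs a small CYK-style
chart recognizer over a POS-tagged sentence and returns the word spans that
match a requested category. The rule table, the tag names, and the functions
build_chart and find_spans are all hypothetical and exist only for this
example.

```python
from collections import defaultdict

# Hypothetical toy grammar over POS tags (binary rules only).
# This is NOT Gsearch's grammar format, just an illustration.
BINARY_RULES = {
    ("DET", "N"): "NP",
    ("ADJ", "N"): "N",
    ("P", "NP"): "PP",
    ("NP", "PP"): "NP",
}


def build_chart(tags):
    """Return a chart mapping (i, j) to the categories spanning tags[i:j]."""
    n = len(tags)
    chart = defaultdict(set)
    # Width-1 spans are seeded with the POS tags themselves.
    for i, tag in enumerate(tags):
        chart[(i, i + 1)].add(tag)
    # Wider spans are built bottom-up from pairs of adjacent sub-spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        parent = BINARY_RULES.get((left, right))
                        if parent:
                            chart[(i, j)].add(parent)
    return chart


def find_spans(tagged_sentence, category):
    """Return the word sequences that the toy grammar analyses as `category`."""
    words = [word for word, _ in tagged_sentence]
    tags = [tag for _, tag in tagged_sentence]
    chart = build_chart(tags)
    return [
        " ".join(words[i:j])
        for (i, j) in sorted(chart)
        if category in chart[(i, j)]
    ]


sentence = [("the", "DET"), ("old", "ADJ"), ("dog", "N"), ("slept", "V")]
noun_phrases = find_spans(sentence, "NP")
```

Here the corpus needs only POS tags, not syntactic markup: the user-supplied
grammar does the structural work, and the query is simply the category to
look for in the chart.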

Among the major features of Gsearch are:

* runs under Solaris, Linux, and MacOS X;

* simple to install, based on GNU automake/autoconf;

* supports standard corpora (including BNC, Brown, Susanne, WSJ,
  Frankfurter Rundschau, Negra);

* can be easily extended to new corpora;

* supports standard taggers (LT POS, TnT);

* interfaces with external linguistic resources such as WordNet;

* outputs syntax trees in SGML, but also interfaces with external
  visualization tools (Viewtree, Thistle);

* comes with a tool for random sampling of Gsearch output.

For more information about Gsearch, and to download the latest
version, please visit:

Bug reports and suggestions for enhancements should be sent to:

gsearch-dev at


Gsearch Development Team
Martin Corley, University of Edinburgh
Frank Keller, Saarland University
