[Corpora-List] Open-source corpus query tools

Grzegorz Chrupała pitekus at gmail.com
Wed Dec 29 19:21:48 UTC 2004


On Wed, 29 Dec 2004 13:42:32 -0500, Detmar Meurers <dm at ling.osu.edu> wrote:
> Hi Grzegorz,
> I'd be curious which corpus query tools people suggested - could you
> send me a list, or post a summary to the list?

Summary follows:

Sylvain Loiseau suggested the XQuery language, which is a W3C standard:
"XQuery is less simple than the query language of Corpus Query
Processor, but not so much. It is supported by many open source
software (Saxon by M. Kay for instance). Transforming result into html
for rendition is quite simple."

Lou Burnard mentioned www.xaira.org: "The open source version will be
available early next year, but the current (windows only) version is
available for download and evaluation right now."

David Reitter suggested the NITE XML Toolkit
(http://www.ltg.ed.ac.uk/NITE/): "There is a nice query language, the
representation format is well worked out (supports time-alignment),
there is a useful library with an API for Java (read / run queries
etc.) and they have components that allow you to throw together GUI
based annotation tools"

Klaus Guenther told me about Xlex ( http://xlex.uni-muenster.de/ ), a
web-based interface written in Perl. Apparently this is postcard-ware,
and you can make modifications to the source.

Detmar Meurers recommended TIGERSearch, freely available at
http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/. This
seems to be free for non-commercial use, but not open-source.

I haven't evaluated all of these tools yet. However, I was intrigued
by the suggestion to use the general-purpose XML query language XQuery
for implementing corpus queries. As I am not really familiar with
XQuery, I was wondering whether anyone else has had success using it
for corpus work.
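
To make the idea of querying an XML-encoded corpus a bit more concrete,
here is a minimal sketch. It is not XQuery: it uses Python's standard
xml.etree.ElementTree module, whose limited XPath subset stands in for a
real XQuery processor such as Saxon. The toy corpus, its element and
attribute names (s, w, pos), and the helper functions find_by_pos and
to_html are all hypothetical and only meant for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical toy corpus: the element and attribute names (s, w, pos)
# are illustrative assumptions, not taken from any real corpus format.
CORPUS = """
<corpus>
  <s id="s1"><w pos="DT">the</w><w pos="NN">corpus</w><w pos="VBZ">grows</w></s>
  <s id="s2"><w pos="DT">a</w><w pos="JJ">small</w><w pos="NN">query</w></s>
</corpus>
"""


def find_by_pos(root, pos):
    """Return (sentence id, word) pairs for every token tagged `pos`."""
    hits = []
    for sent in root.iter("s"):
        # ElementTree supports only a small XPath subset; the attribute
        # predicate [@pos='...'] is part of it. The rough XPath
        # equivalent over the whole document would be //w[@pos='NN'].
        for w in sent.findall(f"w[@pos='{pos}']"):
            hits.append((sent.get("id"), w.text))
    return hits


def to_html(hits):
    """Render the hits as a simple HTML list."""
    items = "".join(
        f"<li>{escape(sid)}: {escape(word)}</li>" for sid, word in hits
    )
    return f"<ul>{items}</ul>"


root = ET.fromstring(CORPUS)
nouns = find_by_pos(root, "NN")
print(nouns)
print(to_html(nouns))
```

A real XQuery processor would allow much richer queries (joins,
sequences of tokens, constructed output) than this attribute lookup;
the sketch only shows the general shape of selecting tokens from XML
and turning the result into HTML.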

Thanks again to all who took time to respond. 
Cheers and Happy New Year,
-- 
Grzegorz Chrupała


