[Corpora-List] Looking for developers and testers: Open source corpus interface (Glossa)
lars nygaard
lars.nygaard at iln.uio.no
Tue Jan 30 17:16:37 UTC 2007
Hi everyone,
At The Text Laboratory (http://www.hf.uio.no/tekstlab/English) we have
developed an advanced, web-based corpus interface called "Glossa". Features:
- supports both monolingual and multilingual corpora
- supports integration of audio and video playback
- advanced postprocessing options (collocations, co-occurrence, sorting,
editing, downloading, etc.)
- GUI-based query builder supporting almost the full flexibility of the
CWB query language, and arbitrarily complex metadata restrictions
- separate interface for compiling lexical statistics
Corpus Workbench (http://cwb.sf.net) and MySQL[1] (http://mysql.com) are
used to query the actual data.
You can try it out on a tiny test corpus here:
http://omilia.uio.no/glossa/html/index_dev.php?corpus=test
and read the user documentation here:
http://omilia.uio.no/glossa/html/GLOSSA_manual.html
Please note that this version of Glossa is not yet complete, and may
contain quite a few bugs.
If you would like to
- use Glossa for corpora that you have available
- contribute to the development of the source code
- test the system and report errors
please write to lars.nygaard at iln.uio.no. I will mail source code to
interested parties. The code should also be available for download on
the web in the not-too-distant future.
best regards,
lars nygaard
PS: We would like to thank all those who have helped in various ways in
the development of the interfaces: Paul Meurer, Eckhard Bick, Hilde
Hasselgård, Stig Johansson, Cathrine Fabricius-Hansen, Elisabeth Lien,
Ruth Vatvedt Fjeld, Trond Trosterud and others.
[1] It should, however, be trivial to patch Glossa for use with other RDBMSs.