[Corpora-List] Looking for developers and testers: Open source corpus interface (Glossa)

lars nygaard lars.nygaard at iln.uio.no
Tue Jan 30 17:16:37 UTC 2007


Hi everyone,

At The Text Laboratory (http://www.hf.uio.no/tekstlab/English) we have 
developed an advanced, web-based corpus interface called "Glossa". Features:

- supports both monolingual and multilingual corpora
- supports integration of audio and video playback
- advanced postprocessing options (collocations, co-occurrence, sorting, 
editing, downloading etc.)
- GUI-based query builder supporting almost the full flexibility of the 
CWB query language, and arbitrarily complex metadata restrictions
- separate interface for compiling lexical statistics

Corpus Workbench (http://cwb.sf.net) and MySQL[1] (http://mysql.com) are 
used to query the actual data.
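
To give a rough idea of what a query builder of this kind has to do, here 
is a minimal Python sketch that turns a few token constraints into a query 
string in the CWB query language (CQP). This is not Glossa's actual code: 
the helper names (token, gap, build_query) are hypothetical, and the 
attribute names "lemma" and "pos" are assumptions that depend on how a 
given corpus is annotated.

```python
def token(**attrs):
    """Render one token constraint, e.g. token(lemma="look")."""
    if not attrs:
        return "[]"  # an empty pair of brackets matches any token
    parts = []
    for name, pattern in attrs.items():
        # Keep the sketch simple: refuse values that would need escaping.
        if '"' in pattern:
            raise ValueError("double quotes are not handled in this sketch")
        parts.append(f'{name}="{pattern}"')
    # Several conditions on the same token are combined with "&".
    return "[" + " & ".join(parts) + "]"


def gap(min_tokens, max_tokens):
    """Render a run of arbitrary tokens, e.g. []{0,2}."""
    return f"[]{{{min_tokens},{max_tokens}}}"


def build_query(*parts):
    """Join token constraints into one query; CQP commands end with ';'."""
    return " ".join(parts) + ";"


if __name__ == "__main__":
    # "look", then up to two arbitrary tokens, then a preposition
    # (the tag "PREP" is an assumed value, not taken from a real tagset).
    print(build_query(token(lemma="look"), gap(0, 2), token(pos="PREP")))
```

The resulting string would then be passed to Corpus Workbench, while 
metadata restrictions would be handled separately through the database.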

You can try it out on a tiny test corpus here:

http://omilia.uio.no/glossa/html/index_dev.php?corpus=test

and read the user documentation here:

http://omilia.uio.no/glossa/html/GLOSSA_manual.html

Please note that this version of Glossa is still not complete, and may 
contain quite a few bugs.

If you would like to
- use Glossa for corpora that you have available
- contribute to the development of the source code
- test the system and report errors
please write to lars.nygaard at iln.uio.no. I will mail source code to 
interested parties. The code should also be available for download on 
the web in the not-too-distant future.

best regards,
lars nygaard

PS: We would like to thank all those who have helped in various ways in 
the development of the interfaces: Paul Meurer, Eckhard Bick, Hilde 
Hasselgård, Stig Johansson, Cathrine Fabricius-Hansen, Elisabeth Lien, 
Ruth Vatvedt Fjeld, Trond Trosterud and others.

[1] It should, however, be trivial to patch Glossa for use with other RDBMSs.


