[Corpora-List] Corpus Development

Serge HEIDEN Slh at ens-lsh.fr
Mon Apr 28 13:28:47 UTC 2008


Mark,

On Sunday, April 27, 2008 6:44 PM [GMT+1=CET],
Mark Davies <Mark_Davies at byu.edu> wrote:
> Most really large corpora that I'm aware of do use a relational
> database architecture, including systems like IMS Corpus Workbench.

The architecture of the IMS Corpus Workbench software is based on
specific indexing techniques for processing and querying textual data.
Those techniques were described in the book:
"Managing Gigabytes: Compressing and Indexing Documents and Images"
by Ian H. Witten, Alistair Moffat, Timothy C. Bell, 1999, Morgan Kaufmann.
No RDBMS or similar architecture was used, and this can
be seen from the source: http://cwb.sourceforge.net/
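
To make the distinction concrete, here is a minimal sketch of the general
kind of technique I mean: an inverted index whose position lists are stored
as gaps and compressed with a variable-byte code. This is only an
illustration, not the actual CWB code or file format; the function names
(vbyte_encode, vbyte_decode, build_index, lookup) are hypothetical and the
choice of variable-byte coding is my assumption for the example.

```python
from collections import defaultdict


def vbyte_encode(numbers):
    """Encode non-negative integers with a variable-byte code (7 data bits per byte)."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)  # low 7 bits; high bit clear means "more bytes follow"
            n >>= 7
        out.append(n | 0x80)      # last byte of a number has its high bit set
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode back into a list of integers."""
    numbers, n, shift = [], 0, 0
    for b in data:
        if b & 0x80:              # terminating byte: finish the current number
            numbers.append(n | ((b & 0x7F) << shift))
            n, shift = 0, 0
        else:                     # continuation byte: accumulate 7 more bits
            n |= b << shift
            shift += 7
    return numbers


def build_index(tokens):
    """Map each word form to its corpus positions, stored as compressed gaps."""
    positions = defaultdict(list)
    for pos, tok in enumerate(tokens):
        positions[tok].append(pos)
    index = {}
    for tok, plist in positions.items():
        # Store the first position, then the differences between neighbours.
        gaps = [plist[0]] + [b - a for a, b in zip(plist, plist[1:])]
        index[tok] = vbyte_encode(gaps)
    return index


def lookup(index, token):
    """Return all corpus positions of a word form by summing the decoded gaps."""
    positions, total = [], 0
    for gap in vbyte_decode(index.get(token, b"")):
        total += gap
        positions.append(total)
    return positions


corpus = "the cat sat on the mat the end".split()
idx = build_index(corpus)
the_positions = lookup(idx, "the")
```

A query for a word form is then a direct lookup followed by decoding a
compact byte string, with no tables, joins or SQL involved.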

Best,
Serge

_____________________________________________________________
Serge Heiden, slh at ens-lsh.fr, https://weblex.ens-lsh.fr
ENS-LSH/CNRS - ICAR UMR5191, Institut de Linguistique Française
15, parvis René Descartes 69342 Lyon BP7000 Cedex, tél. +33(0)622003883


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


