[Corpora-List] question about storage of corpora

Tine Lassen tine.lassen at tdcadsl.dk
Fri May 27 13:14:25 UTC 2011


Hi,
I am in the process of compiling a series of domain corpora, and once the
present text-gathering phase is completed, I will of course need to store
the texts somehow. The texts need to be annotated with, e.g., parts of
speech and possibly phrase boundaries for term extraction purposes.
My questions are: Would it be wiser to store the texts as XML or in a
relational database? Does a generally accepted XML schema for corpus
annotation exist? And do tools exist for annotating and searching such
files? How do you store your corpora?
Any thoughts or ideas regarding the questions are very welcome :)
Best,
Tine Lassen
Copenhagen Business School
