[Corpora-List] Handling a Large Text Archive

Hardie, Andrew a.hardie at lancaster.ac.uk
Wed Jan 4 16:26:49 UTC 2012


Hi Muhammad,

You don’t need tagged data for CQPweb; it’s quite happy with untagged text, as long as the text is in tokenised form (one token or XML tag per line).
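To make the "one token or XML tag per line" layout concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the regex tokeniser is deliberately naive (it will split on combining marks, so a proper tokeniser would be needed for Urdu or Punjabi), and the `<text>` element, its `id` attribute, and the function name `verticalise` are hypothetical choices rather than anything CQPweb or CWB prescribes.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Naive tokeniser (an assumption for illustration): a run of word
# characters is one token, and each punctuation character is its own token.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def verticalise(text, text_id):
    """Return `text` as one token per line, wrapped in a <text> element.

    The <text id="..."> wrapper is a hypothetical structural tag; use
    whatever tags and attributes your corpus setup expects.
    """
    lines = [f"<text id={quoteattr(text_id)}>"]
    # Escaping &, < and > assumes the indexing step reads XML entities;
    # drop escape() if yours does not.
    lines.extend(escape(token) for token in TOKEN_RE.findall(text))
    lines.append("</text>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(verticalise("Hello, world!", "doc1"), end="")
```

Run as a script, this prints the opening tag, then `Hello`, `,`, `world` and `!` on separate lines, then the closing tag. Files in this shape are the kind of untagged but tokenised input described above.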

Also, don’t forget that CQPweb sits on top of CWB (Corpus Workbench), which can be used from the command line without setting up the web interface, if that’s better for your needs. See http://cwb.sourceforge.net/install.php

The place to ask questions if you get stuck is http://devel.sslmit.unibo.it/mailman/listinfo/cwb.

best

Andrew Hardie
(CQPweb developer)

From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of True Friend
Sent: 04 January 2012 15:17
To: Emiliano Guevara; corpora
Subject: Re: [Corpora-List] Handling a Large Text Archive

I use C# to write small script-like programs for processing data. But writing code even to get a concordance of a single word is a bit daunting, so I was looking for something ready-made, of the click-and-run type.
I am familiar with CQPweb; I worked with it while doing research using the BE06 corpus, but I didn't use it to manage a corpus. Perhaps it will be too complex for me (the text archive is not tagged). Well, I'll give it a try.
Regards


--
Muhammad Shakir Aziz محمد شاکر عزیز
Master in Applied Linguistics
Translator, Course Developer, Linguist for Urdu, Punjabi and English
Urdu:- http://awaz-e-dost.blogspot.com/
English:- http://linguisticslearner.blogspot.com/
Facebook:- http://www.facebook.com/truefriend2004
Skype:- true_friend2004
