[Corpora-List] A tool for corpus management?

Hardie, Andrew a.hardie at lancaster.ac.uk
Thu Aug 26 12:44:01 UTC 2010


Hi Mahdi,

 

Noam Orden recommended Corpus Workbench, but it's worth noting that CWB
won't help you with actually creating and editing your files (the first
four points on your list); it's an indexing and search tool. But it does
support UTF-8, as of the most recent version, and it handles corpora on
the order of a hundred million words very readily. See
http://cwb.sourceforge.net/ 

 

best

 

Andrew Hardie
Linguistics & English Language
County South
Lancaster University
Lancaster LA1 4YL
United Kingdom

 

http://www.ling.lancs.ac.uk/staff/hardie

From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf
Of Mahdi Mohseni
Sent: 26 August 2010 12:05
To: corpora at uib.no
Subject: Re: [Corpora-List] A tool for corpus management?

 

Thanks to all. 

Do these tools support Unicode text?
And another problem: the corpus has up to 100 million words. So, can
these tools manage this volume of text easily (especially in search and
retrieval)?

I appreciate your response.
Mahdi

On Wed, Aug 25, 2010 at 3:36 PM, Mahdi Mohseni <mohseni48 at gmail.com>
wrote:

Dear Colleagues,

I need a tool for managing a corpus with the following capabilities:

*	Adding text files to the corpus
*	Editing files
*	Annotating words
*	Searching
*	Reporting statistics of words and tags

Would you please recommend a suitable tool?

Best,
Mahdi Mohseni
  

 

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

