[Corpora-List] A tool for corpus management?
Hardie, Andrew
a.hardie at lancaster.ac.uk
Thu Aug 26 12:44:01 UTC 2010
Hi Mahdi,
Noam Orden recommended Corpus Workbench, but it's worth noting that CWB
won't help you with actually creating and editing your files (the first
two points on your list); it's an indexing and search tool. But it does
support UTF-8, as of the most recent version, and it handles corpora on
the order of a hundred million words very readily. See
http://cwb.sourceforge.net/
best
Andrew Hardie
Linguistics & English Language
County South
Lancaster University
Lancaster LA1 4YL
United Kingdom
http://www.ling.lancs.ac.uk/staff/hardie
From: corpora-bounces at uib.no On Behalf Of Mahdi Mohseni
Sent: 26 August 2010 12:05
To: corpora at uib.no
Subject: Re: [Corpora-List] A tool for corpus management?
Thanks to all.
Do these tools support Unicode texts?
And another problem: the corpus has up to 100 million words. So, can
these tools handle this volume of text easily (especially in search and
retrieval)?
I appreciate your response.
Mahdi
On Wed, Aug 25, 2010 at 3:36 PM, Mahdi Mohseni <mohseni48 at gmail.com>
wrote:
Dear Colleagues,
I need a tool for managing a corpus with the following capabilities:
* Adding text files to the corpus
* Editing files
* Annotating words
* Searching
* Reporting statistics of words and tags
Could you please recommend a suitable tool?
Best,
Mahdi Mohseni