[Corpora-List] A tool for corpus management?

Sérgio Matos aleixomatos at ua.pt
Thu Aug 26 12:25:38 UTC 2010


I would suggest NooJ (nooj4nlp.net).
It is based on finite-state methods and implemented in C#, so I'd expect
very good performance.

Regards,
Sérgio




On 08/26/2010 12:04 PM, Mahdi Mohseni wrote:
> Thanks to all.
>
> Do these tools support Unicode text?
> And another problem: the corpus has up to 100 million words. Can 
> these tools handle that volume of text easily (especially for search 
> and retrieval)?
>
> I appreciate your response.
> Mahdi
>
> On Wed, Aug 25, 2010 at 3:36 PM, Mahdi Mohseni <mohseni48 at gmail.com> wrote:
>
>     Dear Colleagues,
>
>     I need a tool for managing a corpus with the following capabilities:
>
>         * Adding text files to the corpus
>         * Editing files
>         * Annotating words
>         * Searching
>         * Reporting statistics of words and tags
>
>     Could you please recommend a suitable tool?
>
>     Best,
>     Mahdi Mohseni
>
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>    
