[Corpora-List] A tool for corpus management?
Alberto Simões
albie at alfarrabio.di.uminho.pt
Thu Aug 26 14:28:18 UTC 2010
On 26/08/2010 13:25, Sérgio Matos wrote:
> I would suggest NooJ (nooj4nlp.net).
> It's based on Finite-State methods, and implemented in C, so I'd expect
> very good performance.
It is implemented in C#, which makes it hard to use on Unix-based machines.
But a relevant tool, in any case.
Cheers
>
> Regards,
> Sérgio
>
>
>
>
> On 08/26/2010 12:04 PM, Mahdi Mohseni wrote:
>> Thanks to all.
>>
>> Do these tools support Unicode text?
>> And another problem: the corpus has up to 100 million words. So, can
>> these tools handle this volume of text easily (especially in search
>> and retrieval)?
>>
>> I appreciate your response.
>> Mahdi
>>
>> On Wed, Aug 25, 2010 at 3:36 PM, Mahdi Mohseni <mohseni48 at gmail.com> wrote:
>>
>> Dear Colleagues,
>>
>> I need a tool for managing a corpus with the following capabilities:
>>
>> * Adding text files to the corpus
>> * Editing files
>> * Annotating words
>> * Searching
>> * Reporting statistics of words and tags
>>
>> Could you please recommend a suitable tool?
>>
>> Best,
>> Mahdi Mohseni
>>
>>
>>
>>
>> _______________________________________________
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>
>
>
>
--
Alberto Simões