<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#ffffff">
I would suggest NooJ (nooj4nlp.net).<br>
It's based on Finite-State methods, and implemented in C, so I'd expect
very good performance.<br>
<br>
Regards,<br>
Sérgio<br>
<br>
<br>
<br>
<br>
On 08/26/2010 12:04 PM, Mahdi Mohseni wrote:
<blockquote
cite="mid:AANLkTikDtef9+zuh4C3FC2nUWZVjxeTC=n0JCubYqd-O@mail.gmail.com"
type="cite">Thanks to all. <br>
<br>
Do these tools support Unicode text?<br>
Another concern: the corpus has up to 100 million words. Can these
tools handle that volume of text easily (especially for search
and retrieval)?<br>
<br>
I appreciate your response.<br>
Mahdi<br>
<br>
<div class="gmail_quote">On Wed, Aug 25, 2010 at 3:36 PM, Mahdi
Mohseni <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:mohseni48@gmail.com">mohseni48@gmail.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Dear
Colleagues,<br>
<br>
I need a tool for managing a corpus with the following capabilities:<br>
<ul>
<li>Adding text files to the corpus</li>
<li>Editing files</li>
<li>Annotating words</li>
<li>Searching</li>
<li>Reporting statistics of words and tags</li>
</ul>
Could you please recommend a suitable tool?<br>
<br>
Best,<br>
<font color="#888888">Mahdi Mohseni<br>
</font></blockquote>
</div>
<br>
<pre wrap="">
_______________________________________________
Corpora mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Corpora@uib.no">Corpora@uib.no</a>
<a class="moz-txt-link-freetext" href="http://mailman.uib.no/listinfo/corpora">http://mailman.uib.no/listinfo/corpora</a>
</pre>
</blockquote>
<br>
</body>
</html>