Hi Matías,<div><br></div><div>I would suggest NLTK for Python. You can start with the book published by O'Reilly, "Natural Language Processing with Python"; it's easy to follow and effective, and it fits your needs.</div><div><br></div><div>Bye,</div><div>michele.</div>

<div><br><div class="gmail_quote">On Mon, Aug 20, 2012 at 1:04 AM, Matías Guzmán <span dir="ltr">&lt;<a href="mailto:mortem.dei@gmail.com" target="_blank">mortem.dei@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Dear all,<br><br>I'm not a very strong programmer, but I know a bit of Python and a bit of R, and I was wondering which is better for corpus work. I'm not interested in creating any fancy language technology; I just need to extract raw text from offline and online documents, analyze it, and run some basic statistics on it. Which one would you recommend? Should I use both?<br>


<br>Thanks,<br><br>Matías Guzmán<br>

<br>_______________________________________________<br>


Corpora mailing list<br>

<a href="mailto:Corpora@uib.no">Corpora@uib.no</a><br>

<a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>

<br></blockquote></div><br><br clear="all"><div><br></div>-- <br>Michele Filannino<br><br><font color="#666666">CDT PhD student in Computer Science<br>Room IT301 - IT Building<br>The University of Manchester<br><a href="mailto:filannim@cs.manchester.ac.uk" target="_blank">filannim@cs.manchester.ac.uk</a></font><br>


</div>