[Corpora-List] How to download text from the web to build a corpus?

Imene Bensalem bens.imene at gmail.com
Thu Jun 21 09:25:29 UTC 2012


Dear all,
I would like to build a corpus of Arabic text, and I would like to ask which
tools you know of for downloading text (or HTML pages) from source websites.
I tried to use WinHTTrack to download pages from Wikipedia, but
it always shows an error and does not download anything.
Thank you
Best regards

Imene Bensalem
Mentouri University, Constantine, Algeria