[Corpora-List] Getting articles from newspapers to compile a corpus

Matías Guzmán mortem.dei at gmail.com
Thu Nov 29 18:21:11 UTC 2012


Hi all,

I was wondering if anyone knows how to collect every available article
from online newspapers and magazines. I was thinking of something like
giving a program the URL of a newspaper (e.g. www.eltiempo.com) and
getting the text from all the pages on that site. Is that possible?
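
To make the idea concrete, here is a rough sketch of such a program, using only the Python standard library. It is an illustration under stated assumptions, not a tested tool: the names PageParser, fetch_html and crawl are hypothetical, the crawl stays on the start URL's host, and it is capped at a fixed number of pages. It keeps all visible text, so menus and link labels are not separated from the article body.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects href targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def fetch_html(url):
    """Download one page. Assumption: no robots.txt check or rate limiting,
    both of which a real crawler should add."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=50):
    """Breadth-first crawl limited to the start URL's host.

    Returns a dict mapping each visited URL to its visible text.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    texts = {}
    while queue and len(texts) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        parser = PageParser()
        parser.feed(html)
        texts[url] = " ".join(parser.text_parts)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return texts
```

With this sketch, the start URL needs an explicit scheme, e.g. crawl("http://www.eltiempo.com/", max_pages=100). Whether a given site permits this kind of collection is a separate question that the code does not address.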

Thanks a lot,

Matías