[Corpora-List] Getting articles from newspapers to compile a corpus

Sérgio Matos aleixomatos at ua.pt
Thu Nov 29 19:39:07 UTC 2012


This may be helpful:
http://sing.ei.uvigo.es/jarvest/index.html  
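
For a rough idea of what a harvester of this kind has to do, here is a minimal sketch in Python using only the standard library. It is not jARVEST and does not use its API. The names `PageParser`, `fetch_url` and `crawl_site`, the `fetch` parameter and the `max_pages` limit are all hypothetical, chosen for illustration. It assumes the site serves plain HTML (no JavaScript-rendered content) and treats all text outside `<script>` and `<style>` as page text, so menus and footers are included.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects href targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.chunks = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not script or style content.
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def fetch_url(url):
    """Default fetcher (hypothetical helper): download and decode leniently."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl_site(start_url, fetch=fetch_url, max_pages=50):
    """Breadth-first crawl restricted to the start URL's host.

    Returns a dict mapping each visited URL to its extracted text.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    texts = {}
    while queue and len(texts) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        parser = PageParser()
        parser.feed(html)
        texts[url] = " ".join(parser.chunks)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return texts
```

Calling `crawl_site("http://www.eltiempo.com")` would then return the text of up to 50 pages on that host. A real crawl should also respect the site's robots.txt and pause between requests, which this sketch leaves out.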


--  
Sérgio Matos
IEETA
Universidade de Aveiro



On Thursday, November 29, 2012 at 6:21 PM, Matías Guzmán wrote:

> Hi all,
>  
> I was wondering if anyone knows how to get every available article from online newspapers and magazines. I was thinking of something like giving a program the URL of a newspaper (e.g. www.eltiempo.com) and getting the text from all the pages on that site. Is that possible?
>  
> Thanks a lot,
>  
> Matías
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>  
>  

