<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#663300">
<font face="Cambria">Hi </font>Matías,<br>
<br>
Which languages and domains are you looking for, and what sizes? Are
you looking for monolingual data?<br>
ELRA regularly collects such data (after negotiating the rights), so we
may have something to share with you.<br>
Best regards<br>
Khalid<br>
<br>
Matías Guzmán wrote, On 29/11/2012 19:21:
<blockquote
cite="mid:CAKrYe9=JKKT-LUKixVazjSjSgr5LQT=OLMhns9CY+U1Hkn5ycw@mail.gmail.com"
type="cite">
<pre wrap="">Hi all,
I was wondering if anyone knows how to get every possible article from
online newspapers and magazines. I was thinking something like giving a
program the URL of the newspaper (e.g. <a class="moz-txt-link-abbreviated" href="http://www.eltiempo.com">www.eltiempo.com</a>) and getting the
text from all pages therein. Is that possible?
Thanks a lot,
Matías
</pre>
<pre wrap="">_______________________________________________
UNSUBSCRIBE from this page: <a class="moz-txt-link-freetext" href="http://mailman.uib.no/options/corpora">http://mailman.uib.no/options/corpora</a>
Corpora mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Corpora@uib.no">Corpora@uib.no</a>
<a class="moz-txt-link-freetext" href="http://mailman.uib.no/listinfo/corpora">http://mailman.uib.no/listinfo/corpora</a>
</pre>
</blockquote>
<br>
<div class="moz-signature">-- <br>
<b> Khalid Choukri </b>
<br>
ELRA General secretary &amp; ELDA CEO
<br>
email: <a class="moz-txt-link-abbreviated" href="mailto:choukri@elda.org">choukri@elda.org</a>; <br>
Web: <a class="moz-txt-link-abbreviated" href="http://www.elra.info">www.elra.info</a> <a class="moz-txt-link-abbreviated" href="http://www.elda.org">www.elda.org</a>
<br>
Tel. +33 1 43 13 33 33 - Fax. +33 1 43 13 33 30
<br>
<br>
<b> ***************************************************<br>
** Info on LREC 2012 : <a class="moz-txt-link-abbreviated" href="http://www.lrec-conf.org">www.lrec-conf.org</a> <br>
***************************************************<br>
</b></div>
</body>
</html>