<table cellpadding=3 cellspacing=0 border=0 width=100% bgcolor=white><tr valign=top><td width=100%><font size=2 color=black>Dear Serge,<br> <br>Thank you for your answer and kind offer.<br>As for your suggestion of using Wacky, our problem is not so much obtaining "newswire" text from the web. We could in fact obtain that text from the publicly available 14 GB collection of the Portuguese web, WPT03 (please see http://poloxldb.linguateca.pt/index.php?l=WPT_03 ), using a procedure similar to the one you mentioned, since it is all indexed in a MySQL database. The problem is instead obtaining a newswire collection that is manually classified by topic/domain and comparable to the English one. :)<br>I was wondering whether such a collection might in fact be available, since Reuters is a global news agency and I am sure that they produce a huge number of newswire texts every day in several languages. <br>Best,<br> <br>LS<BR><BR><BR><BR><BR><BR>--- On Wed
11/16, Serge Sharoff < s.sharoff@leeds.ac.uk > wrote:<BR><br><BLOCKQUOTE style="PADDING-LEFT: 7px; MARGIN-LEFT: 7px; BORDER-LEFT: orange 2px solid"><B>From: </B>Serge Sharoff [mailto: s.sharoff@leeds.ac.uk]<BR><B>To: </B>parapraxe@excite.com<BR><B>Cc: </B>corpora@hd.uib.no<BR><B>Date: </B>Wed, 16 Nov 2005 10:57:39 +0000<BR><B>Subject: </B>Re: [Corpora-List] REUTER corpus online?<BR><BR>Luis,<BR><BR>we have an online interface to the Reuters corpus (indexed by<BR>CorpusWorkbench). It's available from:<BR>http://corpus.leeds.ac.uk/<BR><BR>Because of the agreement with Reuters the access is mostly limited to<BR>inhouse research. However, we can provide a password for<BR>research-related concordancing.<BR><BR>As for Portuguese, if you have a reasonable list of words frequent in<BR>Portuguese newswires and a tagger/lemmatiser, a corpus like this can be<BR>collected from the web. See the Wacky initiative:<BR>http://wacky.sslmit.unibo.it/<BR><BR>Best wishes,<BR>Serge<BR><BR>On
Tue, 2005-11-15 at 10:45 -0500, Luis Sarmento wrote:<BR>> Dear Corpora-List members,<BR>> <BR>> <BR>> <BR>> Does anyone know if there is any publicly available online version of<BR>> the Reuters corpus? In other words, is there any web concordance tool<BR>> (free) for the Reuters Corpus?<BR>> <BR>> Btw, I wonder if there are comparable versions of the Reuters corpus<BR>> available, namely in Portuguese, for bilingual studies. Is anyone<BR>> using a "comparable" version of Reuters in Portuguese?<BR>> <BR>> Thanks to all,<BR>> <BR>> <BR>> <BR>> Luis Sarmento<BR>> <BR>> <BR>> <BR>> <BR>> <BR><BR>-- <BR>Dr. Serge Sharoff<BR>Centre for Translation Studies<BR>School of Modern Languages and Cultures<BR>University of Leeds<BR>Leeds, LS2 9JT<BR><BR>tel: +44(0)113 343 7287<BR>fax: +44(0)113 343 3287<BR><BR></BLOCKQUOTE></font></td></tr></table>