Congratulations!<br>I would also like to mention that we have developed a Norwegian web corpus: <span class="styleoverskrift">NoWaC v 1.0</span>.<br><span class="stylebrodtekst">The computational procedure used to collect
the NoWaC corpus is largely based on the techniques used to build the corpora published by <a href="http://wacky.sslmit.unibo.it/" target="_blank">the WaCky initiative</a>. </span>The NoWaC corpus was developed by Emiliano Guevara.<br>
<br>Search the corpus: <a href="http://www.tekstlab.uio.no/nowac/">http://www.tekstlab.uio.no/nowac/</a><br>Read about it here: <a href="http://www.hf.uio.no/tekstlab/nowac.html">http://www.hf.uio.no/tekstlab/nowac.html</a><br>
<br>It will be properly announced later.<br><br>Best,<br>Janne Bondi Johannessen.<br><br><div class="gmail_quote">2010/4/8 Adriano Ferraresi <span dir="ltr">&lt;<a href="mailto:adriano@sslmit.unibo.it">adriano@sslmit.unibo.it</a>&gt;</span><br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;">Dear corpora members,<br><br>we are happy to announce that we've recently completed work on frWaC, a new corpus resource for French. <br>
<br>Like deWaC (for German), itWaC (for Italian) and ukWaC (for English), frWaC is a mega-corpus (~1.6 billion words) obtained by crawling and post-processing Web data. It is available both in a plain-text version and in an annotated version, which includes part-of-speech and lemma information. An earlier version of the corpus, and the procedure for its construction, are described here:<br>
<br>Ferraresi, A., S. Bernardini, G. Picci and M. Baroni (2010) “Web Corpora for Bilingual Lexicography: A Pilot Study of English/French Collocation Extraction and Translation”. In Xiao, R. (ed.) Using Corpora in Contrastive and Translation Studies. Newcastle: Cambridge Scholars Publishing.<br>
<br>For more details on the corpus and how to obtain it, please visit the WaCky project website:<br><br><a href="http://wacky.sslmit.unibo.it/" target="_blank">http://wacky.sslmit.unibo.it/</a> <br><br>Best,<br><br>The WaCkies </div>
<br></blockquote></div><br><br clear="all"><br>-- <br>Janne Bondi Johannessen<br>Professor, The Text Laboratory, ILN, <a href="http://www.hf.uio.no/tekstlab/">http://www.hf.uio.no/tekstlab/</a><br>President, NEALT, <a href="http://omilia.uio.no/nealt/">http://omilia.uio.no/nealt/</a><br>
University of Oslo<br>P.O.Box 1102 Blindern, N-0317 Oslo, Norway<br>Tel: +47 22 85 68 14, mob.: +47 928 966 34<br><br>