Fw: [Corpora-List] Resend: CLEANEVAL Web-as-Corpus exercise

Senta Setinc senta.setinc at triera.net
Tue Apr 3 22:14:09 UTC 2007


----- Original Message ----- 
From: "Senta Setinc" <senta.setinc at triera.net>
To: "Adam Kilgarriff" <adam at lexmasterclass.com>
Sent: Tuesday, April 03, 2007 11:49 PM
Subject: Re: [Corpora-List] Resend: CLEANEVAL Web-as-Corpus exercise


> Forgive me for sending you this information, which is much, much less
> important than Adam's: for those who have experienced problems with
> overloaded hard disks, there is a wonderful new (not new to all of you, I
> am sure) cleaning tool - a free one, named CCleaner (Crap Cleaner). You
> can download it from here: http://www.ccleaner.com
> In only a matter of seconds I gained about 1.3 gigabytes of free space.
> Amazing, really.
>
> All the best to all, Senta
> ----- Original Message ----- 
> From: "Adam Kilgarriff" <adam at lexmasterclass.com>
> To: <sigwac at sslmit.unibo.it>; <corpora at hd.uib.no>
> Sent: Tuesday, April 03, 2007 6:56 PM
> Subject: [Corpora-List] Resend: CLEANEVAL Web-as-Corpus exercise
>
>
> **Apologies for faulty links in last version**
>
> CLEANEVAL is a shared task and competitive evaluation for cleaning
> arbitrary web pages, with the goal of preparing web data for use as a
> corpus, for linguistic and language technology research and development.
> You are invited to participate, and to encourage others to do so too.
>
> Website: http://cleaneval.sigwac.org.uk
>
> Development dataset now available.
>
> *  Prizes! A prize of £250.00 (GBP) will be awarded for the best
>       student entrant for each task (Chinese and English).
> *  Timetable:
>   * March 2007: Development datasets released (English and Chinese)
>   * June 2007: Exercise: Evaluation dataset released and, two weeks
>                  later, participants to return cleaned pages
>   * end June 2007: Papers describing systems to be submitted
>   * Sept 15-16 2007: Workshop, part of WAC3, Louvain-la-Neuve, Belgium
>       http://cental.fltr.ucl.ac.be/wac3/
>
> *  Co-ordinators
>   *  Marco Baroni, Trento University, Italy
>   *  Tony Hartley, Leeds University, UK
>   *  Adam Kilgarriff, Lexical Computing Ltd., Leeds and Sussex Univs, UK
>   *  Serge Sharoff, Leeds University, UK
>
> CLEANEVAL is an activity of ACL-SIGWAC, the Association for Computational
> Linguistics (ACL) Special Interest Group on Web as Corpus.
>