[Corpora-List] CLEANEVAL competition now open: 14 June-13 July

Adam Kilgarriff adam at lexmasterclass.com
Thu Jun 14 14:38:42 UTC 2007


                             ***************

                             CLEANEVAL

                             ***************

                 http://cleaneval.sigwac.org.uk

 

        ***** Competition now open: 14 June - 13 July 2007 *****

 

CLEANEVAL is a shared task and competitive evaluation for cleaning arbitrary
web pages, with the goal of preparing web data for use as a corpus, for
linguistic and language technology research and development. We invite you
to participate and to encourage others to do so too.

PRIZES for the best student entrants.

Results will be presented and discussed at the WAC3 workshop,
Louvain-la-Neuve, Belgium, 15-16 Sept 2007: http://cental.fltr.ucl.ac.be/wac3/

The Cleaneval competition is now open.  (NB: competition dates have been
extended since the last announcement.)

To get the data, please email cleaneval at sigwac.org.uk with

- subject line: Cleaneval data request
- text (cut and paste from this email, and complete):

    Name(s):
    Affiliation:
    Country you work in:
    Contact email:
    Short name for participating system:
    Student: (yes/no)
    Participating for:
      Chinese: (yes/no)
      English: (yes/no)

We shall then send you a URL, username and password for downloading the data.
(Both English and Chinese are on the same website.)
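
For anyone who would rather script the request than paste it by hand, the
following is a minimal sketch of how the message above might be composed in
Python. The function name build_data_request and its parameters are
hypothetical, and the address constant assumes that "cleaneval at
sigwac.org.uk" stands for the usual user@domain form. It only builds the
message; sending it is left to whatever mail setup you use.

```python
from email.message import EmailMessage

# Assumption: the obfuscated "cleaneval at sigwac.org.uk" in the
# announcement corresponds to this address.
ORGANIZER_ADDRESS = "cleaneval@sigwac.org.uk"


def build_data_request(names, affiliation, country, contact_email,
                       system_name, is_student, chinese, english):
    """Build (but do not send) a data request following the form above.

    Hypothetical helper for illustration; not provided by the organizers.
    """
    def yes_no(flag):
        return "yes" if flag else "no"

    # One line per field of the registration form in the announcement.
    body = "\n".join([
        f"Name(s): {names}",
        f"Affiliation: {affiliation}",
        f"Country you work in: {country}",
        f"Contact email: {contact_email}",
        f"Short name for participating system: {system_name}",
        f"Student: {yes_no(is_student)}",
        "Participating for",
        f"  Chinese: {yes_no(chinese)}",
        f"  English: {yes_no(english)}",
    ])

    msg = EmailMessage()
    msg["To"] = ORGANIZER_ADDRESS
    msg["From"] = contact_email
    msg["Subject"] = "Cleaneval data request"  # subject line as requested
    msg.set_content(body)
    return msg
```

Printing the returned message shows the headers and completed form, which can
be checked before it is handed to a mail client or smtplib.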

 

The data is to be returned within 48 hours of our sending you that mail.
Please return it by sending an email to cleaneval at sigwac.org.uk with

- subject line: Cleaneval answers
- text:

   * system name (corresponding to the system name you gave us when
     registering)
   * URL for us to download your results data (if this presents difficulties,
     please email us)
   * the following declaration (please use cut-and-paste)

"We confirm that we have used a fully automatic system to process the pages
and that we have not manually edited output and that we have not used the
evaluation set to improve the performance of the algorithm in any way."

These details and further discussion are available at the CLEANEVAL website,
http://cleaneval.sigwac.org.uk

CLEANEVAL is an activity of the ACL Special Interest Group on Web as Corpus,
ACL-SIGWAC: http://sigwac.org.uk

The CLEANEVAL Organizers

  Marco Baroni
  Adam Kilgarriff
  Serge Sharoff

 
