[Corpora-List] CLEANEVAL competition now open: 14 June-13 July
Adam Kilgarriff
adam at lexmasterclass.com
Thu Jun 14 14:38:42 UTC 2007
***************
CLEANEVAL
***************
http://cleaneval.sigwac.org.uk
*****Competition now open: 14 June - 13 July 2007 *****
CLEANEVAL is a shared task and competitive evaluation for cleaning arbitrary
web pages, with the goal of preparing web data for use as a corpus, for
linguistic and language technology research and development. We invite you
to participate and to encourage others to do so too.
PRIZES for the best student entrants.
Results will be presented and discussed at the WAC3 workshop,
Louvain-la-Neuve, Belgium, 15-16 Sept 2007: http://cental.fltr.ucl.ac.be/wac3/
The Cleaneval competition is now open. (NB: competition dates have been
extended since the last announcement.)
To get the data please email cleaneval at sigwac.org.uk with
- subject line: Cleaneval data request
- text (cut and paste from this email, and complete):
Name(s):
Affiliation:
Country you work in:
Contact email:
Short name for participating system:
Student: (yes/no)
Participating for:
Chinese: (yes/no)
English: (yes/no)
We shall then send you URL, username and password for downloading the data.
(Both English and Chinese are on the same website.)
The data is to be returned within 48 hours of our sending you that mail.
Please return it by sending email to cleaneval at sigwac.org.uk with
- subject line: Cleaneval answers
- text:
* system name (corresponding to the system name you gave us when
registering)
* URL for us to download your results data (if this presents difficulties
please email us)
* the following declaration (please use cut-and-paste)
"We confirm that we have used a fully automatic system to process the pages
and that we have not manually edited output and that we have not used the
evaluation set to improve the performance of the algorithm in any way."
These details and further discussion are available at the CLEANEVAL website,
http://cleaneval.sigwac.org.uk
CLEANEVAL is an activity of the ACL Special Interest Group on Web as Corpus,
ACL-SIGWAC http://sigwac.org.uk
The CLEANEVAL Organizers
Marco Baroni
Adam Kilgarriff
Serge Sharoff