<html>
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<meta name=Generator content="Microsoft Word 10 (filtered)">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"\@SimSun";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
span.EmailStyle17
{font-family:Arial;
color:windowtext;}
@page Section1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
{page:Section1;}
/* List Definitions */
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
-->
</style>
</head>
<body lang=EN-GB link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=3 color=darkblue face=Arial><span lang=EN-US
style='font-size:12.0pt;font-family:Arial;color:darkblue'>CLEANEVAL is a shared
task and competitive evaluation on cleaning arbitrary web pages, with the goal
of preparing web data for use as a corpus in linguistic and language
technology research and development. You are invited to participate and to
encourage others to do so.<br>
<br>
</span></font><b><font color=red face=Arial><span lang=EN-US style='font-family:
Arial;color:red;font-weight:bold'><a
href="devset.html">Development
dataset now available</a></span></font></b><font color=darkblue face=Arial><span
lang=EN-US style='font-family:Arial;color:darkblue'>. </span></font></p>
<ul type=disc>
<li class=MsoNormal style='color:darkblue'><b><font size=3 color=red
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial;
color:red;font-weight:bold'>Prizes!</span></font></b><font face=Arial><span
lang=EN-US style='font-family:Arial'> A prize of £250.00 (GBP) will be
awarded to the best student entrant for each task (Chinese and English). </span></font></li>
<li class=MsoNormal style='color:darkblue'><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial'>Fuller
description <a
href="http://cleaneval.sigwac.org.uk/cleaneval-overview.html">http://cleaneval.sigwac.org.uk/cleaneval-overview.html</a>.
</span></font></li>
<li class=MsoNormal style='color:darkblue'><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial'>Timetable:
</span></font></li>
</ul>
<div class=MsoNormal align=center style='margin-left:36.0pt;text-align:center'><font
size=3 color=darkblue face=Arial><span lang=EN-US style='font-size:12.0pt;
font-family:Arial;color:darkblue'>
<hr size=2 width="100%" align=center>
</span></font></div>
<font size=3 color=darkblue face=Arial><span lang=EN-US style='font-size:12.0pt;
font-family:Arial;color:darkblue'>
<ul type=disc>
<ul type=circle>
<li class=MsoNormal style='color:darkblue'><b><i><font face=Arial><span
style='font-family:Arial;font-weight:bold;font-style:italic'>March 2007:</span></font></i></b><font
face=Arial><span style='font-family:Arial'> Development datasets released
(English and Chinese) </span></font></li>
<li class=MsoNormal style='color:darkblue'><b><i><font face=Arial><span
style='font-family:Arial;font-weight:bold;font-style:italic'>June 2007:</span></font></i></b><font
face=Arial><span style='font-family:Arial'> Exercise: Evaluation dataset
released and, two weeks later, participants to return cleaned pages </span></font></li>
<li class=MsoNormal style='color:darkblue'><b><i><font face=Arial><span
style='font-family:Arial;font-weight:bold;font-style:italic'>End of June
2007:</span></font></i></b><font face=Arial><span style='font-family:
Arial'> Papers describing systems to be submitted </span></font></li>
<li class=MsoNormal style='color:darkblue'><b><i><font
face=Arial><span style='font-family:Arial;font-weight:bold;font-style:
italic'>Sept 15-16 2007</span></font></i></b><b><i><font face=Arial><span
style='font-family:Arial;font-weight:bold;font-style:italic'>:</span></font></i></b><font
face=Arial><span style='font-family:Arial'> Workshop, part of WAC3, </span></font><font
face=Arial><span style='font-family:Arial'>Louvain</span></font><font
face=Arial><span style='font-family:Arial'>-la-Neuve, </span></font><font
face=Arial><span style='font-family:Arial'>Belgium</span></font><font
face=Arial><span style='font-family:Arial'> <b><span style='font-weight:
bold'><a href="http://cental.fltr.ucl.ac.be/wac3/">http://cental.fltr.ucl.ac.be/wac3/</a>
</span></b></span></font></li>
</ul>
</ul>
</span></font>
<div class=MsoNormal align=center style='margin-left:36.0pt;text-align:center'><b><font
size=3 color=darkblue face=Arial><span lang=EN-US style='font-size:12.0pt;
font-family:Arial;color:darkblue;font-weight:bold'>
<hr size=2 width="100%" align=center>
</span></font></b></div>
<ul type=disc>
<li class=MsoNormal style='color:darkblue'><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial'>Annotation
guidelines <a
href="http://cleaneval.sigwac.org.uk/annotation_guidelines.html">http://cleaneval.sigwac.org.uk/annotation_guidelines.html</a>.
</span></font></li>
<li class=MsoNormal style='color:darkblue'><b><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial;
font-weight:bold'>Co-ordinators</span></font></b><font face=Arial><span
lang=EN-US style='font-family:Arial'> </span></font></li>
<ul type=circle>
<li class=MsoNormal style='color:darkblue'><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial'><a
href="http://sslmit.unibo.it/~baroni/">Marco Baroni</a>, </span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>Trento University</span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>, </span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>Italy</span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'> </span></font></li>
<li class=MsoNormal style='color:darkblue'><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial'><a
href="http://www.leeds.ac.uk/cts/staff/tony_hartley.htm">Tony Hartley</a>,
</span></font><font face=Arial><span lang=EN-US style='font-family:Arial'>Leeds
University</span></font><font face=Arial><span lang=EN-US
style='font-family:Arial'>, </span></font><font face=Arial><span
lang=EN-US style='font-family:Arial'>UK</span></font><font face=Arial><span
lang=EN-US style='font-family:Arial'> </span></font></li>
<li class=MsoNormal style='color:darkblue'><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial'><a
href="http://www.kilgarriff.co.uk">Adam Kilgarriff</a>, Lexical Computing
Ltd., </span></font><font face=Arial><span lang=EN-US style='font-family:
Arial'>Leeds</span></font><font face=Arial><span lang=EN-US
style='font-family:Arial'> and </span></font><font face=Arial><span
lang=EN-US style='font-family:Arial'>Sussex Universities</span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>, </span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>UK</span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'> </span></font></li>
<li class=MsoNormal style='color:darkblue'><font size=3 color=darkblue
face=Arial><span lang=EN-US style='font-size:12.0pt;font-family:Arial'><a
href="http://www.comp.leeds.ac.uk/ssharoff/">Serge Sharoff</a>, </span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>Leeds University</span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>, </span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'>UK</span></font><font
face=Arial><span lang=EN-US style='font-family:Arial'> </span></font></li>
</ul>
</ul>
<p class=MsoNormal><font size=3 color=darkblue face=Arial><span lang=EN-US
style='font-size:12.0pt;font-family:Arial;color:darkblue'> </span></font></p>
<p class=MsoNormal><font size=3 color=darkblue face=Arial><span lang=EN-US
style='font-size:12.0pt;font-family:Arial;color:darkblue'>CLEANEVAL is an
activity of <a href="http://sigwac.org.uk">ACL-SIGWAC</a>, the <a
href="http://www.aclweb.org">Association for Computational Linguistics (ACL)</a>
Special Interest Group on Web as Corpus.</span></font></p>
</div>
</body>
</html>