[Corpora-List] Cross-document coreference/Entity Resolution: $50,000 Spock Challenge

Eric Atwell eric at comp.leeds.ac.uk
Thu Apr 19 07:50:09 UTC 2007


Thanks for telling us what is in the download file, so that we do not
have to download it ourselves: 97000 files (9Gb) of raw HTML, which
contestants first have to "clean" themselves before they can try any
fancy NLP stuff.

A group of European researchers from Trento and Leeds have launched
CLEANEVAL, another contest to build tidy tools for web-as-corpus
research; see http://cleaneval.sigwac.org.uk/ - This could be a useful
first step for anyone trying the Spock challenge; any Spock
contestants could also enter their tidy-tool in the CLEANEVAL contest!

Eric Atwell, Leeds University


-----Original Message-----
> From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
> Behalf Of Alexandre Rafalovitch
> Sent: Wednesday, April 18, 2007 10:07 PM
> To: CORPORA at uib.no
> Subject: Re: [Corpora-List] Cross-document coreference/Entity Resolution:
> $50,000 Spock Challenge
>
> The website is rather sparse on information at the moment, so I have
> downloaded their (rather large) corpora and had a look.
>
> If anyone is interested in the challenge, my overview might help you
> make a better and faster decision:
> http://blog.outerthoughts.com/2007/04/spock-announces-an-entity-resolution-competition/
>
> Hope it helps,
>   Alex.
>


