<html><body><div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:10pt"><div><span><br></span></div><div class="yahoo_quoted" style="display: block; "> *<span style="font-family: 'times new roman', 'new york', times, serif; font-size: 16px; ">Apologies for cross-posting*</span></div><div><span style="font-family: 'times new roman', 'new york', times, serif; font-size: 16px;"><br></span><br><div style="font-family: HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; "><div style="font-family: HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; "><div dir="ltr" style="font-size: 12pt; "><span style="font-size: small;"> </span></div> <div class="y_msg_container"><span style="font-size: 13px; ">=======================================================================<br>
Entity Extraction and Linking Challenge<br> at the 4th Making Sense of Microposts Workshop<br> (#Microposts2014) @ WWW 2014<br> <a href="http://www.scc.lancs.ac.uk/microposts2014/challenge/index.html" target="_blank">http://www.scc.lancs.ac.uk/microposts2014/challenge/index.html</a><br> 7 April 2014, Seoul, Republic of Korea<br>=======================================================================<br><br>Microposts are a highly popular medium for sharing facts, opinions or <br>emotions. They are an invaluable wealth of data, ready to be mined for <br>training predictive models. This year the #Microposts2014 Workshop <br>will host an "Entity Extraction and Linking Challenge".<br>The overall task of the challenge is to automatically extract entities <br>from English microposts, and
link them to the corresponding English <br>DBpedia v3.9 resources (if the linkage exists). In the linking stage, we aim <br>to disambiguate expressions that are formed by discrete (and typically short) sequences <br>of words.<br>Existing entity linking tools are intended for use over news corpora and <br>similar corpora of relatively long documents. We organise <br>this challenge to foster research into novel, more accurate solutions <br>for automatic entity linking in (much shorter) micropost data.<br>We will ask the participants to automatically extract entities (e.g., <br>Obama, London, Rakuten) belonging to all entity types (e.g., Person, <br>Location, Organisation) from a collection of microposts. Participants <br>will have to automatically provide context-relevant DBpedia resources <br>for each entity in a micropost.<br><br>DATASET<br>-------<br>The dataset comprises 3.5K tweets extracted from a much larger
<br>collection of over 18 million tweets. This collection, provided by the <br>Redites project (<a href="http://demeter.inf.ed.ac.uk/redites/" target="_blank">http://demeter.inf.ed.ac.uk/redites/</a>), covers <br>event-annotated tweets collected over the period from 15th July 2011 to <br>15th August 2011 (31 days). It extends over multiple noteworthy events <br>including the death of Amy Winehouse, the London Riots and the Oslo <br>bombing. Since the task of this challenge is to automatically extract <br>and link entities, we have built our dataset considering both event and <br>non-event tweets. While event tweets are more likely to contain <br>entities, non-event tweets enable us to evaluate the performance of the <br>system in avoiding false positives in the entity extraction phase.<br><br>The dataset has been split into training (70%) and test (30%) sets. <br>Following the Twitter TOS, we will only provide tweet IDs and annotations <br>for the
training set, and tweet IDs for the test set. We will also <br>provide a common framework to mine these datasets from Twitter.<br><br>The training set will be released as a TSV file in which each line consists of:<br>- tweet_id<br>- entity_mention_1<br>- entity_uri_1<br>…<br>- entity_mention_n<br>- entity_uri_n<br>Fields are separated by TABs. Entity mentions and URIs are listed <br>in the order in which they appear in the tweet.<br><br>We will announce the release of the datasets on the workshop <br>mailing list in good time. Please subscribe to <br><a href="https://groups.google.com/d/forum/microposts2014." target="_blank">https://groups.google.com/d/forum/microposts2014. </a>More information about <br>dates is available on the Challenge website.<br><br>EVALUATION<br>----------<br>The evaluation consists of two separate stages:<br>1.- Paper peer review: A community of domain experts will judge <br>the quality and applicability of the approaches
taken, to provide useful <br>insights on your research;<br>2.- Precision and Recall: F1 (F-measure with beta = 1) will be computed <br>on a gold standard manually created from the test set. The automatically <br>extracted entities and links will both be matched against this ground truth.<br><br>Submissions will be ranked solely according to the F1 of each participant's best <br>submission.<br><br>SUBMISSIONS<br>-----------<br>Submissions should be provided as a zip file using your system name as <br>the file name (e.g. 'awesome.zip'), containing:<br><br>1. a TSV file with your system name (e.g. 'awesome.tsv'). We accept up <br>to 3 different submissions and will consider *only* the best. If you <br>submit more than one, you must clearly specify in your paper the modifications applied <br>to each labelled submission. In this case, the zip file should contain <br>up to 3 TSV files, each named with the tool/system name with "_n" appended <br>(e.g. awesome_1.tsv,
awesome_2.tsv, awesome_3.tsv).<br>In order to evaluate your submissions we require you to submit a TSV <br>file following the format in which the training set is provided.<br><br>2. a paper of 6 pages describing your approach and how you tuned/tested <br>it using the training split. All submissions must be in English. <br>Submissions must be in PDF, formatted in the style of the Springer <br>Publications format for Lecture Notes in Computer Science (LNCS) <br>[<a href="http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0" target="_blank">http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0</a>]. For <br>details on the LNCS style, see Springer’s Author Instructions. <br>Submissions are not anonymous. Please send us your submission before the <br>deadline through Easychair <br>[<a href="https://www.easychair.org/conferences/?conf=microposts2014" target="_blank">https://www.easychair.org/conferences/?conf=microposts2014</a>]. All
<br>accepted submissions will be invited for short presentations during the <br>#Microposts2014 workshop and will be published independently from the <br>workshop proceedings on the challenge page and on CEUR <br>[<a href="http://ceur-ws.org/" target="_blank">http://ceur-ws.org/</a>] (note that a minimum number of papers should be <br>submitted in order to be able to publish them on CEUR).<br><br>IMPORTANT DATES<br>---------------<br>Intent to participate: 13 Jan 2014 (soft)<br>Release of training set: 14 Jan 2014<br>Release of test set: 17 Feb 2014<br>Challenge Submission deadline: 21 Feb 2014 (hard)<br>Challenge Notification: 14 Mar 2014 (hard)<br>Challenge camera-ready deadline: 24 Mar 2014 (hard)<br><br>Workshop program issued: 15 Mar 2014<br>Challenge proceedings to be published via CEUR<br>Workshop - 07 Apr 2014 (Registration open to all)<br>(All deadlines 23:59 Hawaii Time)<br><br>PRIZE<br>-----<br>to be
announced<br><br>CONTACT<br>-------<br>E-mail: <a ymailto="mailto:microposts2014@easychair.org" href="mailto:microposts2014@easychair.org">microposts2014@easychair.org</a><br>Facebook Group: <a href="http://www.facebook.com/#!/home.php?sk=group_180472611974910" target="_blank">http://www.facebook.com/#!/home.php?sk=group_180472611974910</a><br>Facebook Public Event page: <a href="http://www.facebook.com/events/116134955169543" target="_blank">http://www.facebook.com/events/116134955169543</a><br>Google group : <a href="https://groups.google.com/forum/#!forum/microposts2014" target="_blank">https://groups.google.com/forum/#!forum/microposts2014</a><br>Twitter hashtag: #microposts2014challenge<br>Twitter account: @Microposts2014<br>W3C Microposts Community Group: <a href="http://www.w3.org/community/microposts" target="_blank">http://www.w3.org/community/microposts</a><br><br>Challenge Organizers:<br>------------------------------<br>Challenge
Chair:<br>A. Elizabeth Cano, Aston University, UK<br>Giuseppe Rizzo, Università di Torino, Italy<br><br>Dataset Chair:<br>Andrea Varga, The University of Sheffield, UK<br><br>Challenge Committee:<br>---------------------<br>Ebrahim Bagheri, Ryerson University, Canada<br>Pierpaolo Basile, Dipartimento di Informatica - University of Bari, Italy<br>Uldis Bojars, SIOC Project<br>Óscar Corcho, Universidad Politécnica de Madrid, Spain<br>Leon Derczynski, The University of Sheffield, UK<br>Guillaume Erétéo, Orange Labs<br>Miriam Fernandez, Knowledge Media Institute, The Open University, UK<br>Andrés García-Silva, Ontology Engineering Group, Facultad de <br>Informática, Universidad Politécnica de Madrid, Spain<br>Anna Lisa Gentile, The University of Sheffield, UK<br>Robert Jäschke, L3S Research Center, Germany<br>Diana Maynard, The University of Sheffield, UK<br>José M. Morales-Del-Castillo, El Colegio de México,
Mexico<br>Georgios Paltoglou, University of Wolverhampton, UK<br>Bernardo Pereira Nunes, PUC-Rio, Brazil<br>Daniel Preoţiuc-Pietro, The University of Sheffield, UK<br>Raphaël Troncy, EURECOM, France<br>Mischa Tuffield, PeerIndex<br>Victoria Uren, Aston University, UK<br><br><br><br></span><br></div> </div> </div> </div> </div></body></html>