<br> **** CALL FOR PARTICIPATION ****<br><br> Word Sense Induction & Disambiguation within an End-User Application<br> SemEval 2013 - Task #11<br>
<br> <a href="http://www.cs.york.ac.uk/semeval-2013/task11/">http://www.cs.york.ac.uk/semeval-2013/task11/</a><br><br><br>The aim of this task is to provide a framework for the objective evaluation and comparison of Word Sense Disambiguation and Induction algorithms in an end-user application, namely Web Search Result Clustering.<br>
<br><br>INTRODUCTION<br>-------------<br>The proposed application is Web Search Result Clustering, the task of grouping the snippets returned by a search engine for an input query into clusters. Results in a given cluster are assumed to be semantically related to each other, and each cluster is expected to represent a specific meaning of the input query.<br>
<br>A Word Sense Induction (WSI) system will be asked to identify the meanings of the input query and cluster the snippets into semantically related groups according to those meanings. A Word Sense Disambiguation (WSD) system, by contrast, will be asked to sense-tag the same snippets with the appropriate senses of the input query; this, again, implicitly results in a clustering of the snippets (i.e., one cluster per sense).<br>
WSD and WSI systems will then be evaluated in an end-user application, i.e., according to their ability to diversify the search results for the input query. This evaluation scheme, previously proposed for WSI by Navigli and Crisafulli (2010) and Di Marco and Navigli (2013), is extended here to cover both WSD and WSI systems and is aimed at overcoming the limitations of in vitro evaluations. Specifically, the quality of the output clusters will be assessed in terms of their ability to diversify the snippets across the query meanings.<br>
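<br>As an informal illustration of the two ideas above, the following sketch shows how per-snippet sense tags implicitly define one cluster per sense, and one simple way to quantify how well a ranking covers the query meanings. The function names, the toy data and the coverage measure are assumptions made for this example only; they are not the official evaluation tool or measures of the task.<br>

```python
from collections import defaultdict


def clusters_from_sense_tags(sense_tags):
    """Group snippet ids by assigned sense, giving one cluster per sense."""
    clusters = defaultdict(list)
    for snippet_id, sense in sense_tags.items():
        clusters[sense].append(snippet_id)
    return dict(clusters)


def sense_coverage_at_k(ranked_snippets, gold_senses, k):
    """Fraction of distinct gold senses found among the first k snippets.

    Illustrative measure only (an assumption of this sketch).
    """
    all_senses = set(gold_senses.values())
    seen = {gold_senses[s] for s in ranked_snippets[:k]}
    return len(seen) / len(all_senses)


# Hypothetical toy annotation: snippet id -> sense label for one query.
sense_tags = {1: "Jaguar_Cars", 2: "Jaguar", 3: "Jaguar_Cars", 4: "Jaguar"}

clusters = clusters_from_sense_tags(sense_tags)
undiversified = sense_coverage_at_k([1, 3, 2, 4], sense_tags, k=2)
diversified = sense_coverage_at_k([1, 2, 3, 4], sense_tags, k=2)
```

In this toy case, a ranking that shows both snippets of the same sense first covers only half of the meanings in its top two results, whereas a ranking that alternates between clusters covers both.<br>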
<br>No training data will be provided. <br><br><br>DATASET CREATION<br>-------------<br>We will release new test data for this task. The test data will be created by:<br><br>- Manually selecting ambiguous queries of different lengths;<br>
- Querying Google;<br>- Retrieving the top 64 results for each query;<br>- Associating each resulting snippet with the most appropriate Wikipedia sense (i.e., page) for that query. The annotations will be obtained through crowdsourcing, with further checks by the authors.<br>
<br><br>NEWS<br>-------------<br>*** An evaluation tool is now available from the task website! ***<br><br><br>IMPORTANT DATES<br>-------------<br>February 15, 2013 - Registration Deadline [for Task Participants]<br>March 1, 2013 onwards - Start of evaluation period [Task Dependent]<br>
March 15, 2013 - End of evaluation period<br>April 9, 2013 - Paper submission deadline [TBC]<br>April 23, 2013 - Reviews due [TBC]<br>May 4, 2013 - Camera-ready due [TBC]<br><br><br>MORE INFORMATION<br>-------------<br>The SemEval-2013 Task #11 website, for signup and details, is:<br>
<br> <a href="http://www.cs.york.ac.uk/semeval-2013/task11/">http://www.cs.york.ac.uk/semeval-2013/task11/</a><br> <br>If you are interested in the task, please join our mailing list for updates:<br><br> <a href="http://groups.google.com/group/semeval-2013-wsi-in-application">http://groups.google.com/group/semeval-2013-wsi-in-application</a><br>
<br><br>ORGANIZERS<br>-------------<br>Roberto Navigli (<a href="mailto:lastname@di.uniroma1.it">lastname@di.uniroma1.it</a>), Sapienza University of Rome, Italy<br>Daniele Vannella (<a href="mailto:lastname@di.uniroma1.it">lastname@di.uniroma1.it</a>), Sapienza University of Rome, Italy<br>
<br><br>REFERENCES<br>-------------<br>R. Navigli, G. Crisafulli. Inducing Word Senses to Improve Web Search Result Clustering. Proc. of EMNLP 2010, Massachusetts, USA, pp. 116-126, 2010.<br>A. Di Marco, R. Navigli. Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction. Computational Linguistics, 39(4), MIT Press, 2013.<br>
<br>