[Corpora-List] CFP - Semeval-2013 Task #11: Word Sense Induction & Disambiguation within an End-User Application

Roberto Navigli navigli at di.uniroma1.it
Thu Jan 17 10:54:34 UTC 2013


                              **** CALL FOR PARTICIPATION ****

        Word Sense Induction & Disambiguation within an End-User Application
                                      SemEval 2013 - Task #11

                      http://www.cs.york.ac.uk/semeval-2013/task11/


The aim of this task is to provide a framework for the objective evaluation
and comparison of Word Sense Disambiguation and Induction algorithms in an
end-user application, namely Web Search Result Clustering.


INTRODUCTION
-------------
The proposed application is Web Search Result Clustering, i.e., the task of
grouping the result snippets returned by a search engine for an input query
into clusters. The results in a given cluster are assumed to be
semantically related to each other, and each cluster is expected to
represent a specific meaning of the input query.

A Word Sense Induction (WSI) system will be asked to identify the meanings
of the input query and cluster the snippets into semantically related
groups according to those meanings. A Word Sense Disambiguation (WSD)
system, by contrast, will be asked to sense-tag the same snippets with the
appropriate senses of the input query; this, again, implicitly results in
a clustering of the snippets (i.e., one cluster per sense).
WSD and WSI systems will then be evaluated in an end-user application,
i.e., according to their ability to diversify the search results for the
input query. This evaluation scheme, previously proposed for WSI by Navigli
and Crisafulli (2010) and Di Marco and Navigli (2013), is extended here to
cover both WSD and WSI systems and is aimed at overcoming the limitations
of in vitro evaluations. Specifically, the quality of the output clusters
will be assessed in terms of their ability to diversify the snippets across
the query meanings.
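
As an illustration only (this is not the official evaluation tool), the
Python sketch below shows how WSD sense tags implicitly yield one cluster
per sense, and how the diversification of a result list could be checked.
The snippet ids, the sense labels for the query "jaguar", the round-robin
flattening of clusters and the sense-coverage measure are all assumptions
made for this example; the measures actually used are described on the
task website.

```python
from collections import defaultdict
from itertools import zip_longest


def clusters_from_sense_tags(sense_tags):
    """Group snippet ids by their assigned sense (one cluster per sense)."""
    clusters = defaultdict(list)
    for snippet_id, sense in sense_tags.items():
        clusters[sense].append(snippet_id)
    return dict(clusters)


def round_robin(clusters):
    """Flatten clusters into a ranked list by taking one snippet per
    cluster in turn (a hypothetical flattening strategy)."""
    ranked = []
    for tier in zip_longest(*clusters.values()):
        ranked.extend(s for s in tier if s is not None)
    return ranked


def senses_covered_at_k(ranked_snippets, gold_senses, k):
    """Fraction of the gold senses that appear among the first k results."""
    all_senses = set(gold_senses.values())
    seen = {gold_senses[s] for s in ranked_snippets[:k]}
    return len(seen) / len(all_senses)


# Hypothetical system output: snippet id -> sense tag for the query "jaguar".
sense_tags = {
    1: "Jaguar",
    2: "Jaguar Cars",
    3: "Jaguar",
    4: "Jaguar Cars",
    5: "Atari Jaguar",
}
# For simplicity the gold annotation is assumed to match the system output.
gold = dict(sense_tags)

clusters = clusters_from_sense_tags(sense_tags)
original_ranking = [1, 2, 3, 4, 5]
diversified_ranking = round_robin(clusters)

print(clusters)
print(senses_covered_at_k(original_ranking, gold, 3))
print(senses_covered_at_k(diversified_ranking, gold, 3))
```

In this toy example the first three results of the original ranking cover
two of the three senses, whereas the first three results of the ranking
derived from the clusters cover all three.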

No training data will be provided.


DATASET CREATION
-------------
We will release new test data for this task. The test data will be created
by:

- Manually selecting ambiguous queries of different lengths;
- Querying Google;
- Retrieving the top 64 results for each query;
- Associating each resulting snippet with the most appropriate Wikipedia
sense (i.e., page) for that query. The annotations will be obtained by
crowdsourcing, followed by further checks by the authors.


NEWS
-------------
*** An evaluation tool is now available from the task website! ***


IMPORTANT DATES
-------------
February 15, 2013 - Registration deadline [for Task Participants]
March 1, 2013 onwards - Start of evaluation period [Task Dependent]
March 15, 2013 - End of evaluation period
April 9, 2013 - Paper submission deadline [TBC]
April 23, 2013 - Reviews due [TBC]
May 4, 2013 - Camera-ready due [TBC]


MORE INFORMATION
-------------
The SemEval-2013 Task #11 website, for signup and details, is:

   http://www.cs.york.ac.uk/semeval-2013/task11/

If you are interested in the task, please join our mailing list for updates:

   http://groups.google.com/group/semeval-2013-wsi-in-application


ORGANIZERS
-------------
Roberto Navigli (lastname at di.uniroma1.it), Sapienza University of Rome,
Italy
Daniele Vannella (lastname at di.uniroma1.it), Sapienza University of Rome,
Italy


REFERENCES
-------------
R. Navigli, G. Crisafulli. Inducing Word Senses to Improve Web Search
Result Clustering. Proc. of EMNLP 2010, Massachusetts, USA, pp. 116-126,
2010.
A. Di Marco, R. Navigli. Clustering and Diversifying Web Search Results
with Graph-Based Word Sense Induction. Computational Linguistics, 39(4),
MIT Press, 2013.