[Corpora-List] Call for Participation: Multilingual Word Sense Disambiguation (SemEval 2013 - Task 12)

David Jurgens david.jurgens at gmail.com
Mon Feb 4 22:53:47 UTC 2013



Call For Participation


Multilingual Word Sense Disambiguation
SemEval 2013 - Task #12

http://www.cs.york.ac.uk/semeval-2013/task12/

The aim of this task is to evaluate Word Sense Disambiguation systems in an
all-words multilingual setting.


INTRODUCTION

Task 12 provides a traditional setup for evaluating Word Sense
Disambiguation (WSD) systems in an all-words, multilingual setting by
marking occurrences of potentially polysemous words in five different
languages (English, French, German, Italian, Spanish) with sense labels
provided by a multilingual sense inventory.  To enable multilinguality we
make use of the BabelNet sense inventory [1], a wide-coverage semantic
network built by merging WordNet with Wikipedia to provide an “encyclopedic
dictionary.”  BabelNet concepts are lexicalized in many languages using
Wikipedia’s inter-language links and the output of a state-of-the-art
machine translation system.   Task 12 will use a validated version of
BabelNet 1.1 (http://babelnet.org) in which the Wikipedia-WordNet mappings
of all senses of lemmas in the test data have been manually verified.

PARTICIPANT SENSE INVENTORY

Participants are free to work on the full BabelNet sense inventory or to
work on either of its inventory subsets, i.e. WordNet 3.0 or Wikipedia page
titles. They are also free to participate using a single language of their
choice or all five languages.

TASK DETAILS

Following the traditional WSD “all-words” experimental setting [2], systems
will be expected to link all occurrences of noun phrases within arbitrary
texts in different languages to the most suitable senses in the sense
inventory of their choice. For instance, given the sentence:


   1. The dramatic force of Miller's play derives in part from
   expressionistic techniques he used to portray Loman's psychological anguish
   and guilt-ridden fantasy life.


a disambiguation system should link “Miller” to any of (1) the BabelNet
synset for Arthur Miller
(http://lcl.uniroma1.it/babelnet/search.jsp?word=Arthur+Miller&lang=EN),
(2) the Wikipedia sense corresponding to the page
http://en.wikipedia.org/wiki/Arthur_Miller, or (3) Miller#n#3 (i.e. the
third WordNet sense for Miller), depending on the participant’s choice of
sense inventory.  Note that the BabelNet synset will contain, where
applicable, both the Wikipedia page and the WordNet synset in its
representation.
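
As a rough illustration of the three label types above, the following
Python sketch parses a WordNet-style label and a Wikipedia page URL. It is
only a sketch: the "lemma#pos#number" shape is inferred from the example
"Miller#n#3" and is not a statement of the official submission format, and
the names WordNetSenseLabel, parse_wordnet_label and
wikipedia_title_from_url are hypothetical helpers, not part of BabelNet or
any task tooling.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlparse


class WordNetSenseLabel(NamedTuple):
    """Hypothetical container for a label such as 'Miller#n#3'."""
    lemma: str
    pos: str
    sense_number: int


def parse_wordnet_label(label: str) -> WordNetSenseLabel:
    """Split a 'lemma#pos#number' label (format assumed from the example)."""
    parts = label.split("#")
    if len(parts) != 3 or not parts[0] or not parts[1]:
        raise ValueError(f"expected lemma#pos#number, got {label!r}")
    lemma, pos, number = parts
    sense_number = int(number)  # raises ValueError if not an integer
    if sense_number < 1:
        raise ValueError(f"sense number must be positive, got {number!r}")
    return WordNetSenseLabel(lemma, pos, sense_number)


def wikipedia_title_from_url(url: str) -> str:
    """Extract the page title from a '/wiki/<Title>' Wikipedia URL."""
    path = urlparse(url).path
    prefix = "/wiki/"
    if not path.startswith(prefix) or len(path) == len(prefix):
        raise ValueError(f"not a Wikipedia page URL: {url!r}")
    return unquote(path[len(prefix):])


if __name__ == "__main__":
    print(parse_wordnet_label("Miller#n#3"))
    print(wikipedia_title_from_url("http://en.wikipedia.org/wiki/Arthur_Miller"))
```

A system working with the Wikipedia or WordNet subset could normalize its
output labels this way before comparing them with its own sense inventory.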

Participants will be evaluated in groups based on their choice of sense
inventory and target language. All the information about the submitted
systems (such as training data, resources, etc. used by the system) will be
reported in the task paper.

DATASETS

No training data will be provided as a part of this task; however,
participants are allowed to use any freely available training data for
building their system.

By mid-February we will provide, for annotating the test set, a
gold-standard version of BabelNet 1.1 in which all synsets used in the test
data have been manually verified for correctness.


IMPORTANT DATES

February 15, 2013 - Registration Deadline
March 1, 2013 onwards - Start of evaluation period
March 15, 2013 - End of evaluation period
April 9, 2013 - Paper submission deadline [TBC]
April 23, 2013 - Reviews due [TBC]
May 4, 2013 - Camera-ready due [TBC]


MORE INFORMATION

The SemEval-2013 Task #12 website, for signup and details, is:

  http://www.cs.york.ac.uk/semeval-2013/task12/

If you are interested in the task, please join our mailing list for updates:

  http://groups.google.com/group/semeval13-multilingual-wsd/


ORGANIZERS
Roberto Navigli (lastname at di.uniroma1.it), Sapienza University of Rome,
Italy
David Jurgens (lastname at di.uniroma1.it), Sapienza University of Rome, Italy


REFERENCES
1. Roberto Navigli & Simone Paolo Ponzetto. BabelNet: The automatic
construction, evaluation and application of a wide-coverage multilingual
semantic network. Artificial Intelligence, 193, 2012, pp. 217-250.
2. Roberto Navigli. Word Sense Disambiguation: A survey. ACM Computing
Surveys, 41(2), ACM Press, 2009, pp. 1-69.