[Corpora-List] SemEval-2010 Shared Task #9: Noun Compound Interpretation Using Paraphrasing Verbs and Prepositions - Call for Participation

Preslav Nakov (GMail) preslavn at gmail.com
Wed Feb 17 10:21:30 UTC 2010


Apologies for multiple postings.

*******************************************************

            Call for Participation

         SemEval-2010 Shared Task #9:
         Noun Compound Interpretation
   Using Paraphrasing Verbs and Prepositions

http://docs.google.com/View?docid=dfvxd49s_35hkprbcpt

          --- Training data available ---

*******************************************************

This shared task should be of interest to researchers working on
 * semantic relation extraction
 * information extraction
 * lexical semantics
 * noun compound interpretation


============
 Background
============

A noun compound is a sequence of nouns acting as a single noun, e.g.,
colon cancer, suppressor protein, colon cancer tumor suppressor protein.
Noun compounds are both highly frequent and highly productive in English,
which means that achieving robust noun compound interpretation is an
important goal for broad-coverage semantic processing. NLP systems cannot
just ignore compounds without discarding valuable semantic information;
at the same time, the only way to achieve broad coverage on compounds is
to interpret them compositionally, as it is impossible to list in a
lexicon all compounds that are likely to be encountered.

In this shared task, we explore the idea of interpreting the semantics of
noun compounds using paraphrasing verbs and prepositions. For example,
"nut bread" can be paraphrased using verbs like "contain" and "include",
prepositions like "with", and verbs+prepositions like "be made from".

Unlike abstract relations such as CAUSE, CONTAINER, SOURCE, TIME, and
LOCATION, which have traditionally been used for noun compound
interpretation, verbs and prepositions are directly usable as paraphrases,
and using multiple paraphrases simultaneously yields an appealing
fine-grained semantic representation.


==========
 The Task
==========

In a preliminary study, we asked 25-30 human subjects to paraphrase 250
noun-noun compounds using suitable paraphrasing verbs. For example, for
"nut bread" we have the following paraphrases (the number of subjects who
proposed each paraphrase is shown in parentheses):

contain(21); include(10); be made with(9); have(8); be made from(5);
use(3); be made using(3); feature(2); be filled with(2); taste like(2);
be made of(2); come from(2); consist of(2); hold(1); be composed of(1);
be blended with(1); be created out of(1); encapsulate(1); diffuse(1);
be created with(1); be flavored with(1); incorporate(1);
be created from(1); be prepared with(1); sink under(1); comprise(1);
eat up(1); be made out of(1); wreck(1); be baked using(1); cover over(1);
improve with(1); taste of(1); be baked with(1); rise above(1);
surround(1); be about(1)
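
As an illustration only, the list above can be turned into a
frequency-ordered ranking with a few lines of Python. This is a minimal
sketch: it assumes the "verb(count); verb(count)" layout used in this
message, which is not necessarily the format of the released data files,
and the names parse_paraphrases and rank_by_frequency are hypothetical.

```python
import re

# First few "nut bread" paraphrases, copied from the list in this message.
RAW = "contain(21); include(10); be made with(9); have(8); be made from(5); use(3)"


def parse_paraphrases(raw):
    """Parse 'verb(count); verb(count)' text into (verb, count) pairs.

    Assumption: items are separated by ';' and each ends in '(<integer>)'.
    """
    pairs = []
    for item in raw.split(";"):
        item = item.strip()
        if not item:
            continue
        match = re.fullmatch(r"(.+?)\((\d+)\)", item)
        if match is None:
            raise ValueError(f"unrecognized item: {item!r}")
        pairs.append((match.group(1).strip(), int(match.group(2))))
    return pairs


def rank_by_frequency(pairs):
    """Order paraphrases by how many subjects proposed them (ties: alphabetical)."""
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for verb, count in rank_by_frequency(parse_paraphrases(RAW)):
        print(f"{count:3d}  {verb}")
```

Breaking ties alphabetically is an arbitrary choice made here so that the
output is deterministic; nothing in the task description prescribes it.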

Based on this kind of data, we propose a ranking task. The participants
will be presented with a noun-noun compound and a set of corresponding
paraphrasing verbs and prepositions, and will be asked to provide a
ranking that is as close as possible to the ranking proposed by the
humans.
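
One way to picture "as close as possible" is a rank correlation between a
system's scores and the human frequencies. The sketch below uses Spearman's
rank correlation purely as an assumed example; this message does not state
which measure the organizers will use, and the system scores shown are
made up for illustration.

```python
from math import sqrt


def average_ranks(scores):
    """Rank scores in descending order (1 = best); ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(gold, predicted):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    gold and predicted map each paraphrase to a score; a paraphrase missing
    from predicted is treated as having score 0.0 (an assumption).
    """
    verbs = sorted(gold)
    gold_ranks = average_ranks([gold[v] for v in verbs])
    pred_ranks = average_ranks([predicted.get(v, 0.0) for v in verbs])
    n = len(verbs)
    mean_g = sum(gold_ranks) / n
    mean_p = sum(pred_ranks) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold_ranks, pred_ranks))
    var_g = sum((g - mean_g) ** 2 for g in gold_ranks)
    var_p = sum((p - mean_p) ** 2 for p in pred_ranks)
    if var_g == 0 or var_p == 0:
        return 0.0  # correlation is undefined when one ranking is constant
    return cov / sqrt(var_g * var_p)


# Human counts for "nut bread" (from this message) and hypothetical system scores.
gold = {"contain": 21, "include": 10, "be made with": 9, "have": 8}
system = {"contain": 0.9, "include": 0.4, "be made with": 0.7, "have": 0.1}

if __name__ == "__main__":
    print(round(spearman(gold, system), 3))
```

A value of 1 means the system orders the paraphrases exactly as the human
subjects did, and -1 means the order is fully reversed.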


==========
 Datasets
==========

* Trial Data: We have released as trial data the paraphrasing verbs for
250 noun compounds, each paraphrased by 25-30 human subjects.

* Test Data: The test data will consist of noun-noun compounds and a set
of paraphrasing verbs and prepositions associated with each of them. For
each compound, the participants will need to produce a ranking, which
will be compared to a gold-standard ranking for that compound. We will
collect paraphrases for over 300 noun-noun compounds, each of which will
be annotated by 100 human annotators.

License: All data are released under the Creative Commons Attribution 3.0
Unported license.


===============
 Time Schedule
===============

* Trial data released:              August 30, 2009
* Training data released:           February 17, 2010

* Test data release:                March 18, 2010
* Result submission deadline:       7 days after downloading the *test*
                                    data, but no later than April 2, 2010

* Organizers send the test results: April 10, 2010
* Submission of description papers: April 17, 2010
* Notification of acceptance:       May 6, 2010
* SemEval-2010 workshop (at ACL):   July 15-16, 2010


=================
 Task Organizers
=================

Cristina Butnariu     University College Dublin
Su Nam Kim            University of Melbourne
Preslav Nakov         National University of Singapore
Diarmuid Ó Séaghdha   University of Cambridge
Stan Szpakowicz       University of Ottawa
Tony Veale            University College Dublin


==============
 Useful Links
==============

Interested in participating in the shared task? Please join the following
Google group:
http://groups.google.com.sg/group/semeval-2010-noun-compound-interpretation-using-verbs?hl=en

Task #9 website: http://docs.google.com/View?docid=dfvxd49s_35hkprbcpt

SemEval 2010 website: http://semeval2.fbk.eu/semeval2.php





