[Corpora-List] Call for Evaluation Resources: MWE 2008 Workshop at LREC 2008
Stefan Evert
stefan.evert at uos.de
Fri Dec 21 13:40:00 UTC 2007
-- apologies for multiple postings --
################################################################
CALL FOR EVALUATION RESOURCES
>> LREC2008 - Towards a Shared Task for Multiword Expressions (MWE 2008) <<
endorsed by the ACL Special Interest Group on the Lexicon (SIGLEX)
Date: Sunday, 1 June 2008
Location: Marrakech, Morocco
Workshop web page: http://multiword.sf.net/mwe2008/
#################################################################
In recent years, considerable progress has been made in our
understanding of multiword expressions (MWE), the development of
algorithms for their automatic extraction from corpora, and the
automatic identification of additional properties such as
morphosyntactic preferences or the interpretation of semi-compositional
expressions.
It is difficult to compare results of the many published studies on
MWEs and obtain a broader perspective, though, because algorithms and
implemented systems have been evaluated on vastly different gold
standards and corpora, in different languages, for different subtypes
of MWEs, etc. In order to make the next big step forward, the field of
MWE research needs a shared task in which different approaches are
applied to the same data sets, allowing completely new insights to be
gained. Since there is as yet no clear and universally accepted
definition of multiword expressions, the first instalment of this
shared task will be of a more exploratory nature than the competitions
that have been carried out in other areas of computational linguistics.
The MWE 2008 workshop is primarily intended as a forum for collecting,
sharing and exploiting MWE evaluation resources. We solicit
contributions of such resources from the MWE community, in particular:
(1) manually annotated data sets (MWE candidates marked as true and
    false positives, or as different subtypes of MWEs);
(2) data sets of MWEs annotated with additional properties; and
(3) lists of known MWEs, e.g. from machine-readable dictionaries.
In addition, candidate data obtained from corpora with sophisticated
proprietary NLP tools may be of interest, helping researchers to apply
their statistical MWE identification techniques to a broad range of
languages.
The contributed resources will be made available freely for research
purposes on multiword.sf.net, and should be accompanied by
documentation (e.g. annotation guidelines) on the SourceForge project
wiki. Contributors will be invited to submit a short paper (4 pages)
describing their resource and summarising previous research carried out
on these data.
After collection of the resources, teams participating in the shared
task can evaluate their MWE extraction algorithms on multiple data sets
and discuss implications for their generalisability and further
development. At the workshop, the evaluation results of the different
teams will be summarised and compared. A call for papers and
participation in the shared task is being distributed separately.
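As a purely illustrative sketch (this call does not prescribe an
evaluation protocol), a ranked list of MWE candidates could be scored
against a data set of type (1), where each candidate is marked as a true
or false positive. The function names, the dictionary-based gold
standard and the toy candidates below are assumptions made for this
example, not part of any contributed resource.

```python
from typing import Dict, List


def precision_at_n(ranked: List[str], gold: Dict[str, bool], n: int) -> float:
    """Share of true positives among the n highest-ranked candidates."""
    top = ranked[:n]
    if not top:
        return 0.0
    # Candidates missing from the gold standard are counted as false positives.
    hits = sum(1 for cand in top if gold.get(cand, False))
    return hits / len(top)


def recall_at_n(ranked: List[str], gold: Dict[str, bool], n: int) -> float:
    """Share of all gold true positives found among the top n candidates."""
    total = sum(1 for is_mwe in gold.values() if is_mwe)
    if total == 0:
        return 0.0
    hits = sum(1 for cand in ranked[:n] if gold.get(cand, False))
    return hits / total


# Hypothetical gold standard: candidate -> annotated as a true MWE?
gold = {
    "kick the bucket": True,
    "red tape": True,
    "the house": False,
    "of the": False,
}

# Hypothetical output of an extraction algorithm, best candidate first.
ranked = ["red tape", "of the", "kick the bucket", "the house"]

for n in (1, 2, 3, 4):
    print(n, precision_at_n(ranked, gold, n), recall_at_n(ranked, gold, n))
```

Running the same scoring functions over several contributed data sets
would give one simple basis for comparing how well an approach
generalises.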
SUBMISSION INFORMATION
If you are interested in contributing an MWE evaluation resource to our
initiative, please contact us by e-mail to make further arrangements.
If you have a SourceForge account, you will be able to upload the
resource yourself and document it on the project wiki.
IMPORTANT DATES
Resource submission deadline: February 1, 2008
Paper submission deadline: February 29, 2008
Notification of acceptance: March 28, 2008
Camera-ready papers due: April 4, 2008
Workshop date: June 1, 2008
WORKSHOP CHAIRS
Nicole Grégoire
University of Utrecht, The Netherlands
Stefan Evert
University of Osnabrueck, Germany
Brigitte Krenn
Austrian Research Institute for Artificial Intelligence (ÖFAI), Austria
CONTACT
For any inquiries regarding the workshop please contact Nicole Grégoire
(Nicole.Gregoire at let.uu.nl).