[Corpora-List] Call for Evaluation Resources: MWE 2008 Workshop at LREC 2008

Stefan Evert stefan.evert at uos.de
Fri Dec 21 13:40:00 UTC 2007


-- apologies for multiple postings --

################################################################

  CALL FOR EVALUATION RESOURCES

 >> LREC2008 - Towards a Shared Task for Multiword Expressions (MWE 2008) <<

  endorsed by the ACL Special Interest Group on the Lexicon (SIGLEX)


Date: Sunday, 1 June 2008
Location: Marrakech, Morocco

Workshop web page: http://multiword.sf.net/mwe2008/

#################################################################

In recent years, considerable progress has been made in our understanding of
multiword expressions (MWE), the development of algorithms for their automatic
extraction from corpora, and the automatic identification of additional
properties such as morphosyntactic preferences or the interpretation of
semi-compositional expressions.

It is difficult to compare results of the many published studies on MWEs and
obtain a broader perspective, though, because algorithms and implemented
systems have been evaluated on vastly different gold standards and corpora, in
different languages, for different subtypes of MWEs, etc. In order to make the
next big step forward, the field of MWE research needs a shared task in which
different approaches are applied to the same data sets, allowing completely
new insights to be gained. Since there is as yet no clear and universally
accepted definition of multiword expressions, the first instalment of this
shared task will be of a more exploratory nature than the competitions that
have been carried out in other areas of computational linguistics.

The MWE 2008 workshop is primarily intended as a forum for collecting, sharing
and exploiting MWE evaluation resources.  We solicit contributions of such
resources from the MWE community, in particular:

  (1) manually annotated data sets (MWE candidates marked as true and false
      positives, or as different subtypes of MWEs);

  (2) data sets of MWEs annotated with additional properties; and

  (3) lists of known MWEs, e.g. from machine-readable dictionaries.

In addition, candidate data obtained from corpora with sophisticated
proprietary NLP tools may be of interest, helping researchers to apply their
statistical MWE identification techniques to a broad range of languages.

The contributed resources will be made available freely for research purposes
on multiword.sf.net, and should be accompanied by documentation (e.g.
annotation guidelines) on the SourceForge project wiki. Contributors will be
invited to submit a short paper (4 pages) describing their resource and
summarising previous research carried out on these data.

After collection of the resources, teams participating in the shared task can
evaluate their MWE extraction algorithms on multiple data sets and discuss
implications for their generalisability and further development. At the
workshop, the evaluation results of the different teams will be summarised and
compared. A call for papers and participation in the shared task is being
distributed separately.


SUBMISSION INFORMATION

If you are interested in contributing an MWE evaluation resource to our
initiative, please contact us by e-mail to make further arrangements.

If you have a SourceForge account, you will be able to upload the
resource yourself and document it on the project wiki.


IMPORTANT DATES

Resource submission deadline: February 1, 2008
Paper submission deadline: February 29, 2008
Notification of acceptance: March 28, 2008
Camera-ready papers due: April 4, 2008
Workshop date: June 1, 2008


WORKSHOP CHAIRS

Nicole Grégoire
University of Utrecht, The Netherlands

Stefan Evert
University of Osnabrueck, Germany

Brigitte Krenn
Austrian Research Institute for Artificial Intelligence (ÖFAI), Austria


CONTACT

For any inquiries regarding the workshop please contact Nicole Grégoire
(Nicole.Gregoire at let.uu.nl).