Call: MMIES2: Multi-source, Multilingual Information Extraction and Summarization Workshop

Thierry Hamon thierry.hamon at LIPN.UNIV-PARIS13.FR
Tue Mar 18 08:49:59 UTC 2008

Date: Mon, 17 Mar 2008 10:37:31 +0100
From: Thierry Poibeau <Thierry.Poibeau at>
Message-Id: <ba9feba488e2cd9d88dfedfce4b5b898 at>

MMIES2: Multi-source, Multilingual Information Extraction and  
Summarization Workshop

Manchester, 23 August, 2008

Held in conjunction with COLING-2008,
the 22nd International Conference on Computational Linguistics,
18-22 August 2008.


The objective of the 2nd MMIES Workshop: Multi-source, Multilingual
Information Extraction and Summarization is to bring together
researchers and practitioners in the areas of extraction,
summarization, and other information access technologies, to discuss
recent approaches to multi-source and multi-lingual challenges.
Approaches to handling the idiosyncratic nature of the new Web2.0
media are especially welcome, including: mixed input, new jargon,
ungrammatical and mixed-language input, and emotional discourse.

Workshop Web Site:


Workshop Organizers:

* Sivaji Bandyopadhyay (Jadavpur University, India)
* Thierry Poibeau (CNRS / Universite Paris 13, France)
* Horacio Saggion (University of Sheffield, UK)
* Roman Yangarber (University of Helsinki, Finland)

Call for Papers

Information extraction (IE) and text summarization (TS) are key
technologies aiming at extracting from texts information that is
relevant to a user's interest, and presenting it to the user in
concise form. The on-going information explosion makes IE and TS
particularly critical for successful functioning within the
information society.  These technologies, however, face new challenges
with the adoption of the Web 2.0 paradigm (e.g. blogs, wikis) because
of their inherent multi-source nature.  These technologies must no
longer only deal with isolated texts or single narratives, but with
large-scale repositories or sources -- possibly in several languages
-- containing a multiplicity of views, opinions, or commentaries on
particular topics, entities or events.  There is thus a need to adapt
and/or develop new techniques to deal with these new phenomena.

Recognising similar information across different sources and/or in
different languages is of paramount importance in this multi-source,
multi-lingual context.  In information extraction, merging information
from multiple sources can lead to increased accuracy as compared with
extraction from a single source. In text summarization, similar facts
found across sources can inform sentence scoring algorithms. In
question answering, the distribution of answers in similar contexts
can inform answer ranking components.

Often, it is not the similarity of information that matters, but its
complementary nature. In a multi-lingual context, information
extraction and text summarization can provide solutions for
cross-lingual access: key pieces of information can be extracted from
different texts in one or many languages, merged, and then conveyed in
many natural languages in concise form. Applications need to be able
to cope with the idiosyncratic nature of the new Web 2.0 media: mixed
input, new jargon, ungrammatical and mixed-language input, emotional
discourse, etc.  In this context, synthesizing or inferring opinions
from multiple sources is a new and exciting challenge for NLP.  On
another level, profiling of individuals who engage in the new social
Web, and identifying whether a particular opinion is
appropriate/relevant in a given context are important topics to be
addressed.

It is therefore important that the research community address the
following issues:

- What methods are appropriate to detect
similar/complementary/contradictory information? Are hand-crafted
rules and knowledge-rich approaches convenient?

- What methods are available to tackle cross-document and
cross-lingual entity and event coreference?

- What machine learning approaches are most appropriate for this task
-- supervised/unsupervised/semi-supervised?  What types of corpora are
required for training and testing?

- What techniques are appropriate to synthesize condensed synopses of
the extracted information?  What generation techniques are useful
here?  What kind of techniques can be used to cross domains and
languages?

- What techniques can improve opinion mining and sentiment analysis
through multi-document analysis?  How do information extraction and
opinion mining connect?

- What tools exist for supporting multi-lingual/multi-source access to
information?  What solutions exist beyond full document translation to
produce cross-lingual summaries?

Important Dates:

* Paper submission deadline: ***  5 May ***
* Notification of acceptance of Papers: 6 June
* Camera-ready copy of papers due: 1 July
* Workshop: *** 23 August ***

Paper Submission:

Papers should describe original work and should indicate the state of
completion of the reported results. Wherever appropriate, concrete
evaluation results should be included. Submissions will be judged on
correctness, originality, technical strength, significance and
relevance to the conference, and interest to the attendees.

Submissions should follow the two-column format of ACL proceedings and
should not exceed eight (8) pages, including references. We strongly
recommend the use of the Coling 2008 LaTeX style files or Microsoft
Word Style files tailored for this year's conference.

Submission will be electronic (pdf format only), using the START
submission webpage dedicated to the workshop.

Programme Committee:

Javier Artiles (UNED, Spain)
Kalina Bontcheva (U. Sheffield, UK)
Nathalie Colineau (CSIRO, Australia)
Nigel Collier (NII, Japan)
Hercules Dalianis (KTH/Stockholm University, Sweden)
Thierry Declerck (DFKI, Germany)
Michel Généreux (LIPN-CNRS, France)
Julio Gonzalo (UNED, Spain)
Brigitte Grau (LIMSI-CNRS, France)
Ralph Grishman (New York University, USA)
Kentaro Inui (NAIST, Japan)
Min-Yen Kan (National University of Singapore, Singapore)
Guy Lapalme (U. Montreal, Canada)
Diana Maynard (U. Sheffield, UK)
Jean-Luc Minel (Modyco-CNRS, France)
Constantin Orasan (University of Wolverhampton, UK)
Cecile Paris (CSIRO, Australia)
Maria Teresa Pazienza (U. of Roma Tor Vergata, Italy)
Bruno Pouliquen (European Commission - Joint Research Centre, Italy)
Satoshi Sekine (NYU, USA)
Patrick Saint-Dizier (IRIT-CNRS, France)
Agnes Sandor (Xerox XRCE, France)
Ralf Steinberger (European Commission - Joint Research Centre, Italy)
Stan Szpakowicz (University of Ottawa, Canada)
Lucy Vanderwende (Microsoft Research, USA)
Jose Luis Vicedo (Universidad de Alicante, Spain)

Additional Information:

Information about the previous MMIES Workshop, at RANLP-2007 in
Borovets, Bulgaria can be found at
