<html>
<head>
<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
[Apologies for multiple postings]<br>
<br>
<b>1st Call for Papers</b><br>
LREC 2012 Workshop on: Language Resource Merging<br>
<br>
22 May 2012 – Afternoon Session<br>
<br>
<b>CONTEXT</b><br>
The availability of adequate language resources has been a
well-known bottleneck for most<br>
high-level language technology applications, e.g. Machine
Translation, parsing, and<br>
Information Extraction, for at least 15 years, and the impact of
the bottleneck is becoming all<br>
the more apparent with the availability of higher computational
power and massive storage,<br>
since modern language technologies are capable of using far more
resources than the<br>
community produces. The present landscape is characterized by the
existence of numerous<br>
scattered resources, many of which have differing levels of
coverage, types of information and<br>
granularity. Taken individually, existing resources do not have
sufficient coverage, quality or<br>
richness for robust large-scale applications, and yet they contain
valuable information<br>
(Monachini et al. 2004 and 2006; Soria et al. 2006; Molinero, Sagot
and Nicolas 2009;<br>
Necsulescu et al. 2011). Differing technology or application
requirements, ignorance of the<br>
existence of certain resources, and difficulties in accessing and
using them, have led to the<br>
proliferation of multiple, unconnected resources that, if merged,
could constitute a much<br>
richer repository of information augmenting either coverage or
granularity, or both, and<br>
consequently multiplying the number of potential language technology
applications. Merging,<br>
combining and/or compiling larger resources from existing ones thus
appears to be a<br>
promising direction to take.<br>
The re-use and merging of existing resources is not altogether
unknown. For example,<br>
WordNet (Fellbaum, 1998) has been successfully reused in a variety
of applications. But this is<br>
the exception rather than the rule; in fact, merging and enhancing
existing resources is<br>
uncommon, probably because it is by no means a trivial task given
the profound differences in<br>
formats, formalisms, metadata, and linguistic assumptions.<br>
The language resource landscape is on the brink of a large change,
however. With the<br>
proliferation of accessible metadata catalogues and resource
repositories (such as the new<br>
META-SHARE (<a class="moz-txt-link-freetext" href="http://www.meta-net.eu/meta-share">http://www.meta-net.eu/meta-share</a>) infrastructure), a
potentially large<br>
number of existing resources will be more easily located, accessed
and downloaded. Also, with<br>
the advent of distributed platforms for the automatic production of
language resources, such<br>
as PANACEA (<a class="moz-txt-link-freetext" href="http://www.panacea-lr.eu/">http://www.panacea-lr.eu/</a>), new language resources and
linguistic information<br>
capable of being integrated into those resources will be produced
more easily and at a lower<br>
cost. Thus, it is likely that researchers and application developers
will seek out resources<br>
already available before developing new, costly ones, and will
require methods for<br>
merging/combining various resources and adapting them to their
specific needs.<br>
Up to the present day, most resource merging has been done manually,
with only a small<br>
number of attempts reported in the literature towards
(semi-)automatic merging of resources<br>
(Crouch &amp; King 2005; Pustejovsky et al. 2005; Molinero, Sagot
and Nicolas 2009; Necsulescu et<br>
al. 2011). In order to take a further step towards the scenario
depicted above, in which<br>
resource merging and enhancing is a reliable and accessible first
step for researchers and<br>
application developers, experience and best practices must be shared
and discussed, as this<br>
will help the whole community avoid any waste of time and resources.<br>
<br>
<b>AIMS OF THE WORKSHOP</b><br>
This half-day workshop is meant to be part of a series of meetings
constituting an ongoing<br>
forum for sharing and evaluating the results of different methods
and systems for the<br>
automatic production of language resources (the first one was the
LREC 2010 Workshop on<br>
Methods for the Automatic Production of Language Resources and their
Evaluation Methods).<br>
The main focus of this workshop is on (semi-)automatic means of
merging language resources,<br>
such as lexicons, corpora and grammars. Merging makes it possible to
re-use, adapt, and<br>
enhance existing resources, alongside new, automatically created
ones, with the goal of<br>
reducing the manual intervention required in language resource
production, and thus<br>
ultimately production costs.<br>
<br>
<b>WORKSHOP TOPICS</b><br>
The topics of the workshop are related to best practices, methods,
techniques and<br>
experimental results regarding the merging of various types of
language resources, such as<br>
lexicons and corpora, especially in support of language technology
applications. In particular,<br>
new methods for automatic merging with a view towards reducing human
intervention will be<br>
most welcome.<br>
Topics for submission include, but are not limited to:<br>
- Experiments on (semi-)automatic merging of automatically produced
resources<br>
- Experiments on the merging of two or more existing resources
containing the same or<br>
different levels of linguistic information<br>
- Studies or experiments on merging resources at different levels of
granularity (corpora,<br>
lexicons, grammars)<br>
- Studies or experiments on unifying, mapping or converting encoding
formats<br>
- Comparison of different resources and mapping algorithms to
achieve the desired<br>
merging<br>
- Use of linguistic information from different sources in high-level
language applications<br>
- Use of new, merged language resources in language technology
applications<br>
<br>
<b>SUBMISSIONS</b><br>
Interested participants must submit a preliminary paper of about 4-6
pages including<br>
references (between 2000 and 2500 words). To submit, please use
the online form on<br>
START LREC Conference Manager at:
<a class="moz-txt-link-freetext" href="https://www.softconf.com/lrec2012/MergingLR2012/">https://www.softconf.com/lrec2012/MergingLR2012/</a><br>
When submitting a paper from the START page, authors will be asked
to provide essential<br>
information about resources (in a broad sense, i.e. also
technologies, standards, evaluation<br>
kits, etc.) that have been used for the work described in the paper
or are a new result of their<br>
research.<br>
For further information on this new initiative, please refer to
<a class="moz-txt-link-freetext" href="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012">http://www.lrec-conf.org/lrec2012/?LRE-Map-2012</a>.<br>
Papers will be peer-reviewed by the workshop Programme Committee.<br>
<br>
<b>IMPORTANT DATES</b><br>
· Deadline for paper submission: 15 February 2012<br>
· Notification of acceptance: 15 March 2012<br>
· Submission of camera-ready version of papers: 31 March 2012<br>
· Workshop date: 22 May 2012 – Afternoon Session<br>
<br>
<b>ORGANIZING COMMITTEE</b><br>
Núria Bel, UPF, Barcelona, Spain<br>
Maria Gavrilidou, ILSP-“Athena”, Athens, Greece<br>
Monica Monachini, CNR-ILC, Pisa, Italy<br>
Valeria Quochi, CNR-ILC, Pisa, Italy<br>
Laura Rimell, University of Cambridge, UK<br>
Contacts<br>
<a class="moz-txt-link-abbreviated" href="mailto:lrec12_workshop_merging@ilc.cnr.it">lrec12_workshop_merging@ilc.cnr.it</a><br>
<br>
<b>PROGRAMME COMMITTEE</b><br>
Victoria Arranz, ELDA, Paris, France<br>
Paul Buitelaar, National University of Ireland, Galway, Ireland<br>
Nicoletta Calzolari, CNR-ILC, Pisa, Italy<br>
Olivier Hamon, ELDA, Paris, France<br>
Aleš Horák, Masaryk University, Brno, Czech Republic<br>
Nancy Ide, Vassar College, NY, USA<br>
Bernardo Magnini, FBK, Trento, Italy<br>
Paola Monachesi, Utrecht University, Utrecht, The Netherlands<br>
Jan Odijk, Utrecht University, Utrecht, The Netherlands<br>
Muntsa Padró, IULA, Barcelona, Spain<br>
Karel Pala, Masaryk University, Brno, Czech Republic<br>
Thierry Poibeau, University of Cambridge, UK and CNRS, Paris, France<br>
Benoît Sagot, INRIA, Paris, France<br>
Kiril Simov, Bulgarian Academy of Sciences, Sofia, Bulgaria<br>
Claudia Soria, CNR-ILC, Pisa, Italy<br>
Maurizio Tesconi, CNR-IIT, Pisa, Italy<br>
</body>
</html>