<html>

  <head>

    <meta http-equiv="content-type" content="text/html;

      charset=ISO-8859-1">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    [Apologies for multiple postings]<br>

    <br>

    **1st Call for Papers**<br>

    LREC 2012 Workshop on: Language Resource Merging<br>

    <br>

    22 May 2012 – Afternoon Session<br>

    <br>

    <b>CONTEXT</b><br>

    The availability of adequate language resources has been a

    well-known bottleneck for most<br>

    high-level language technology applications, e.g. Machine

    Translation, parsing, and<br>

    Information Extraction, for at least 15 years, and the impact of

    the bottleneck is becoming all<br>

    the more apparent with the availability of higher computational

    power and massive storage,<br>

    since modern language technologies are capable of using far more

    resources than the<br>

    community produces. The present landscape is characterized by the

    existence of numerous<br>

    scattered resources, many of which have differing levels of

    coverage, types of information and<br>

    granularity. Taken individually, existing resources do not have

    sufficient coverage, quality or<br>

    richness for robust large-scale applications, and yet they contain

    valuable information<br>

    (Monachini et al. 2004 and 2006; Soria et al. 2006; Molinero, Sagot

    and Nicolas 2009;<br>

    Necsulescu et al. 2011). Differing technology or application

    requirements, ignorance of the<br>

    existence of certain resources, and difficulties in accessing and

    using them, have led to the<br>

    proliferation of multiple, unconnected resources that, if merged,

    could constitute a much<br>

    richer repository of information augmenting either coverage or

    granularity, or both, and<br>

    consequently multiplying the number of potential language technology

    applications. Merging,<br>

    combining and/or compiling larger resources from existing ones thus

    appears to be a<br>

    promising direction to take.<br>

    The re-use and merging of existing resources is not altogether

    unknown. For example,<br>

    WordNet (Fellbaum, 1998) has been successfully reused in a variety

    of applications. But this is<br>

    the exception rather than the rule; in fact, merging and enhancing

    existing resources is<br>

    uncommon, probably because it is by no means a trivial task given

    the profound differences in<br>

    formats, formalisms, metadata, and linguistic assumptions.<br>

    The language resource landscape is on the brink of a large change,

    however. With the<br>

    proliferation of accessible metadata catalogues and resource

    repositories (such as the new<br>

    META-SHARE (<a class="moz-txt-link-freetext" href="http://www.meta-net.eu/meta-share">http://www.meta-net.eu/meta-share</a>) infrastructure), a

    potentially large<br>

    number of existing resources will be more easily located, accessed

    and downloaded. Also, with<br>

    the advent of distributed platforms for the automatic production of

    language resources, such<br>

    as PANACEA (<a class="moz-txt-link-freetext" href="http://www.panacea-lr.eu/">http://www.panacea-lr.eu/</a>), new language resources and

    linguistic information<br>

    capable of being integrated into those resources will be produced

    more easily and at a lower<br>

    cost. Thus, it is likely that researchers and application developers

    will seek out resources<br>

    already available before developing new, costly ones, and will

    require methods for<br>

    merging/combining various resources and adapting them to their

    specific needs.<br>

    Up to the present day, most resource merging has been done manually,

    with only a small<br>

    number of attempts reported in the literature towards

    (semi-)automatic merging of resources<br>

    (Crouch &amp; King 2005; Pustejovsky et al. 2005; Molinero, Sagot

    and Nicolas 2009; Necsulescu et<br>

    al. 2011). In order to take a further step towards the scenario

    depicted above, in which<br>

    resource merging and enhancing is a reliable and accessible first

    step for researchers and<br>

    application developers, experience and best practices must be shared

    and discussed, as this<br>

    will help the whole community avoid any waste of time and resources.<br>

    <br>

    <b>AIMS OF THE WORKSHOP</b><br>

    This half-day workshop is meant to be part of a series of meetings

    constituting an ongoing<br>

    forum for sharing and evaluating the results of different methods

    and systems for the<br>

    automatic production of language resources (the first one was the

    LREC 2010 Workshop on<br>

    Methods for the Automatic Production of Language Resources and their

    Evaluation Methods).<br>

    The main focus of this workshop is on (semi-)automatic means of

    merging language resources,<br>

    such as lexicons, corpora and grammars. Merging makes it possible to

    re-use, adapt, and<br>

    enhance existing resources, alongside new, automatically created

    ones, with the goal of<br>

    reducing the manual intervention required in language resource

    production, and thus<br>

    ultimately production costs.<br>

    <br>

    <b>WORKSHOP TOPICS</b><br>

    The topics of the workshop are related to best practices, methods,

    techniques and<br>

    experimental results regarding the merging of various types of

    language resources, such as<br>

    lexicons and corpora, especially in support of language technology

    applications. In particular,<br>

    new methods for automatic merging with a view towards reducing human

    intervention will be<br>

    most welcome.<br>

    Topics for submission include, but are not limited to:<br>

    - Experiments on (semi-)automatic merging of automatically produced

    resources<br>

    - Experiments on the merging of two or more existing resources

    containing the same or<br>

    different levels of linguistic information<br>

    - Studies or experiments on merging resources at different levels of

    granularity (corpora,<br>

    lexicons, grammars)<br>

    - Studies or experiments on unifying, mapping or converting encoding

    formats<br>

    - Comparison between different resources and mapping algorithms to

    provide the desired<br>

    merging<br>

    - Use of linguistic information from different sources in high-level

    language applications<br>

    - Use of new, merged language resources in language technology

    applications<br>

    <br>

    <b>SUBMISSIONS</b><br>

    Interested participants must submit a preliminary paper of about 4-6

    pages including<br>

    references (between 2000 and 2500 words). To submit, please use

    the online form on the<br>

    START LREC Conference Manager at:

    <a class="moz-txt-link-freetext" href="https://www.softconf.com/lrec2012/MergingLR2012/">https://www.softconf.com/lrec2012/MergingLR2012/</a><br>

    When submitting a paper from the START page, authors will be asked

    to provide essential<br>

    information about resources (in a broad sense, i.e. also

    technologies, standards, evaluation<br>

    kits, etc.) that have been used for the work described in the paper

    or are a new result of their<br>

    research.<br>

    For further information on this new initiative, please refer to

    <a class="moz-txt-link-freetext" href="http://www.lrecconf.org/lrec2012/?LRE-Map-2012">http://www.lrecconf.org/lrec2012/?LRE-Map-2012</a><br>

    Papers will be peer-reviewed by the workshop Program Committee.<br>

    <br>

    <b>IMPORTANT DATES</b><br>

    · Deadline for paper submission: 15 February 2012<br>

    · Notification of acceptance: 15 March 2012<br>

    · Submission of camera-ready version of papers: 31 March 2012<br>

    · Workshop date: 22 May 2012 – Afternoon Session<br>

    <br>

    <b>ORGANIZING COMMITTEE</b><br>

    Núria Bel, UPF, Barcelona, Spain<br>

    Maria Gavrilidou, ILSP-“Athena”, Athens, Greece<br>

    Monica Monachini, CNR-ILC, Pisa, Italy<br>

    Valeria Quochi, CNR-ILC, Pisa, Italy<br>

    Laura Rimell, University of Cambridge, UK<br>

    Contacts<br>

    <a class="moz-txt-link-abbreviated" href="mailto:lrec12_workshop_merging@ilc.cnr.it">lrec12_workshop_merging@ilc.cnr.it</a><br>

    <br>

    <b>PROGRAMME COMMITTEE</b><br>

    Victoria Arranz, ELDA, Paris, France<br>

    Paul Buitelaar, National University of Ireland, Galway, Ireland<br>

    Nicoletta Calzolari, CNR-ILC, Pisa, Italy<br>

    Olivier Hamon, ELDA, Paris, France<br>

    Aleš Horák, Masaryk University, Brno, Czech Republic<br>

    Nancy Ide, Vassar College, New York, USA<br>

    Bernardo Magnini, FBK, Trento, Italy<br>

    Paola Monachesi, Utrecht University, Utrecht, The Netherlands<br>

    Jan Odijk, Utrecht University, Utrecht, The Netherlands<br>

    Muntsa Padró, IULA, Barcelona, Spain<br>

    Karel Pala, Masaryk University, Brno, Czech Republic<br>

    Thierry Poibeau, University of Cambridge, UK and CNRS, Paris, France<br>

    Benoît Sagot, INRIA, Paris, France<br>

    Kiril Simov, Bulgarian Academy of Sciences, Sofia, Bulgaria<br>

    Claudia Soria, CNR-ILC, Pisa, Italy<br>

    Maurizio Tesconi, CNR-IIT, Pisa, Italy

  </body>

</html>