<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=content-type 

content="text/html;
      charset=ISO-8859-15">

<META content="MSHTML 6.00.2900.5512" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY text=#000000 bgColor=#ffffff>

<DIV> </DIV><B>LREC 2012 Workshop on: Language Resource Merging 

</B><BR><BR><A class=moz-txt-link-freetext 

href="http://panacea-lr.eu/en/news/project/2011/12/19/lrec-2012-merging-lr-workshop/">http://panacea-lr.eu/en/news/project/2011/12/19/lrec-2012-merging-lr-workshop/</A> 

<BR><BR>

<DIV align=center>   <BR> Date: 22 May 2012 - Afternoon Session 

<BR></DIV><BR>

<DIV align=center>Location: Istanbul, Turkey <BR></DIV><BR>

<DIV align=center>**** Deadline for paper submission EXTENDED ***** 

<BR></DIV><BR>

<DIV align=center><B><FONT color=#ff0000>**** NEW Deadline for paper submission: 

22 February 2012 ***** </FONT></B><BR></DIV><BR><SMALL>CONTEXT <BR><BR>The 

availability of adequate language resources has been a well-known bottleneck for 

most high-level language technology applications, e.g. Machine Translation, 

parsing, and Information Extraction, for at least 15 years, and the impact of 

the bottleneck is becoming all the more apparent with the availability of higher 

computational power and massive storage, since modern language technologies are 

capable of using far more resources than the community produces. The present 

landscape is characterized by the existence of numerous scattered resources, 

many of which have differing levels of coverage, types of information and 

granularity. Taken individually, existing resources do not have sufficient 

coverage, quality or richness for robust large-scale applications, and yet they 

contain valuable information (Monachini et al. 2004 and 2006; Soria et al. 2006; 

Molinero, Sagot and Nicolas 2009; Necsulescu et al. 2011). Differing technology 

or application requirements, ignorance of the existence of certain resources, 

and difficulties in accessing and using them have led to the proliferation of 

multiple, unconnected resources that, if merged, could constitute a much richer 

repository of information augmenting either coverage or granularity, or both, 

and consequently multiplying the number of potential language technology 

applications. Merging, combining and/or compiling larger resources from existing 

ones thus appears to be a promising direction to take. <BR><BR>The re-use and 

merging of existing resources is not altogether unknown. For example, WordNet 

(Fellbaum, 1998) has been successfully reused in a variety of applications. But 

this is the exception rather than the rule; in fact, merging and enhancing 

existing resources is uncommon, probably because it is by no means a trivial 

task given the profound differences in formats, formalisms, metadata, and 

linguistic assumptions. <BR><BR>The language resource landscape is on the brink 

of a large change, however. With the proliferation of accessible metadata 

catalogues and resource repositories (such as the new META-SHARE (<A 

class=moz-txt-link-freetext 

href="http://www.meta-net.eu/meta-share">http://www.meta-net.eu/meta-share</A>) 

infrastructure), a potentially large number of existing resources will be more 

easily located, accessed and downloaded. Also, with the advent of distributed 

platforms for the automatic production of language resources, such as PANACEA 

(<A class=moz-txt-link-freetext 

href="http://www.panacea-lr.eu/">http://www.panacea-lr.eu/</A>), new language 

resources and linguistic information capable of being integrated into those 

resources will be produced more easily and at a lower cost. Thus, it is likely 

that researchers and application developers will seek out resources already 

available before developing new, costly ones, and will require methods for 

merging/combining various resources and adapting them to their specific needs. 

<BR><BR>To date, most resource merging has been done manually, 

with only a small number of attempts reported in the literature towards 

(semi-)automatic merging of resources (Crouch &amp; King 2005; Pustejovsky et 

al. 2005; Molinero, Sagot and Nicolas 2009; Necsulescu et al. 2011). In order to 

take a further step towards the scenario depicted above, in which resource 

merging and enhancing is a reliable and accessible first step for researchers 

and application developers, experience and best practices must be shared and 

discussed, as this will help the whole community avoid any waste of time and 

resources. <BR><BR>AIMS OF THE WORKSHOP <BR><BR>This half-day workshop is meant 

to be part of a series of meetings constituting an ongoing forum for sharing and 

evaluating the results of different methods and systems for the automatic 

production of language resources (the first one was the LREC 2010 Workshop on 

Methods for the Automatic Production of Language Resources and their Evaluation 

Methods). The main focus of this workshop is on (semi-)automatic means of 

merging language resources, such as lexicons, corpora and grammars. Merging 

makes it possible to re-use, adapt, and enhance existing resources, alongside 

new, automatically created ones, with the goal of reducing the manual 

intervention required in language resource production, and thus ultimately 

production costs. <BR><BR>WORKSHOP TOPICS <BR><BR>The topics of the workshop are 

related to best practices, methods, techniques and experimental results 

regarding the merging of various types of language resources, such as lexicons 

and corpora, especially in support of language technology applications. In 

particular, new methods for automatic merging with a view towards reducing human 

intervention will be most welcome. <BR><BR>Topics for submission include, but 

are not limited to: <BR><BR>-       Experiments on 

(semi-)automatic merging of automatically produced resources 

<BR><BR>-       Experiments on the merging of two 

or more existing resources containing the same or different levels of linguistic 

information <BR><BR>-       Studies or experiments 

on merging resources at different levels of granularity (corpora, lexicons, 

grammars) <BR><BR>-       Studies or experiments 

on unifying, mapping or converting encoding formats 

<BR><BR>-       Comparison between different 

resources and mapping algorithms to achieve the desired merging 

<BR><BR>-       Use of linguistic information from 

different sources in high-level language applications 

<BR><BR>-       Use of new, merged language 

resources in language technology applications <BR><BR>SUBMISSIONS 

<BR><BR>Interested participants must submit a preliminary paper of about 4-6 

pages including references (between 2000 and 2500 words). To submit, please 

use the online form on START LREC Conference Manager at: <A 

class=moz-txt-link-freetext 

href="https://www.softconf.com/lrec2012/MergingLR2012/">https://www.softconf.com/lrec2012/MergingLR2012/</A> 

<BR><BR>When submitting a paper from the START page, authors will be asked to 

provide essential information about resources (in a broad sense, i.e. also 

technologies, standards, evaluation kits, etc.) that have been used for the work 

described in the paper or are a new result of their research. <BR><BR>For further 

information on this new initiative, please refer to <A 

class=moz-txt-link-freetext 

href="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012">http://www.lrec-conf.org/lrec2012/?LRE-Map-2012</A> 

<BR><BR>Papers will be peer-reviewed by the workshop Program Committee. 

<BR><BR>IMPORTANT DATES <BR><BR>-       Deadline 

for paper submission: 22 February 2012 

<BR><BR>-       Notification of acceptance: 15 

March 2012 <BR><BR>-       Submission of 

camera-ready version of papers: 31 March 2012 

<BR><BR>-       Workshop date: 22 May 2012 - 

Afternoon Session <BR><BR>CONTACT <BR><BR><A class=moz-txt-link-abbreviated 

href="mailto:lrec12_workshop_merging@ilc.cnr.it">lrec12_workshop_merging@ilc.cnr.it</A> 

<BR><BR><BR>ORGANIZING COMMITTEE <BR><BR>Núria Bel, UPF, Barcelona, Spain 

<BR><BR>Maria Gavrilidou, ILSP-Athena, Athens, Greece <BR><BR>Monica Monachini, 

CNR-ILC, Pisa, Italy <BR><BR>Valeria Quochi, CNR-ILC, Pisa, Italy <BR><BR>Laura 

Rimell, University of Cambridge, UK <BR><BR><BR>PROGRAMME COMMITTEE 

<BR><BR>Victoria Arranz, ELDA, Paris, France <BR><BR>Paul Buitelaar, National 

University of Ireland, Galway, Ireland <BR><BR>Nicoletta Calzolari, CNR-ILC, 

Pisa, Italy <BR><BR>Olivier Hamon, ELDA, Paris, France <BR><BR>Ales Horák, 

Masaryk University, Brno, Czech Republic <BR><BR>Nancy Ide, Vassar College, 

New York, USA <BR><BR>Bernardo Magnini, FBK, Trento, Italy <BR><BR>Paola Monachesi, 

Utrecht University, Utrecht, The Netherlands <BR><BR>Jan Odijk, Utrecht 

University, Utrecht, The Netherlands <BR><BR>Muntsa Padró, IULA, Barcelona, 

Spain <BR><BR>Karel Pala, Masaryk University, Brno, Czech Republic 

<BR><BR>Thierry Poibeau, University of Cambridge, UK and CNRS, Paris, France 

<BR><BR>Benoît Sagot, INRIA, Paris, France <BR><BR>Kiril Simov, Bulgarian 

Academy of Sciences, Sofia, Bulgaria <BR><BR>Claudia Soria, CNR-ILC, Pisa, Italy 

<BR><BR>Maurizio Tesconi, CNR-IIT, Pisa, Italy </SMALL><BR><PRE class=moz-signature cols="72">-- 

Monica Monachini     <A class=moz-txt-link-abbreviated href="mailto:monica.monachini@ilc.cnr.it">monica.monachini@ilc.cnr.it</A>

Istituto di Linguistica Computazionale 

Consiglio Nazionale delle Ricerche

Via Moruzzi 1

56124 Pisa - Italy

tel: +39 050 315 2852 (direct)

     +39 338 888 1164 (mobile)

fax: +39 050 315 2839


<A class=moz-txt-link-freetext href="http://www.ilc.cnr.it/">http://www.ilc.cnr.it/</A>

</PRE>

</BODY></HTML>