[Corpora-List] 1st CfP - LREC2010 Workshop: Language Resources: From Storyboard to Sustainability and LR Lifecycle Management

info at elda.org info at elda.org
Thu Dec 31 10:35:03 UTC 2009


[Apologies for cross-postings]
*CALL FOR PAPERS*

Workshop on

*Language Resources: From Storyboard to Sustainability and LR Lifecycle 
Management*

To be held in conjunction with the 7th International Language Resources 
and Evaluation Conference (LREC 2010)

*23 May 2010, Mediterranean Conference Centre, Valletta, Malta*

http://workshops.elda.org/lrslm2010/ (under construction)

/Deadline for submission: 22 February 2010/

 

*Description*

 

The life of a language resource (LR), from its conception and drafting 
to its adult phases of active exploitation by the HLT community, varies 
considerably. Ensuring that language resources are part of a sustainable 
and enduring living process is a multi-faceted challenge that calls for 
well-planned actions against neglect by the different actors 
participating in the process. Clearing all IPR issues and applying best 
practices at specification and production time are just two examples of 
such actions. Sustainability and lifecycle management should thus be 
addressed before embarking on any serious LR production.

 

When thinking of long-term LRs, a number of aspects come to mind that 
are not always taken into account before development. Some of these 
aspects are /usability/, /accessibility/, /interoperability/ and 
/scalability/, which inevitably bring up a long list of neglected points 
that would need to be considered at a very early stage of development. 
Looking further into the /portability/ and /scalability/ of a language 
resource, a number of dimensions should be taken into account to ensure 
that it reaches its adult life in an active and productive way.

 

An aspect that is often neglected is the /accessibility/, and thus the 
/secured reusability/, of a language resource. Institutions such as ELRA 
(European Language Resources Association) and LDC (Linguistic Data 
Consortium), at a European and American level, respectively, as well as 
BAS (Bavarian Archive for Speech Signals) and TST-Centrale 
(Flemish-Dutch Human Language Technology Agency), at a language-specific 
level, have worked on these aspects for many years. Through their 
different activities, they have successfully implemented a sharing 
policy that allows different users to gain access to existing 
resources. Emerging programmes such as CLARIN (Common Language Resources 
and Technology Infrastructure) are also looking into these aspects. 
Nevertheless, many resources are still developed without a long-term 
accessibility plan in place, which makes it impossible to gain access to 
them once they are finished. Such a plan should consider issues such as 
ownership rights, licensing and types of use, aiming for a wide 
community from the very beginning. It also calls for optimal 
co-operation between all actors (LR users, financing bodies, owners, 
developers and organisations) so that issues related to the life of an 
LR are well established, roles and actors are clearly identified within 
the cycle, and best practices are defined for the management of the 
entire LR lifecycle.

 

We are aware, though, that the ideas presented above are only a starting 
point for discussion. We would therefore like to invite the community to 
participate in this workshop and share their views on these and other 
relevant issues of concern. A fruitful discussion could lead us to new 
mechanisms for perpetuating language resources, and towards a 
sustainability model that guarantees an appropriate and well-defined LR 
storyboard and lifecycle management plan in the future.

 

Among the many issues and topics that may be presented and discussed 
during this workshop, we would like to suggest the following:

 

-         Which fields require LRs and what are their respective needs?

-         What needs to be part of an LR storyboard? What points are we 
missing in its design?

-         General specifications vs. detailed specifications and design

-         Annotation frameworks and layers: interoperable at all?

-         Should creation and provision of LRs be included in higher 
education curricula?

-         How to plan for scalable resources?

-         Language Resource maintenance and improvement: feasible?

-         Sharing language resources: how to bear this in mind and 
implement it? Logistics of the sharing: online vs. offline

-         Centralised vs. decentralised, and national vs. international 
management and maintenance of LRs

-         What happens when users create updated or derived LRs?

-         Sharing language resources: legal issues concerned

-         Sharing language resources: pricing issues concerned, 
commercial vs. non-commercial use

-         Do LR actors work in a synchronised manner?

-         What should be the roles of the different actors?

-         What are the business models and arrangements for IPRs?

-         Self-supporting vs. subsidised LR organisations

-         Other general problems faced by the community

 

We solicit papers that address these questions and other related issues 
relevant to the workshop.

 

*Workshop Programme and Audience Addressed*

This full-day workshop is addressed to all those involved with language 
resources at some point in their research or work (LR users, producers, 
...) and to all those with an interest in the different aspects 
involved, whether universities, companies or funding agencies of any 
kind. It aims to be a meeting and discussion point for the many 
bottlenecks that surround the life of a resource and remain to be 
addressed with a sustainability plan.

 

The workshop features two invited talks, which open the morning and 
afternoon sessions, followed by presentations of submitted papers. It 
will conclude with a round table to brainstorm on the issues raised 
during the presentations and the individual discussions. This round 
table will be run by a number of experts already experienced in some of 
the highlighted problems, in open discussion with the workshop 
participants. The intended outcome of the workshop is a plan of action 
towards implementing a sustainability and lifecycle management plan.

 

*Invited Speakers*

To be announced on the workshop web site.

*Organising Committee*

Victoria Arranz (Evaluations and Language resources Distribution Agency 
(ELDA) /  European Language resources Association (ELRA), France)

Khalid Choukri (ELDA - Evaluations and Language resources Distribution 
Agency / ELRA - European Language resources Association, France)

Christopher Cieri (LDC - Linguistic Data Consortium, USA)

Laura van Eerten (Flemish-Dutch HLT Agency, Instituut voor Nederlandse 
Lexicologie, The Netherlands)

Bente Maegaard (CST, University of Copenhagen, Denmark)

Stelios Piperidis (ILSP -- Institute for Language and Speech Processing 
/ ELRA - European Language resources Association, France)

Remco van Veenendaal (Flemish-Dutch HLT Agency, Instituut voor 
Nederlandse Lexicologie, The Netherlands)

 

*Programme Committee*

Núria Bel (Institut Universitari de Lingüística Aplicada, Universitat 
Pompeu Fabra, Spain)

Nicoletta Calzolari (Istituto di Linguistica Computazionale del CNR 
(ILC-CNR) -- Italy)

Jean Carletta (Human Communication Research Centre, School of 
Informatics, University of Edinburgh, UK)

Catia Cucchiarini (Nederlandse Taalunie, The Netherlands)

Christoph Draxler (Bavarian Archive for Speech Signals, Institute of 
Phonetics and Speech Processing (BAS), Germany)

Maria Gavrilidou (Institute for Language and Speech Processing (ILSP), 
Greece)

Nancy Ide (Department of Computer Science, Vassar College, USA)

Steven Krauwer (UiL OTS, Utrecht University, The Netherlands)

Asunción Moreno (Universitat Politècnica de Catalunya (UPC), Spain)

Dirk Roorda (Data Archiving and Networked Services, The Netherlands)

Ineke Schuurman (Centre for Computational Linguistics, Catholic 
University Leuven, Belgium)

Claudia Soria (Istituto di Linguistica Computazionale del CNR (ILC-CNR) 
-- Italy)

Stephanie M. Strassel (Linguistic Data Consortium (LDC), USA)

Andreas Witt (IDS Mannheim, Germany)

Peter Wittenburg (Max Planck Institute for Psycholinguistics, The 
Netherlands)

 

*Important dates*

Deadline for abstracts: Monday 22 February 2010

Notification to Authors: Friday 12 March 2010

Submission of Final Version: Sunday 21 March 2010

Workshop: Sunday 23 May 2010

 

*Submission*

Abstracts should be no longer than 1500 words and should be submitted in 
PDF format through the online submission form on START 
(https://www.softconf.com/lrec2010/Sustainability2010/). For further 
queries, please contact Victoria Arranz (arranz at elda.org) or Laura van 
Eerten (laura.vaneerten at inl.nl).

 

When submitting a paper through the START page, authors will be asked 
to provide relevant information about the resources that have been used 
for the work described in their paper or that are the outcome of their 
research. For further information on this new initiative, please refer 
to http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources

 
