[Corpora-List] 1st CfP - LREC2010 Workshop: Language Resources: From Storyboard to Sustainability and LR Lifecycle Management
info at elda.org
Thu Dec 31 10:35:03 UTC 2009
[Apologies for cross-postings]
*CALL FOR PAPERS*
Workshop on
*Language Resources: From Storyboard to Sustainability and LR Lifecycle
Management*
To be held in conjunction with the 7th International Language Resources
and Evaluation Conference (LREC 2010)
*23 May 2010, Mediterranean Conference Centre, Valletta, Malta*
http://workshops.elda.org/lrslm2010/ (under construction)
/Deadline for submission: 22 February 2010/
*Description*
The life of a language resource (LR), from its initial conception and
drafting to its adult phases of active exploitation by the HLT
community, varies considerably. Ensuring that language resources are
part of a sustainable and enduring living process is a multi-faceted
challenge, one that calls for well-planned measures against neglect to
be taken by the different actors participating in the process. Clearing
all IPR issues and applying best practices at specification and
production time are just a few examples of such measures. Sustainability
and lifecycle management are thus issues that should be addressed before
embarking on any serious LR production.
When thinking of long-term LRs, a number of aspects come to mind that
are not always taken into account before development. Some of these
aspects are /usability/, /accessibility/, /interoperability/ and
/scalability/, which bring with them a long list of neglected points
that need to be considered at a very early stage of development. Looking
further into the /portability/ and /scalability/ of a language resource,
a number of dimensions should be taken into account to ensure that it
reaches its adult life in an active and productive way.
An aspect that is often neglected is the /accessibility/ and thus
/secured reusability/ of a language resource. Institutions such as ELRA
(European Language Resources Association) and LDC (Linguistic Data
Consortium), at a European and American level, respectively, as well as
BAS (Bavarian Archive for Speech Signals) and TST-Centrale
(Flemish-Dutch Human Language Technology Agency), at a language-specific
level, have worked on these aspects for a large number of years. Through
their different activities, they have successfully implemented a sharing
policy which allows different users to gain access to already existing
resources. Other emerging programmes such as CLARIN (Common Language
Resources and Technology Infrastructure) are also looking into these
aspects. Nevertheless, many resources are still developed without a
long-term accessibility plan in place, which makes it impossible to gain
access to the resource once it is finished. Such a plan should
consider issues such as ownership rights, licensing and types of use,
aiming for a wide community from the very beginning. It also calls for
optimal co-operation between all actors (LR users, financing bodies,
owners, developers and organisations) so that issues related to the life
of an LR are well established, roles and actors are clearly identified
within the cycle and best practices are defined for the management of
the entire LR lifecycle.
We are aware, though, that the ideas presented above are only a
starting point for discussion. We would therefore like to invite the
community to participate in this workshop and share with us their views
on these and other relevant issues of concern. A fruitful discussion
could help us find new mechanisms to sustain language resources, and may
lead us towards a sustainability model that guarantees an appropriate
and well-defined LR storyboard and lifecycle management plan in the
future.
Among the many issues and topics that may be presented and discussed
during this workshop, we would like to already suggest the following:
- Which fields require LRs and what are their respective needs?
- What needs to be part of an LR storyboard? What points are we
missing in its design?
- General specifications vs. detailed specifications and design
- Annotation frameworks and layers: interoperable at all?
- Should creation and provision of LRs be included in higher
education curricula?
- How to plan for scalable resources?
- Language Resource maintenance and improvement: feasible?
- Sharing language resources: how to bear this in mind and
implement it? Logistics of the sharing: online vs. offline
- Centralised vs. decentralised, and national vs. international
management and maintenance of LRs
- What happens when users create updated or derived LRs?
- Sharing language resources: legal issues concerned
- Sharing language resources: pricing issues concerned,
commercial vs. non-commercial use
- Do LR actors work in a synchronised manner?
- What should be the roles of the different actors?
- What are the business models and arrangements for IPRs?
- Self-supporting vs. subsidised LR organisations
- Other general problems faced by the community
We solicit papers that address these questions and other related issues
relevant to the workshop.
*Workshop Programme and Audience Addressed*
This full-day workshop aims to address all those involved with language
resources at some point of their research/work (LR users, producers,
...) and all those with an interest in the different aspects involved,
whether universities, companies or funding agencies of some nature. It
aims to be a meeting and discussion point for the many bottlenecks
surrounding the life of a resource which remain to be addressed with
a sustainability plan.
The workshop features two invited talks, opening the morning and
afternoon sessions, as well as submitted papers, and will conclude with
a round table to brainstorm on the issues raised during the
presentations and the individual discussions. This round table will be
run by a number of experts already experienced in some of the
highlighted problems, in open discussion with the workshop participants.
In short, this workshop will result in a plan of action towards
implementing a sustainability and lifecycle management plan.
*Invited Speakers*
To be announced on the workshop web site.
*Organising Committee*
Victoria Arranz (Evaluations and Language resources Distribution Agency
(ELDA) / European Language Resources Association (ELRA), France)
Khalid Choukri (ELDA - Evaluations and Language resources Distribution
Agency / ELRA - European Language Resources Association, France)
Christopher Cieri (LDC - Linguistic Data Consortium, USA)
Laura van Eerten (Flemish-Dutch HLT Agency, Instituut voor Nederlandse
Lexicologie, The Netherlands)
Bente Maegaard (CST, University of Copenhagen, Denmark)
Stelios Piperidis (ILSP -- Institute for Language and Speech Processing
/ ELRA - European Language Resources Association, France)
Remco van Veenendaal (Flemish-Dutch HLT Agency, Instituut voor
Nederlandse Lexicologie, The Netherlands)
*Programme Committee*
Núria Bel (Institut Universitari de Lingüística Aplicada, Universitat
Pompeu Fabra, Spain)
Nicoletta Calzolari (Istituto di Linguistica Computazionale del CNR
(ILC-CNR) -- Italy)
Jean Carletta (Human Communication Research Centre, School of
Informatics, University of Edinburgh, UK)
Catia Cucchiarini (Nederlandse Taalunie, The Netherlands)
Christoph Draxler (Bavarian Archive for Speech Signals, Institute of
Phonetics and Speech Processing (BAS), Germany)
Maria Gavrilidou (Institute for Language and Speech Processing (ILSP),
Greece)
Nancy Ide (Department of Computer Science, Vassar College, USA)
Steven Krauwer (UiL OTS, Utrecht University, The Netherlands)
Asunción Moreno (Universitat Politècnica de Catalunya (UPC), Spain)
Dirk Roorda (Data Archiving and Networked Services, The Netherlands)
Ineke Schuurman (Centre for Computational Linguistics, Catholic
University Leuven, Belgium)
Claudia Soria (Istituto di Linguistica Computazionale del CNR (ILC-CNR)
-- Italy)
Stephanie M. Strassel (Linguistic Data Consortium (LDC), USA)
Andreas Witt (IDS Mannheim, Germany)
Peter Wittenburg (Max Planck Institute for Psycholinguistics, The
Netherlands)
*Important dates*
Deadline for abstracts: Monday 22 February 2010
Notification to Authors: Friday 12 March 2010
Submission of Final Version: Sunday 21 March 2010
Workshop: Sunday 23 May 2010
*Submission*
Abstracts should be no longer than 1500 words and should be submitted in
PDF format through the online submission form on START
(https://www.softconf.com/lrec2010/Sustainability2010/). For further
queries, please contact Victoria Arranz (arranz at elda.org) or Laura van
Eerten (laura.vaneerten at inl.nl).
/When submitting a paper through the START page, authors will be kindly
asked to provide relevant information about the resources that have been
used for the work described in their paper or that are the outcome of
their research. For further information on this new initiative, please
refer to
http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources./