Call: LREC 2010 Workshop on Language Resources (extended)

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Tue Feb 23 16:42:39 UTC 2010


Date: Tue, 23 Feb 2010 10:39:07 +0100
From: info at elda.org
Message-ID: <4B83A23B.1040508 at elda.org>
X-url: http://workshops.elda.org/lrslm2010/
X-url: http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources


                           CALL FOR PAPERS

                             Workshop on

Language Resources: From Storyboard to Sustainability and LR Lifecycle
                              Management

To be held in conjunction with the 7th International Language Resources and
                  Evaluation Conference (LREC 2010)

    23 May 2010, Mediterranean Conference Centre, Valletta, Malta

                 http://workshops.elda.org/lrslm2010/

            Extended deadline for submission: 1 March 2010

Description

The life of a language resource (LR), from its mere conception and
drafting to its mature phase of active exploitation by the HLT
community, varies considerably. Keeping language resources part of a
sustainable, lasting process is a multi-faceted challenge that calls
for well-planned actions against neglect, carried out by the
different actors participating in the process. Clearing all IPR
issues and applying best practices at specification and production
time are just two examples of such actions. Sustainability and
lifecycle management should thus be addressed before embarking on any
serious LR production.

When thinking of long-term LRs, a number of aspects come to mind
that are not always taken into account before development. These
include usability, accessibility, interoperability and scalability,
each of which raises a long list of often-neglected points that need
to be considered at a very early stage of development. Looking
further into the portability and scalability of a language resource,
a number of dimensions should be taken into account to ensure that
the resource reaches its mature life in an active and productive way.

An aspect that is often neglected is the accessibility, and thus the
secured reusability, of a language resource. Institutions such as ELRA
(European Language Resources Association) and LDC (Linguistic Data
Consortium), at a European and American level, respectively, as well
as BAS (Bavarian Archive for Speech Signals) and TST-Centrale
(Flemish-Dutch Human Language Technology Agency), at a
language-specific level, have worked on these aspects for many
years. Through their different activities, they have successfully
implemented a sharing policy which allows different users to gain
access to already existing resources. Other emerging programmes such
as CLARIN (Common Language Resources and Technology Infrastructure)
are also looking into these aspects. Nevertheless, many resources are
still developed without a long-term accessibility plan in place,
which makes it impossible to gain access to them once they are
finished. Such a plan should consider issues such as ownership
rights, licensing and types of use, aiming for a wide community from
the very beginning. It also calls for optimal co-operation between
all actors (LR users, financing bodies, owners, developers and
organisations) so that issues related to the life of an LR are well
established, roles and actors are clearly identified within the
cycle, and best practices are defined for the management of the
entire LR lifecycle.

We are aware, though, that the ideas presented above are but a
starting point for discussion. We would therefore like to invite the
community to participate in this workshop and share their views on
these and other relevant issues of concern. A fruitful discussion
could help us find new mechanisms to support the perpetuation of
language resources, and may lead us towards a sustainability model
that guarantees an appropriate and well-defined LR storyboard and
lifecycle management plan in the future.

Among the many issues and topics that may be presented and discussed
during this workshop, we would like to suggest the following:

- Which fields require LRs and what are their respective needs?

- What needs to be part of an LR storyboard? What points are we missing
  in its design?

- General specifications vs. detailed specifications and design

- Annotation frameworks and layers: interoperable at all?

- Should creation and provision of LRs be included in higher education
  curricula?

- How to plan for scalable resources?

- Language Resource maintenance and improvement: feasible?

- Sharing language resources: how to bear this in mind and implement
  it? Logistics of the sharing: online vs. offline

- Centralised vs. decentralised, and national vs. international
  management and maintenance of LRs

- What happens when users create updated or derived LRs?

- Sharing language resources: legal issues concerned

- Sharing language resources: pricing issues concerned, commercial vs.
  non-commercial use

- Do LR actors work in a synchronised manner?

- What should be the roles of the different actors?

- What are the business models and arrangements for IPRs?

- Self-supporting vs. subsidised LR organisations

- Other general problems faced by the community


We solicit papers that address these questions and other related
issues relevant to the workshop.

Workshop Programme and Audience Addressed

This full-day workshop is aimed at all those involved with
language resources at some point in their research or work (LR users,
producers, ...) and all those with an interest in the different
aspects involved, whether universities, companies or funding
agencies. It aims to be a meeting and discussion point for the many
bottlenecks that surround the life of a resource and that remain to
be addressed with a sustainability plan.

The workshop features two invited talks, opening the morning and
afternoon sessions, as well as submitted papers, and will conclude
with a round table to brainstorm on the issues raised during the
presentations and the individual discussions. This round table will
be run by a number of experts already experienced in some of the
highlighted problems, in open discussion with the workshop
participants. In short, the workshop is intended to result in a plan
of action towards implementing a sustainability and lifecycle
management plan.

Invited Speakers

To be announced on the workshop web site.

 

Organising Committee

Victoria Arranz (ELDA - Evaluations and Language Resources
Distribution Agency / ELRA - European Language Resources Association,
France)

Khalid Choukri (ELDA - Evaluations and Language Resources Distribution
Agency / ELRA - European Language Resources Association, France)

Christopher Cieri (LDC - Linguistic Data Consortium, USA)

Laura van Eerten (Flemish-Dutch HLT Agency, Instituut voor Nederlandse
Lexicologie, The Netherlands)

Bente Maegaard (CST, University of Copenhagen, Denmark)

Stelios Piperidis (ILSP - Institute for Language and Speech Processing
/ ELRA - European Language Resources Association, France)

Remco van Veenendaal (Flemish-Dutch HLT Agency, Instituut voor
Nederlandse Lexicologie, The Netherlands)

 

Programme Committee

Núria Bel (Institut Universitari de Lingüística Aplicada, Universitat
Pompeu Fabra, Spain)

Nicoletta Calzolari (Istituto di Linguistica Computazionale del CNR
(ILC-CNR), Italy)

Jean Carletta (Human Communication Research Centre, School of
Informatics, University of Edinburgh, UK)

Catia Cucchiarini (Nederlandse Taalunie, The Netherlands)

Christoph Draxler (Bavarian Archive for Speech Signals, Institute of
Phonetics and Speech Processing (BAS), Germany)

Maria Gavrilidou (Institute for Language and Speech Processing (ILSP),
Greece)

Nancy Ide (Department of Computer Science, Vassar College, USA)

Steven Krauwer (UiL OTS, Utrecht University, The Netherlands)

Asunción Moreno (Universitat Politècnica de Catalunya (UPC), Spain)

Dirk Roorda (Data Archiving and Networked Services, The Netherlands)

Ineke Schuurman (Centre for Computational Linguistics, Catholic
University Leuven, Belgium)

Claudia Soria (Istituto di Linguistica Computazionale del CNR
(ILC-CNR), Italy)

Stephanie M. Strassel (Linguistic Data Consortium (LDC), USA)

Andreas Witt (IDS Mannheim, Germany)

Peter Wittenburg (Max Planck Institute for Psycholinguistics, The
Netherlands)

Important dates

Deadline for abstracts: Monday 1 March 2010

Notification to Authors: Friday 19 March 2010

Submission of Final Version: Wednesday 31 March 2010

Workshop: Sunday 23 May 2010

Submission

Abstracts should be no longer than 1500 words and should be submitted
in PDF format through the online submission form on START
(https://www.softconf.com/lrec2010/Sustainability2010/). For further
queries, please contact Victoria Arranz at arranz at elda.org or Laura
van Eerten at laura.vaneerten at inl.nl.

When submitting a paper through the START page, authors will be kindly
asked to provide relevant information about the resources that have
been used for the work described in their paper or that are the
outcome of their research. For further information on this new
initiative, please refer to
http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources.
