LREC 2012 Workshops: CfP Describing Language Resources with Metadata: Towards Flexibility and Interoperability in the Documentation of Language Resources

Fri Dec 16 17:37:36 UTC 2011

*CALL FOR PAPERS*

Workshop on

*Describing Language Resources with Metadata:*

*Towards Flexibility and Interoperability in the Documentation of 
Language Resources*

To be held in conjunction with the 8th International Language Resources
and Evaluation Conference (LREC 2012)

*22 May 2012, Lütfi Kirdar Istanbul Exhibition and Congress Centre,
Istanbul, Turkey*

http://workshops.elda.org/metadata2012/

/Deadline for submission: 19 February 2012/

*Description*

The description of Language Resources (LRs) continues to be a crucial
point in the life cycle of LRs and, more particularly, in their
sustainable exchange. This has been the case for a number of
repositories and LR distribution centres (ELRA, GSK, LDC, OLAC,
TST-Centrale, BAS, among others), which house LR catalogues following
proprietary metadata schemas. In the past few years, a number of
projects and initiatives have also focused on the sharing of LRs
(ENABLER, CLARIN, FLaReNet, PANACEA, META-SHARE), for example for
Language Technology (LT).

Based on these initiatives, a consensus is emerging on a number of
requirements for standardized metadata:

1. There should be a common publication channel for the LR descriptions
in the world.

2. This channel should allow users to carry out easy and efficient LR
discovery and, possibly, subsequent retrieval of LRs.

3. Expert knowledge is required to create the data model for the metadata
description.

4. Subject matter experts (both researchers and LR/LT providers and
developers) are required to provide the content for the data model.

5. The data model needs to be clear, expressive, flexible, customizable
and interoperable.

6. Metadata have to provide for different user groups, ranging from
providers to consumers (both individuals and organisations). This
applies both to the information contained in the metadata and to the
supporting tool infrastructure for creating, maintaining, distributing,
harvesting and searching the metadata.

Several initiatives currently focus on metadata. The Component
MetaData Infrastructure (CMDI, an ISO TC 37 SC 4 work item for ISO
24622) grew out of work done within initiatives such as ENABLER and
CLARIN. It allows standard data categories (for example from ISO 12620,
isocat.org) to be combined into components, which are in turn combined
into metadata profiles. Early versions of this model have been
operational in repositories such as ELRA's, which complied with the
work done within INTERA. FLaReNet, as the result of a permanent and
cyclical consultation, has issued a set of main recommendations in
which a global infrastructure of uniform and interoperable metadata
sets appears among the Top Priorities for the field of LRs. For use
within HLT, META-SHARE provides a fully-fledged schema for the
description of LRs, in the framework of the component model, covering
all the resource types and media types currently in use, in all the
stages of a resource's life-cycle. Our aim is to learn from one
another's experiences and plans in this area.

Making resources available to others and putting them to a second use
in other projects has never been more widely accepted as a sensible and
efficient way to avoid wasting effort and resources. However, when it
comes to the details, a vast number of problems remain. This workshop
will be a forum to address issues and challenges in the concrete work
with metadata for LRs, not restricted to a single initiative for
archiving LRs.

The current state of the art for metadata provision allows for a very
flexible approach, catering for the needs of different archives and
communities and referring to common data category registries that
describe the meaning of a data category, at least to authors of
metadata. Component models for metadata provision are used, for
example, by CLARIN and META-SHARE, but there is also increased
flexibility in other metadata schemas such as Dublin Core, which is
usually not seen as appropriate for a meaningful description of
language resources.

Topics of interest are:

 1. Infrastructures for creating components and profiles for metadata
 2. Editing and creating metadata
 3. Porting legacy metadata
 4. Metadata as a resource
 5. Maintenance of metadata
 6. Classification of language resources
 7. Providing metadata concepts
 8. Creating components and profiles
 9. Services harvesting and interpreting metadata
10. Experience from the large LR data center catalogues: LDC, ELRA, BAS,
    and how to interoperate with them
11. Controlled vocabularies, terminology and metadata description
12. Formal models for metadata representation and standardized models of
    serialisation
13. Customization and reuse of metadata schemas
14. Plans or experiences with emerging metadata infrastructures, for
    example from CLARIN and META-SHARE
15. Experiences with component-based metadata infrastructures
16. Integration and conversion of multiple repositories: experiences
    from META-SHARE, CESAR, METANET4U and META-NORD, etc.
17. Standardization issues for metadata

We invite submissions for full papers and system demonstrations that 
address these questions and other related issues relevant to the workshop.

*Workshop Programme and Audience Addressed*

This full-day workshop aims to bring together technology-oriented
working groups on metadata modeling or schema creation and both
researchers and producers creating metadata in the course of their work.
Those interested in using metadata in their projects should gain
insights and come away with a clear idea of how to either describe their
LRs or convert their schema. Those who have recently developed a model
can share their experience, and those who have specific concerns about
the interoperability of metadata schemas as developed by the various
initiatives can open the discussion in search of joint solutions.

Tools and tool infrastructures should also be part of the discussion,
given that the initiatives also provide editors, mappings, search
interfaces, and component and profile registries.

*Organising Committee*

Victoria Arranz (ELDA/ELRA, Paris, France, arranz at elda.org)

Daan Broeder (MPI, Nijmegen, The Netherlands, daan.broeder at mpi.nl)

Bertrand Gaiffe (ATILF, Nancy, France, Bertrand.Gaiffe at atilf.fr)

Maria Gavrilidou (Athena Research and Innovation Center, Athens, Greece, 
maria at ilsp.athena-innovation.gr)

Monica Monachini (CNR-ILC, Pisa, Italy, monica.monachini at ilc.cnr.it)

Thorsten Trippel (University of Tübingen, Tübingen, Germany, 
thorsten.trippel at uni-tuebingen.de)

*Programme Committee*

Helen Aristar-Dry (Michigan State University, USA)

Núria Bel (UPF, Barcelona, Spain)

Antonio Branco (University of Lisbon, Portugal)

Lars Borin (Språkbanken, Sweden)

Khalid Choukri (ELDA/ELRA, Paris, France)

Thierry Declerck (DFKI, Germany)

Matej Durco (Austrian Academy of Sciences, Austria)

Gil Francopoulo (CNRS-LIMSI-IMMI + TAGMATICA, Paris, France)

Francesca Frontini (CNR-ILC, Pisa, Italy)

Erhard Hinrichs (Universität Tübingen, Germany)

Penny Labropoulou (ILSP-Athena, Athens, Greece)

Valérie Mapelli (ELDA/ELRA, Paris, France)

Jan Odijk (Universiteit Utrecht, The Netherlands)

Elena Pierazzo (King's College, London, UK)

Laurent Romary (INRIA, France)

Mike Rosner (University of Malta, Malta)

Andreas Witt (IDS, Germany)

Peter Wittenburg (MPI, The Netherlands)

Tamás Varadi (Hungarian Academy of Sciences, Hungary)

Marta Villegas (UPF, Barcelona, Spain)

Sue Ellen Wright (Kent State University, USA)

*Important dates*

Submission of full papers: Sunday 19 February 2012

Notification of acceptance of papers and demonstrations: Thursday 22 
March 2012

Submission of final version: Saturday 31 March 2012

Final programme available: Friday 13 April 2012

Workshop: Tuesday 22 May 2012

*Submission*

Authors should use the START system accessible from https://www.softconf.com/lrec2012/Metadata2012 and the LREC author's kit for submitting a two-column article of 4 to 8 pages.

For further queries, please contact Victoria Arranz at arranz at elda.org 
or Thorsten Trippel at thorsten.trippel at uni-tuebingen.de.

/When submitting a paper through START, authors will be kindly asked to 
provide relevant information about the resources that have been used for 
the work described in their paper or that are the outcome of their 
research. For further information on this initiative, please refer to 
http://www.lrec-conf.org/lrec2012/?LRE-Map-2012//. Authors will also be 
asked to contribute to the Language Library, the new initiative of 
LREC2012./
