[Corpora-List] Call for Participation: LREC Workshop "Quality assurance and quality measurement for language and speech resources"

Steven Krauwer steven.krauwer at let.uu.nl
Tue Apr 18 10:31:11 UTC 2006


                   CALL FOR PARTICIPATION

                          WORKSHOP
         "Quality assurance and quality measurement
              for language and speech resources"
                   on Saturday, May 27th 2006

                     in conjunction with
         LREC 2006, The 5th INTERNATIONAL CONFERENCE
              ON LANGUAGE RESOURCES AND EVALUATION
                  Genoa, Italy, 24-26 May 2006

Workshop Programme

  Saturday, May 27

    09:15 09:30 Introduction (Steven Krauwer and Uwe Quasthoff)
    09:30 10:20 What is quality (Chris Cieri, invited talk)
    10:20 10:40 Validation of third party Spoken and Written Language
                Resources  - Methods for performing Quick Quality Checks
                (Hanne Fersøe, Henk van den Heuvel, Sussi Olsen)
    10:40 11:00 Improving the Quality of FrameNet (J. Scheffczyk, M.
                Ellsworth)
    11:00 11:30 BREAK
    11:30 12:10 Valid Validations: Bare Basics and Proven Procedures
                (Henk van den Heuvel, invited talk, in collaboration with
                Eric Sanders)
    12:10 12:50 Validation of the written part of the Dutch CGN
                (provisional title, Hanne Fersøe, invited talk, in
                collaboration with Sussi Olsen and Bart Jongejan)
    12:50 13:10 Quality control of treebanks: documenting, converting,
                patching (Sabine Buchholz, Darren Green)
    13:10 13:30 Evaluation of a diachronic text corpus (Mikko Lounela)
    13:30 14:30 LUNCH
    14:30 14:50 Measuring Monolinguality (Uwe Quasthoff, Chris Biemann)
    14:50 15:10 JTaCo & SProUTomat: Automatic Evaluation and Testing of
                Multilingual Language Technology Resources and Components
                (Christian Bering and Ulrich Schäfer)
    15:10 16:20 Panel session (Chris Cieri LDC, Chu-Ren Huang Acad. Sin.,
                Takenobu Tokunaga TIT, Khalid Choukri ELDA)  [t.b.c.]
    16:20 16:30 Winding up & Closing (Steven Krauwer and Uwe Quasthoff)
    16:30 17:00 BREAK

Workshop description

  The workshop aims at
  * bringing together experience with and insights into quality
    assurance and measurement for language and speech resources in
    a broad sense (including multimodal resources, annotations,
    tools, etc.),
  * covering both qualitative and quantitative aspects,
  * identifying the main tools and strategies,
  * analysing the strengths and weaknesses of current practice,
  * establishing what can be seen as current best practice,
  * reflecting on trends and future needs.

  It can be seen as a follow-up to the workshop on speech
  resources that took place at LREC 2004, but the scope is wider
  as we include both language and speech resources. We feel that
  there is a lot to be gained by bringing these communities
  together, if only because the speech community seems to have a
  longer tradition in resource evaluation than the written
  language community.

Relevance

  Quality assurance is an important concern for the provider, the
  distributor and the user of language and speech resources alike.
  The concept of quality is only meaningful if both the producer
  and the user of the resources can rely on the same set of quality
  criteria, and if there are effective procedures to check whether
  these criteria are met. The universe of possible types of
  language resources is huge and evolves over time, and there is no
  universal set of qualitative or quantitative criteria and tests
  that can be applied to all sorts of resources. In this workshop
  we will try to investigate what sorts of criteria, tests and
  measures are being used by providers, users and distribution
  agencies such as ELRA and LDC, and we will try to distill from
  this current practice general recommendations for quality
  assurance and measurement for language and speech resources. The
  workshop will look at quality assurance and quality measures from
  the provider, the distributor and the user point of view, and
  will explicitly address special problems in connection with very
  large corpora, including numerical measures, comparison of
  corpora, exchange formats, etc.


Workshop committee

Co-chairs:

  * Steven Krauwer (UU/ELSNET, steven.krauwer at let.uu.nl)
  * Uwe Quasthoff (Leipzig, quasthoff at informatik.uni-leipzig.de)

Members:

  * Simo Goddijn (INL, goddijn at inl.nl)
  * Jan Odijk (ELRA/Nuance/UU, jan.odijk at nuance.com)
  * Khalid Choukri (ELDA, choukri at elda.org)
  * Nicoletta Calzolari (ILC-CNR/WRITE, glottolo at ilc.cnr.it)
  * Bente Maegaard (CST, bente at cst.dk)
  * Chris Cieri (LDC, ccieri at ldc.upenn.edu)
  * Chu-Ren Huang (Ac Sin, churen at gate.sinica.edu.tw)
  * Takenobu Tokunaga (TIT, take at cl.cs.titech.ac.jp)
  * Harald Hoege (Siemens, harald.hoege at siemens.com)
  * Henk van den Heuvel (CLST/SPEX, H.vandenHeuvel at let.ru.nl)
  * Dafydd Gibbon (Bielefeld, gibbon at spectrum.uni-bielefeld.de)
  * Key-Sun Choi (KORTERM, Key-Sun.Choi at kaist.ac.kr)
  * Jorg Asmussen (DSL, ja at dsl.dk)

Main contact and further info

  * Contact: Steven Krauwer, steven.krauwer at let.uu.nl
  * Workshop URL: http://utrecht.elsnet.org/lrec2006qa
  * Conference URL: http://www.lrec-conf.org/lrec2006

Sponsors

  This workshop is supported by ELSNET and WRITE (the
  international coordination committee for written language resources
  and evaluation).

-- 
______________________________________________________________________
Steven Krauwer, ELSNET / UiL OTS, Trans 10, 3512 JK Utrecht, Nederland
phone: +31 30 2536050, fax: +31 30 2536000, email: s.krauwer at let.uu.nl
                     http://www-sk.let.uu.nl
