[Corpora-List] CFP: Workshop on Linguistic Annotation (ACL2007)

Nancy Ide ide at cs.vassar.edu
Thu Jan 25 17:36:05 UTC 2007


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

                       C A L L   F O R   P A P E R S

                     The Linguistic Annotation Workshop
                                (The LAW)
                    A Merger of NLPXML 2007 and FLAC 2007

                                ACL 2007
                         Prague, Czech Republic
                            June 28-29, 2007

Linguistically annotated corpora play a major role in parsing,
information extraction, question answering, machine translation and
many other areas of computational linguistics, and provide an
empirical testbed for theoretical linguistics research.  This has led
to a proliferation of annotation systems, frameworks, formats, and
schemes.  Recognition of the need to harmonize annotation practices
and frameworks has become increasingly critical, as witnessed by
numerous workshops dealing with different aspects of linguistic
annotation over the past few years.

The Linguistic Annotation Workshop (The LAW) will provide the first
single forum for consideration of these different aspects by merging
NLPXML: Natural Language Processing and XML
(http://www.ling.helsinki.fi/~gwilcock/NLPXML/) and FLAC: Frontiers in
Linguistically Annotated Corpora
(http://www.cs.mu.oz.au/~tim/events/frontiers2006/), which is itself a
merger of Linguistically Interpreted Corpora (LINC) and Frontiers
in Corpus Annotation (FCA). In total, the LAW will be the convergence
of 14 previous workshops (5 NLPXML, 1 FLAC, 6 LINC and 2 FCA).

The goals of this workshop include:

(1) The exchange and propagation of research results with respect
to the annotation, manipulation and exploitation of corpora, taking
into account different applications and theoretical investigations in
the field of language technology and research;

(2) Working towards harmonization and interoperability from the
perspective of the increasingly large number of tools and frameworks
that support the creation, instantiation, manipulation, querying, and
exploitation of annotated resources;

(3) Working towards a consensus on all issues crucial to the
advancement of the field of corpus annotation.

The workshop will include presentations of long (8 page) and short (4
page) papers, demonstrations of annotation tools and invited
presentations by "working groups", as discussed below, followed by an
open discussion. Long papers should reflect work in an advanced state,
but short papers may describe more preliminary work and pilot studies.
Paper topics may cover any aspect of linguistic annotation, including:

1.  New and innovative annotation schemes
2.  Machine learning and knowledge-based methods for
     automation of corpus annotation
3.  Linguistic considerations for merging of annotation of distinct
     phenomena
4.  Comparison of annotation schemes
5.  Evaluation considerations for corpus annotation
6.  Comparison and/or evaluation of existing annotation systems,
     including functionality, common/missing features, accommodation of
     different input/output formats and resource types
     (lexicons, knowledge bases, ontologies, etc.)
7.  Creation, maintenance, and interactive exploration of annotation
     structures and annotated data
8.  Representation formats/structures for merged annotations of
     different phenomena, and means to explore/manipulate them
9.  Assessment of, and potential means to achieve, interoperability of
     annotation formats/frameworks among different systems as well as
     different tasks, frameworks, modalities, and languages

The workshop will also include a one-hour demonstration session for
annotation systems and tools. Proposals for system demonstrations
should follow the short paper submission format. The proposal should
provide an overview of the system to be demonstrated, including
functionality, supported input/output formats or structures, supported
languages and modalities, etc. Accepted proposals will appear in the
proceedings and are intended to provide background for the
demonstration.

In addition to paper presentations and software demos, there will be a
few invited "working group" presentations, each laying out the
dimensions of some crucial problem facing the field of corpus
annotation, particularly problems involving merging annotation and
extending annotation to new languages, genres and modalities. The
final list of working group topics will appear on the workshop website
by February 15, 2007. Our preliminary topics include: (a)
selection of diverse or balanced corpora with few licensing
restrictions for common annotation by the community (possible corpora
include the "open" portion of the American National Corpus and
Wikipedia XML, a freely available cleaned-up corpus derived
from Wikipedia); (b) approaches to discourse coherence,
especially as resulting from different interacting annotation layers,
and its applications to computational linguistics; and (c) annotation
systems/frameworks and interoperability, including the feasibility of
applying a common annotation framework to various annotation types,
language processing tasks, modalities, and languages, especially as it
could enable the merging of annotations of diverse phenomena produced
by different systems. We will attempt to lay out clearly and precisely
the assumptions on such topics held by members of the annotation
community and in doing so, we hope to both: (1) lay the foundations
for the meaningful integration of annotation resources; and (2) assess
the limitations of integrated approaches.

We will also be giving an Innovative Student Annotation Award to one
student presenter -- please indicate if your paper is written by
students or has one or more student authors. The award includes a
waiver of the workshop fee for one student.


WORKSHOP WEBSITE: http://www.ling.uni-potsdam.de/acl-lab/LAW-07.html

TARGET AUDIENCE: Those interested in creating and using existing and
future annotated corpora and other language resources. This includes
annotators, lexicographers, system developers and those designing NLP
system evaluation tasks for the NLP community.

SUBMISSIONS

Long paper submissions should not exceed 8 pages in length and short
papers and demo descriptions should not exceed 4 pages. Format
requirements will be the same as for full papers of ACL 2007. See
http://ufal.mff.cuni.cz/acl2007/ for style files.

For details of the submission procedure, please consult the submission
webpage reachable via the workshop website.

Please indicate:

1) long paper, short paper or demonstration proposal;

2) all applicable paper categories from the following list (indicate
    multiple categories if appropriate): annotation frameworks and/or
    physical formats, annotation scheme design (on linguistic grounds),
    annotation tools and systems, corpus annotation, syntax, semantics,
    predicate-argument structure, morphology, anaphora, discourse,
    opinion/sentiment;

3) language(s) your work applies to, as well as those you plan to
    handle in the future. If your work is language independent,
    indicate this as well.

4) any non-standard equipment needed for your paper or demonstration.

LANGUAGE: All papers must be written and presented in English.

IMPORTANT DATES

Papers due:  March 26, 2007
Acceptance/rejection notification:  April 24, 2007
Final version due: May 9, 2007
Workshop Dates: June 28-29, 2007

Co-Chairs:

Branimir Boguraev, IBM T. J. Watson Research Center, USA
Nancy Ide, Vassar College, USA
Adam Meyers, New York University, USA
Shigeko Nariyama, University of Melbourne, Australia
Manfred Stede, University of Potsdam, Germany
Janyce Wiebe, University of Pittsburgh, USA
Graham Wilcock, University of Helsinki, Finland


Program Committee:

David Ahn (University of Amsterdam, NL)
Lars Ahrenberg (Linköpings Universitet, Sweden)
Timothy Baldwin (University of Melbourne, Australia)
Francis Bond (NICT, Japan)
Kalina Bontcheva (University of Sheffield, UK)
Paul Buitelaar (DFKI, Germany)
Jean Carletta (University of Edinburgh, UK)
Key-Sun Choi (KAIST, Korea)
Chris Cieri (Linguistic Data Consortium/University of Pennsylvania, USA)
Hamish Cunningham (University of Sheffield, UK)
David Day (MITRE Corporation, USA)
Thierry Declerck (DFKI, Germany)
Ludovic Denoyer (University of Paris, France)
Tomaz Erjavec (Institute Josef Stefan, Slovenia)
David Farwell (Computing Research Laboratory, USA)
Alex Chengyu Fang (City University Hong Kong, China)
Chuck Fillmore (International Computer Science Institute, Berkeley, USA)
Anette Frank (DFKI, Germany)
John Fry (SRI International, USA)
Claire Grover (University of Edinburgh, UK)
Jan Hajic (Charles University, Czech Republic)
Ed Hovy (Information Sciences Institute, USA)
Baden Hughes (University of Melbourne, Australia)
Emi Izumi (NICT, Japan)
Tsai Jia-Lin (Tung Nan Institute of Technology, China)
Aravind Joshi (University of Pennsylvania, USA)
Ewan Klein (University of Edinburgh, UK)
Mounia Lalmas (University of London, UK)
Mike Maxwell (University of Maryland, USA)
Chieko Nakabasami (Toyo University, Japan)
Stephan Oepen (University of Oslo, NO)
Kyonghee Paik (KLI)
Martha Palmer (University of Colorado, USA)
Antonio Pareja-Lora (UCM, Spain)
Manfred Pinkal (DFKI, Germany)
James Pustejovsky (Brandeis University, USA)
Owen Rambow (Columbia University)
Laurent Romary (Loria/CNRS, France)
Henry Thompson (University of Edinburgh, UK)
Erik Tjong Kim Sang (University of Amsterdam, NL)
Theresa Wilson (University of Pittsburgh, USA)
Nianwen Xue (University of Pennsylvania, USA)

Please refer all questions to:

Nancy Ide (ide at cs.vassar.edu)
or
Shigeko Nariyama (shigeko at unimelb.edu.au)


