ACL-08 Workshop on Parsing German

Gerald Penn gpenn at CS.TORONTO.EDU
Tue Feb 5 03:04:38 UTC 2008


[Please distribute widely]

   ACL 2008
   WORKSHOP ON PARSING GERMAN
     June 19 or 20, 2008
        Columbus, Ohio

    1st CALL FOR PAPERS

German possesses an interesting set of configurational properties on
the syntactic level which make it far less flexible with respect to
word order than other free word order languages.  Analyses of these
properties, which have formed a part of the traditional syntax of
German since the early 19th century, only re-entered the mainstream of
generative linguistics research within the last twenty years or so.
In computational linguistics, however, their realization has varied
quite widely: "topological fields" in HPSG-style analyses, multiple
parse trees, special constraints on liberation in constraint-based
dependency-style analyses, various hybrid "deep/shallow" approaches,
and agnostic parameter estimation over graphs.  This variation can
also be felt acutely in the annotation of German treebanks.  Many
corpora have historically elected to annotate only a few of the
different senses of the term "constituent" inherent to German syntax,
resulting in standards that make German appear either more like
English or more like Czech.

The aim of this workshop is to provide a forum for theoretical
discussion as well as a shared task, based on the TIGER and TueBa-D/Z
German treebanks, for these various approaches to make their case on
empirical grounds.  We believe this combination to be essential to
balancing considerations of what structure merits learning against
the ease with which it can be learned.  Both treebanks are annotated
collections of German newspaper text on similar topics. They are
annotated with POS, morphology, phrase structure, and grammatical
functions. TueBa-D/Z additionally uses topological fields to describe
fundamental word order restrictions in German clauses.  The treebanks
differ significantly in their annotation schemes, however: while TIGER
relies on crossing branches to describe long distance relationships,
TueBa-D/Z uses pure tree structures with designated labels for long
distance relationships. Additionally, the annotation in TIGER is flat
on the phrasal level while TueBa-D/Z annotates phrasal structure more
hierarchically.

TOPICS

* constituent-based approaches to parsing German
* dependency-based approaches to parsing German
* treatment of long-distance relationships in German
* comparisons of parsing results for German to other free word order languages

SHARED TASK

The workshop will feature a shared task on parsing German. We will
provide the following data sets:

* TIGER in constituent structure
* TIGER in dependency structure
* TueBa-D/Z in constituent structure
* TueBa-D/Z in dependency structure

The task will be to parse both treebanks using one structural
encoding. The final ranking of systems will be based on averages
computed between both treebanks. The data sets will be made available
free of charge for the shared task, but they do require a license.
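
As a rough illustration of ranking by an average over both treebanks,
here is a minimal sketch. It assumes a single numeric score per system
and treebank; the call does not name the evaluation metric, and the
system names and score values below are invented for the example.

```python
# Hypothetical per-treebank scores (percent). The metric is an assumption;
# the call for papers does not specify one.
scores = {
    "system_a": {"TIGER": 80.0, "TueBa-D/Z": 90.0},
    "system_b": {"TIGER": 85.0, "TueBa-D/Z": 83.0},
}


def rank_systems(scores):
    """Return (system, average) pairs sorted by descending average score."""
    averaged = {}
    for name, per_treebank in scores.items():
        values = list(per_treebank.values())
        # Unweighted mean over the treebanks a system was scored on.
        averaged[name] = sum(values) / len(values)
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


ranking = rank_systems(scores)
for name, average in ranking:
    print(f"{name}: {average:.1f}")
```

Under this unweighted mean, a system that does well on only one
treebank can rank below a system that is more even across both.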

In order to take part in the shared task, participants should register
their intent to participate by sending an email to
skuebler at indiana.edu. More information will be made available to
registered participants.

IMPORTANT DATES

Release of training data:            February 5, 2008
Release of test data:                March 5, 2008
Submission of test results:          March 10, 2008
Evaluation results available:        March 12, 2008

Workshop paper submission deadline:  March 17, 2008
Notifications sent to authors:       April 4, 2008
Camera ready due:                    April 18, 2008
Workshop dates:                      June 19 or 20, 2008

PAPER SUBMISSION INFORMATION

Submissions will consist of regular full papers of max. 8 pages,
formatted following the ACL 2008 main session guidelines. In addition,
shared task participants will be invited to submit short papers
(max. 4 pages) describing their systems and/or their evaluation
metrics. Both submission and review processes will be handled via the
START system.

PROGRAM COMMITTEE

Berthold Crysmann, Bonn
Amit Dubey, Edinburgh
Anette Frank, Heidelberg
Erhard Hinrichs, Tuebingen
Julia Hockenmaier, Illinois
Laura Kallmeyer, Tuebingen
Frank Keller, Edinburgh
Sandra Kuebler (co-chair)
Wolfgang Menzel, Hamburg 
Stefan Mueller, Berlin
Stefan Oepen, Oslo
Gerald Penn (co-chair)
Helmut Schmid, Stuttgart
Gerold Schneider, Zuerich
Hans Uszkoreit, Saarbruecken
Josef van Genabith, Dublin

WORKSHOP ORGANIZERS

Sandra Kuebler     
Indiana University
skuebler at indiana.edu

Gerald Penn
University of Toronto
gpenn at cs.toronto.edu


