[Corpora-List] Call for Participation: ACL workshop on Discourse Annotation

Donna Byron dbyron at cis.ohio-state.edu
Thu Jun 24 04:43:40 UTC 2004


***** Call for Participation *****


Discourse Annotation
A Workshop in conjunction with ACL'04 in Barcelona, Spain

----------------------------
Workshop date:  July 25-26, 2004
Workshop website: http://www.cllt.osu.edu/dbyron/acl04
----------------------------

To register for the workshop, go to http://www.acl2004.org/

WORKSHOP OVERVIEW:

Advances in language technology draw on a combination of annotated
empirical data and linguistic theory. The richer the annotation, the
more that can potentially be learned and applied to unseen data.
Thus the Penn TreeBank (PTB), with its part-of-speech (POS) tags
and syntactic annotation, has been more useful than corpora annotated
for POS-tags alone, and PropBank, in which PTB is annotated with
predicate-argument relations, will be useful for more applications
than the PTB alone.

Two gross features of PTB and PropBank are that they annotate
sentence/clause-level features and that they were undertaken
with communal agreement (albeit somewhat contentious at first).
Similar, largely communal projects have been undertaken for
dialogue annotation, including MATE (now NITE).

Discourse annotation (in contrast with sentence-level annotation) has
taken a somewhat different course. While an early communal effort
(DRI) to annotate discourse structure according to a consensus
framework failed to achieve its goal, researchers remained convinced
of the value of discourse-annotated corpora.

The current workshop reflects renewed interest in producing
gold standard corpora for the study and exploration of discourse
phenomena. It also recognizes growing interest in resources that
integrate various types of annotation in order to better characterize
discourse phenomena and thereby facilitate more sensitive and robust
tools for dealing with discourse. The workshop is neutral as to
whether consensus annotation is possible for every type of discourse
phenomenon. Its aims are rather to:

  - bring a fuller range of discourse annotation activity to the
    attention of researchers working on discourse phenomena and their
    usefulness for language technologies;

  - highlight tools used in the annotation process or used to
    display, query or further analyse the results of annotation;

  - discuss obstacles to discourse-level annotation, such as the
    greater subjectivity in bracketing and labelling judgments, the
    ambiguity of discourse phenomena, and data sparseness;

  - discuss where and how automated and semi-automated annotation
    can effectively augment or complement manual gold standard
    annotation;

  - discuss opportunities for creating a significantly large, reusable
    corpus (or set of corpora) annotated for multiple discourse and
    sentence-level phenomena, as a much richer basis for both assessing
    theories and building better tools.

The papers selected for the workshop demonstrate a range of discourse
phenomena being annotated today, tools being used to create and query
annotation, and conclusions being drawn from both manually and
automatically annotated corpora. The workshop itself will include paper
presentations, software demonstrations, and moderated discussions.

WORKSHOP PROGRAM:

Sunday, July 25

8:45-9:00  Welcome

9:00-9:20  Text Type Structure and Logical Document Structure
            Hagen Langer, Harald Lungen, Petra Saskia Bayerl

9:20-9:40  Discourse Annotation and Semantic Annotation in the GNOME Corpus
            Massimo Poesio

9:40-10:00 Discourse Annotation in the Monroe Corpus
            Joel Tetreault, Mary Swift, Preethum Prithviraj,
            Myroslava Dzikovska, James Allen

10:30-10:50 Sentential Structure and Discourse Parsing
             Livia Polanyi, Chris Culy, Martin van den Berg,
             Gian Lorenzo Thione, David Ahn

10:50-11:10 Annotation and Data Mining of the Penn Discourse TreeBank
             Rashmi Prasad, Eleni Miltsakaki, Aravind Joshi, Bonnie Webber

11:10-11:30 The Potsdam Commentary Corpus
             Manfred Stede

11:30-12:00 Moderated Discussion: Integrating multiple levels of
             annotation; Producing annotation for re-use

13:30-14:10 Short Descriptions of Demo Software

14:10-15:20 Demos of corpus annotation tools

    LiveTree: An Integrated Workbench for Discourse Processing
          Gian Lorenzo Thione, Martin van den Berg, Chris Culy,
          Livia Polanyi
    Practical Multi-Level Stand-Off Annotation: The MMAX2 Approach
          Christoph Muller, Michael Strube
    WordFreak: A Tool for Annotating Discourse Connectives and their
    Argument Structure
          Jeremy LaCivita, Thomas Morton, Nikhil Dinesh, Rashmi Prasad,
          Eleni Miltsakaki, Aravind Joshi, Bonnie Webber
    The ConAno Annotation and Query Tool
          Manfred Stede

15:50-16:20 Using a Probabilistic Model of Discourse Relations to
             Investigate Word Order Variation
             Cassandre Creswell

16:20-16:40 On the Use of Automatic Tools for Large-scale Semantic
             Analyses of Causal Connectives
             Liesbeth Degand, Wilbert Spooren, Yves Bestgen

16:40-17:00 Discourse-level Annotation for Investigating Information Structure
             Ivana Kruijff-Korbayova, Geert-Jan M. Kruijff

17:00-17:20 Exploiting Semantic Information for Manual Anaphoric Annotation
             in Cast3LB Corpus
             Borja Navarro, Ruben Izquierdo, Maximiliano Saiz-Noeda

17:20-18:00 Moderated Discussion: Automatable approximations to
             Gold Standard annotation; Multi-lingual discourse annotation

Monday, July 26

9:00-9:20 A Framework for Feature-based Description of Low level Discourse
           Laura Alonso Alemany, Ezequiel Andujar Hinojosa,
           Robert Sola Salvatierra

9:20-9:40 COOPML: Towards Annotating Cooperative Discourse
           Farah Benamara, Veronique Moriceau, Patrick Saint-Dizier

9:40-10:00 Korean Null Pronouns: Classification and Annotation
            Na-Rae Han

10:30-10:50 Temporal Discourse Models for Narrative Structure
             Inderjeet Mani, James Pustejovsky

10:50-11:10 Animacy Encoding in English: Why and How
             Annie Zaenen, Jean Carletta, Gregory Garretson, Joan Bresnan,
             Andrew Koontz-Garboden, Tatiana Nikitina, M. Catherine O'Connor,
             Tom Wasow

11:10-11:45 Moderated Discussion: Subjectivity in manual discourse
             annotation; Data sparsity problems in manual annotation

11:45-12:00 Wrap-up

----------------------------

PROGRAMME COMMITTEE

Bonnie Webber, University of Edinburgh (co-chair)
Donna Byron, Ohio State University (co-chair)

Steven Bird, Melbourne University
Liesbeth Degand, University of Louvain
Eva Hajicova, Charles University
Aravind Joshi, University of Pennsylvania
Andrew Kehler, UC San Diego
Daniel Marcu, ISI
Katja Markert, Leeds University
Malvina Nissim, Edinburgh University
Livia Polanyi, FXPAL
Frank Schilder, Thomson Legal and Regulatory
Andrea Setzer, Sheffield University
Wilbert Spooren, Free University of Amsterdam
Manfred Stede, University of Potsdam
Michael Strube, EML Research, Heidelberg
Martin van den Berg, FXPAL
Annie Zaenen, PARC

----------------------------

CONTACT INFORMATION:

Professor Bonnie Webber
School of Informatics
University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW
UK
email: bonnie at inf.ed.ac.uk
phone: +44 131 650 4190
fax: +44 131 650 4587

Professor Donna Byron
Department of Computer and Information Science
The Ohio State University
395 Dreese Laboratory
2015 Neil Avenue
Columbus, Ohio   43210
USA
email: dbyron at cis.ohio-state.edu
phone: 614-292-6370
fax: 614-292-2911


--
Dr. Donna K. Byron
Assistant Professor
OSU Computer Science and Engineering
Ph:   614-292-6370  Fax  614-292-2911
Website:  www.cis.ohio-state.edu/~dbyron


