[Corpora-List] 2nd CFP for ACL-2004 Workshop: Discourse Annotation

Donna Byron dbyron at cis.ohio-state.edu
Sun Mar 7 01:14:09 UTC 2004


Second Call for Papers
Discourse Annotation
A Workshop in conjunction with ACL'04 in Barcelona, Spain

----------------------------
Workshop date:  July 25-26, 2004
Full paper submissions due: March 22, 2004
Workshop website: http://www.cllt.osu.edu/dbyron/acl04
----------------------------

WORKSHOP OVERVIEW:

Advances in language technology draw on a combination of annotated
empirical data and linguistic theory. The richer the annotation, the
more that can potentially be learned and applied to unseen data.
Thus the Penn TreeBank (PTB), with its part-of-speech (POS) tags
and syntactic annotation, has been more useful than corpora annotated
for POS-tags alone, and PropBank, in which PTB is annotated with
predicate-argument relations, will be useful for more applications
than the PTB alone.

Two gross features of PTB and PropBank are that they annotate
sentence/clause-level features and that they were undertaken
with communal agreement (albeit somewhat contentious at first).
Similar, largely communal projects have been undertaken for
dialogue annotation, including MATE (now NITE).

Discourse annotation (in contrast with sentence-level annotation) has
taken a somewhat different course. While an early communal effort
(DRI) to annotate discourse structure according to a consensus
framework failed to achieve its goal, the value of discourse-annotated
corpora continued to be recognized. The result has been that diverse
grass-roots efforts have been producing individual corpora annotated
for a wide variety of phenomena such as

   - referring/attributive expressions and coreference;
   - spatial/temporal expressions and spatial/temporal relations;
   - other anaphoric and/or elliptic expressions and their discourse
     dependencies;
   - discourse units and their relations to one another;
   - information structure themes and the themes/rhemes that license
     them;
   - discourse connectives and what they connect;
   - contexts of interpretation;
   - cognitive accessibility scales (e.g. animacy);
   - types of speech (direct, indirect, free indirect).

Groups involved in these efforts appear to be using (or
planning to use) these corpora for a range of applications that
include: empirical testing of theoretical claims/hypotheses;
supporting second-language acquisition of discourse-sensitive
linguistic devices; training resolution procedures for co-referring
expressions or other anaphors that can be used in annotating
additional texts or in supporting technologies such as information
extraction, question answering, summarization, and/or text generation;
training discourse parsers that can be used for annotating additional
texts or for reducing the amount of manual effort needed in the
process; and probabilistic sentence and text realization.

The workshop is neutral as to whether consensus annotation is possible
for every type of discourse phenomenon. Its aims are rather to:

   - bring a fuller range of discourse annotation activity to the
     attention of researchers working on discourse phenomena and their
     usefulness for language technologies;

   - highlight tools used in the annotation process or used to display
     or further analyse the results of annotation;

   - discuss obstacles to some (all?) forms of discourse-level
     annotation, such as the greater subjectivity that seems involved
     in making judgments related to, for example, bracketing and
     labelling;

   - identify gaps in this work (e.g., in the range of genre being
     annotated);

   - stimulate researchers with respect to the uses other researchers
     are putting their data to;

   - discuss (in small groups and in feedback sessions) whether we
     already have, or could together create, a significantly large,
     reusable corpus (or set of corpora) annotated for multiple
     discourse and sentence-level phenomena, as a much richer basis
     for both assessing theories and building better tools.

With these aims in mind, we solicit papers on:

   - discourse annotation projects (in any language);
   - uses made of discourse-annotated corpora, alone or together
     with other forms of annotation;
   - tools for discourse annotation (e.g., for assisting manual
     annotation or for (semi-)automating the process) or for analysing
     discourse-annotated data;
   - tools for integrating layers of annotation (different types of
     word-, sentence-, and discourse-level markup);
   - requirements for annotated corpora from the perspective of
     computational linguistics (e.g., vis-a-vis data sharing,
     comparison, integration/alignment, etc.);
   - experiments with integrating and exploiting different layers of
     annotation (from word to discourse level).

As well as for presentation, the papers will be used for structuring
the above-mentioned small group discussions and feedback sessions.

----------------------------

Format for Submissions

Submissions are limited to original, unpublished work. Submissions
must use the 2-column ACL LaTeX style or Microsoft Word style (see
submission style files at http://www.acl2004.org/aclstyles/style.html).
Paper submissions should consist of a full paper (up to 8 pages in
length, including references, with a minimum font size of 10
point). Papers exceeding the specified length may be rejected
without review. Papers should be written in English.

----------------------------

Submission Questions

Please send submission questions to the co-chairs:

     bonnie at inf.ed.ac.uk
     dbyron at cis.ohio-state.edu

----------------------------

Submission Procedure

Electronic submission only: send the PDF (preferred), PostScript, or
MS Word form of your submission to: Donna Byron
(dbyron at cis.ohio-state.edu). The Subject line should be "ACL2004
WORKSHOP PAPER SUBMISSION".

N.B. If you use any special fonts, please include them with your PDF
submission. Otherwise reviewers may have unnecessary problems with
printing.

----------------------------

Deadlines:

Paper submission deadline:              March 22, 2004
Notification of acceptance for papers:  April 30, 2004
Camera-ready papers due:                May 24, 2004
Workshop date:                          July 25-26, 2004

----------------------------

PROGRAMME COMMITTEE

Bonnie Webber, University of Edinburgh (co-chair)
Donna Byron, Ohio State University (co-chair)

Steven Bird, Melbourne University
Liesbeth Degand, University of Louvain
Eva Hajicova, Charles University
Aravind Joshi, University of Pennsylvania
Andrew Kehler, UC San Diego
Daniel Marcu, ISI
Katja Markert,  Leeds University
Malvina Nissim, Edinburgh University
Livia Polanyi, FXPAL
Frank Schilder, University of Hamburg
Andrea Setzer, Sheffield University
Wilbert Spooren, Free University of Amsterdam
Manfred Stede, University of Potsdam
Michael Strube, EML Research, Heidelberg
Martin van den Berg, FXPAL
Annie Zaenen, PARC

----------------------------

CONTACT INFORMATION:

Professor Bonnie Webber
School of Informatics
University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW
UK
email: bonnie at inf.ed.ac.uk
phone: +44 131 650 4190
fax: +44 131 650 4587

Professor Donna Byron
Department of Computer and Information Science
The Ohio State University
395 Dreese Laboratory
2015 Neil Avenue
Columbus, Ohio   43210
USA
email: dbyron at cis.ohio-state.edu
phone: 614-292-6370
fax: 614-292-2911






--
Dr. Donna K. Byron
Assistant Professor
OSU Computer and Information Science
Ph:   614-292-6370  Fax  614-292-2911
Website:  www.cis.ohio-state.edu/~dbyron






