[Corpora-List] 2nd CFP for ACL-2004 Workshop: Discourse Annotation
Donna Byron
dbyron at cis.ohio-state.edu
Sun Mar 7 01:14:09 UTC 2004
Second Call for Papers
Discourse Annotation
A Workshop in conjunction with ACL'04 in Barcelona, Spain
----------------------------
Workshop date: July 25-26, 2004
Full paper submissions due: March 22, 2004
Workshop website: http://www.cllt.osu.edu/dbyron/acl04
----------------------------
WORKSHOP OVERVIEW:
Advances in language technology draw on a combination of annotated
empirical data and linguistic theory. The richer the annotation, the
more that can potentially be learned and applied to unseen data.
Thus the Penn TreeBank (PTB), with its part-of-speech (POS) tags
and syntactic annotation, has been more useful than corpora annotated
for POS-tags alone, and PropBank, in which PTB is annotated with
predicate-argument relations, will be useful for more applications
than the PTB alone.
Two gross features of PTB and PropBank are that they annotate
sentence/clause-level features and that they were undertaken
with communal agreement (albeit somewhat contentious at first).
Similar, largely communal projects have been undertaken for
dialogue annotation, including MATE (now NITE).
Discourse annotation (in contrast with sentence-level annotation) has
taken a somewhat different course. While an early communal effort
(DRI) to annotate discourse structure according to a consensus
framework failed to achieve its goal, recognition remained of the
value of discourse annotated corpora. The result has been that diverse
grass-roots efforts have been producing individual corpora annotated
for a wide variety of phenomena such as
- referring/attributive expressions and coreference;
- spatial/temporal expressions and spatial/temporal relations;
- other anaphoric and/or elliptic expressions and their discourse
dependencies;
- discourse units and their relations to one another;
- information structure themes and the themes/rhemes that license
them;
- discourse connectives and what they connect;
- contexts of interpretation;
- cognitive accessibility scales (e.g. animacy);
- types of speech (direct, indirect, free indirect).
Groups involved in these efforts appear to be using (or
planning to use) these corpora for a range of applications that
include: empirical testing of theoretical claims/hypotheses;
supporting second-language acquisition of discourse-sensitive
linguistic devices; training resolution procedures for co-referring
expressions or other anaphors, that can be used in annotating
additional texts or in supporting technologies such as information
extraction, question answering, summarization, and/or text generation;
training discourse parsers that can be used for annotating additional
texts or for reducing the amount of manual effort needed in the
process; and probabilistic sentence and text realization.
The workshop is neutral as to whether consensus annotation is possible
for every type of discourse phenomenon. Its aims are rather to:
- bring a fuller range of discourse annotation activity to the
attention of researchers working on discourse phenomena and their
usefulness for language technologies;
- highlight tools used in the annotation process or used to display
or further analyse the results of annotation;
- discuss obstacles to some (all?) forms of discourse-level
annotation, such as the greater subjectivity that seems involved
in making judgments related to, for example, bracketing and
labelling;
- identify gaps in this work (e.g., in the range of genre being
annotated);
- stimulate researchers by showing them the uses to which others
are putting their data;
- discuss (in small groups and in feedback sessions) whether we
already have, or could together create, a significantly large,
reusable corpus (or set of corpora) annotated for multiple
discourse and sentence-level phenomena, as a much richer basis
for both assessing theories and building better tools.
With these aims in mind, we solicit papers on:
- discourse annotation projects (in any language);
- uses made of discourse annotated corpora, alone or together
with other forms of annotation;
- tools for discourse annotation (e.g., for assisting manual
annotation or for (semi-)automating the process) or for analysing
discourse annotated data;
- tools for integrating layers of annotation (different types of
word-, sentence-, and discourse-level markup);
- requirements for annotated corpora from the perspective of
computational linguistics (e.g., vis-a-vis data sharing,
comparison, integration/alignment, etc.);
- experiments with integrating and exploiting different layers of
annotation (from word to discourse level).
In addition to being presented, the papers will be used to structure
the above-mentioned small group discussions and feedback sessions.
----------------------------
Format for Submissions
Submissions are limited to original, unpublished work. Submissions
must use the 2-column ACL LaTeX style or Microsoft Word style (see
submission style files at http://www.acl2004.org/aclstyles/style.html).
Paper submissions should consist of a full paper (up to 8 pages in
length, including references, with a minimum font size of 10
point). Papers that exceed the specified length may be rejected
without review. Papers should be written in English.
----------------------------
Submission Questions
Please send submission questions to the co-chairs:
bonnie at inf.ed.ac.uk
dbyron at cis.ohio-state.edu
----------------------------
Submission Procedure
Electronic submission only: send the PDF (preferred), PostScript, or
MS Word form of your submission to: Donna Byron
(dbyron at cis.ohio-state.edu). The Subject line should be "ACL2004
WORKSHOP PAPER SUBMISSION".
N.B. If you use any special fonts, please include them with your PDF
submission. Otherwise reviewers may have unnecessary problems with
printing.
----------------------------
Deadlines:
Paper submission deadline: March 22, 2004
Notification of acceptance for papers: April 30, 2004
Camera ready papers due: May 24, 2004
Workshop dates: July 25-26, 2004
----------------------------
PROGRAMME COMMITTEE
Bonnie Webber, University of Edinburgh (co-chair)
Donna Byron, Ohio State University (co-chair)
Steven Bird, Melbourne University
Liesbeth Degand, University of Louvain
Eva Hajicova, Charles University
Aravind Joshi, University of Pennsylvania
Andrew Kehler, UC San Diego
Daniel Marcu, ISI
Katja Markert, Leeds University
Malvina Nissim, Edinburgh University
Livia Polanyi, FXPAL
Frank Schilder, University of Hamburg
Andrea Setzer, Sheffield University
Wilbert Spooren, Free University of Amsterdam
Manfred Stede, University of Potsdam
Michael Strube, EML Research, Heidelberg
Martin van den Berg, FXPAL
Annie Zaenen, PARC
----------------------------
CONTACT INFORMATION:
Professor Bonnie Webber
School of Informatics
University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW
UK
email: bonnie at inf.ed.ac.uk
phone: +44 131 650 4190
fax: +44 131 650 4587
Professor Donna Byron
Department of Computer and Information Science
The Ohio State University
395 Dreese Laboratory
2015 Neil Avenue
Columbus, Ohio 43210
USA
email: dbyron at cis.ohio-state.edu
phone: 614-292-6370
fax: 614-292-2911
--
Dr. Donna K. Byron
Assistant Professor
OSU Computer and Information Science
Ph: 614-292-6370 Fax 614-292-2911
Website: www.cis.ohio-state.edu/~dbyron