Date:  Wed, 17 Dec 2003 15:35:43 -0500 (EST)
From:  andreas.witt at uni-bielefeld.de
Subject:  XML-Based Richly Annotated Corpora

Date:  Thu, 18 Dec 2003 11:25:54 -0500 (EST)
From:  ikedat at mail.utexas.edu
Subject:  Symposium About Language and Society--Austin

XML-Based Richly Annotated Corpora
Short Title: Xbrac

Date: 29-May-2004 - 29-May-2004
Location: Lisbon, Portugal
Contact: Andreas Witt
Contact Email: andreas.witt at uni-bielefeld.de
Meeting URL: http://coli.lili.uni-bielefeld.de/forschung/xbrac/

Linguistic Sub-field: Text/Corpus Linguistics
Call Deadline: 15-Feb-2004

Meeting Description:

The workshop aims at bringing together XML experts, both theorists and
practitioners, as well as linguists and natural interactivity
researchers working on the definition of corpus architectures,
annotation and resource exchange schemes and on tools for the use of
multilevel and/or multi-layer annotated corpora. It will provide a
forum for the definition of requirements for corpus representations
and pertaining tools, discussing at the same time case studies from
linguistics and natural interactivity research. XML has become a de
facto standard for the representation of corpus resources. It is being
used for representing speech and text corpora, multimodal and
multimedial corpora, as well as, in particular, integrated corpora
which combine different modalities. XML-based representations make it
easier to work with richly annotated corpora, which include
annotations from different levels of linguistic description or from
different modalities. A number of tools have also become available
over the last few years for creating, managing, annotating, and
querying such corpora, and for their statistical exploration.

Although XML is a useful representation language, its use alone does
not settle every question of representation style (e.g. stand-off
annotations vs. embedded annotations); such choices are in turn closely
linked with questions of the architecture of richly annotated corpora,
such as the following:
should information from different levels of linguistic description be
represented in separate ''layers'' of the annotation? Should a given
information type serve as a grounding for all or some of the others?
How to account for interdependencies and interaction between phenomena
from different levels of description? How to account for concurrent
annotation (one phenomenon, different analyses or

Such questions and the pertaining corpus-architectural considerations
interact with at least two more problem areas: on the one hand with
the kinds of research questions and of phenomena to be analysed in
linguistic and natural interaction research (which may call for
certain architectural solutions), and on the other hand with tools for
the creation, annotation, manipulation and exploration of XML-based

The workshop will attempt to address the interplay between the
following research areas:

   1. XML techniques for corpus representation, i.e.:
          * Stand-off annotation vs. embedded annotation;
          * Use of XML linking standards for language data (XLink,
XPointer, XPath); other ways of ensuring relationships between levels,
e.g. through naming conventions;
          * Concepts of layering in corpora annotated at several
levels of linguistic description; types of information grouped
together vs. distributed over different ''packages'';
          * Hierarchical vs. flat annotation;
          * the grounding of annotations (e.g. in XML elements vs. in
characters?) and its implications;
          * techniques for the manipulation of XML-based
representations for massively annotated corpora; usefulness and
relevance of XQuery.
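
As a rough illustration of the first topic above, the following sketch
contrasts embedded and stand-off annotation of the same short text. The
element and attribute names (s, w, layer, span, start, end, pos) are
hypothetical and are not taken from any particular corpus standard; the
stand-off layer is assumed to be grounded in character offsets.

```python
import xml.etree.ElementTree as ET

# Primary text, kept free of markup (annotations are grounded in characters).
TEXT = "Corpora need annotation"

# Embedded annotation: the markup wraps the text directly.
# Element/attribute names here are hypothetical.
EMBEDDED = (
    '<s><w pos="NOUN">Corpora</w> '
    '<w pos="VERB">need</w> '
    '<w pos="NOUN">annotation</w></s>'
)

# Stand-off annotation: a separate layer points into TEXT
# by character offsets (start inclusive, end exclusive).
STANDOFF = """<layer type="pos">
  <span start="0" end="7" pos="NOUN"/>
  <span start="8" end="12" pos="VERB"/>
  <span start="13" end="23" pos="NOUN"/>
</layer>"""


def read_embedded(xml_string):
    """Return (token, pos) pairs from the embedded representation."""
    root = ET.fromstring(xml_string)
    return [(w.text, w.get("pos")) for w in root.iter("w")]


def read_standoff(text, xml_string):
    """Return (token, pos) pairs by resolving offsets against the text."""
    root = ET.fromstring(xml_string)
    return [
        (text[int(s.get("start")):int(s.get("end"))], s.get("pos"))
        for s in root.iter("span")
    ]


if __name__ == "__main__":
    print(read_embedded(EMBEDDED))
    print(read_standoff(TEXT, STANDOFF))
```

Both readers recover the same token/tag pairs. In the stand-off variant
the primary text stays untouched, so further layers (e.g. a second,
concurrent analysis) could be added as additional files pointing at the
same offsets.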

   2. Levels of linguistic description and their interaction, i.e.:
          * Examples of richly annotated corpora: reasons for the
choice of the annotated levels; linguistic and natural interactivity
research questions which can (only) be solved with richly annotated
          * Interaction between levels: new research questions in
linguistics and natural interactivity research which can only be
addressed because of observation across levels, across modalities,
etc. An example is the use of clustering techniques across different
levels: e.g. relevant cooccurrences of phenomena from different levels
identified via clustering;
          * Use and usefulness of concurrent annotations in XML-based
corpora; an example is concurrent flat and deep syntactic analysis.

   3. Tools for handling richly annotated corpora: Software solutions
for, e.g.,
          * corpus creation, transformation, exchange, and validation;
          * interactive annotation;
          * exploration: query and retrieval, statistical analysis;
          * corpus management (e.g. with respect to meta-data).

Tools presented should be positioned with respect to the questions of
corpus architecture and with respect to the research directions
discussed above under (1) and (2).

    * Andreas Witt, Bielefeld University
    * Ulrich Heid, University of Stuttgart
    * Henry S. Thompson, University of Edinburgh
    * Jean Carletta, University of Edinburgh
    * Peter Wittenburg, MPI for Psycholinguistics Nijmegen

Program committee

    * Jean Carletta, University of Edinburgh, UK
    * Ulrich Heid, University of Stuttgart, Germany
    * Henning Lobin, Justus-Liebig-Universität Gießen, Germany
    * Dieter Metzing, Bielefeld University, Germany
    * Joakim Nivre, Växjö University, Sweden
    * Vito Pirrelli, Istituto di Linguistica Computazionale del CNR,
Pisa, Italy
    * Gary Simons, SIL International, Texas, USA
    * Henry S. Thompson, University of Edinburgh, UK
    * Jun'ichi Tsujii, University of Tokyo, Japan
    * Andreas Witt, Bielefeld University, Germany
    * Peter Wittenburg, MPI for Psycholinguistics Nijmegen, Netherlands

Authors are invited to submit papers for oral presentation in any of
the areas listed above. Only full papers will be accepted, and the
length of the paper should not exceed 8 pages.

Requirements for Paper Submission:

    * Submissions must be full papers, not extended abstracts.
    * It is highly recommended that authors submit papers in the
      LREC conference proceedings format (maximum of 8 pages).
    * Submissions in other formats will be accepted (font size of 11
or 12 point); however, they can be no longer than eight (8) pages
including figures, tables, and references, formatted for A4 paper with
reasonable margins.
    * Electronic submission of manuscripts (details on the submission
site) is required (PDF preferred; PostScript and ASCII accepted).

    * An additional title page should include the title, author(s),
      affiliation(s), contact email address, postal address,
      telephone, fax and URL as well as five keywords.

Submissions should be sent by email to andreas.witt at uni-bielefeld.de
before 15 February 2004.

Symposium About Language and Society--Austin
Short Title: SALSA

Date: 16-Apr-2004 - 18-Apr-2004
Location: Austin, Texas, United States of America
Contact: SALSA Organizers
Contact Email: utsalsa at uts.cc.utexas.edu
Meeting URL: http://www.utexas.edu/students/salsa/index.shtml

Linguistic Sub-field: General Linguistics
Call Deadline: 15-Jan-2004

Meeting Description:

Established in 1992, SALSA is a student-organized annual conference
sponsored by the departments of Linguistics, Anthropology, and
Communication Studies at The University of Texas at Austin. It has
become internationally recognized by graduate students and faculty
alike as a prestigious, interdisciplinary venue for presenting
cutting-edge work on the relationship of language and culture to
society. The 2004 keynote speakers are Susan Ervin-Tripp (University of
California, Berkeley), Emanuel Schegloff (University of California,
Los Angeles), Jurgen Streeck (University of Texas, Austin), and
Stanton Wortham (University of Pennsylvania). The SYMPOSIUM ABOUT
LANGUAGE AND SOCIETY--AUSTIN is pleased to announce its 12th annual
meeting to be held APRIL 16-18, 2004, at the University of Texas at
Austin. We encourage the submission of abstracts on research that
addresses the relationship of language to culture and society.
Desired frameworks include but are not limited to:

Linguistic Anthropology
Ethnography of Communication
Language and Identity
Speech Play, Verbal Art, and Poetics
Language, Media, and Technology
Language and Social Interaction
Discourse Analysis
Conversation Analysis
Language Vitality
Language Socialization
Gesture and Talk in Interaction

Papers delivered at the conference will be published as a special
edition of the Texas Linguistic Forum. Speakers will be allowed 20
minutes for presentation and 10 minutes for discussion. Papers will be
selected based on the evaluation of an anonymously written abstract,
which may not exceed 600 words. We will accept only electronic

Please email your abstract to utsalsa at uts.cc.utexas.edu,
Subject: SALSA 12 Abstract.
Please include in the following order:
1. Title of the paper
2. Author's name
3. Author's affiliation
4. Address, phone number, and email address at which the author wishes
to be notified
5. A 600-word abstract*
6. A short 200-word abstract* for publication in the conference
7. Equipment needs (e.g., overhead projector, computer projection,

*Please send the abstracts as a Word attachment AND in the body of the
email message.

8. Name your 600-word abstract file as ''LASTNAME.FIRSTINITIAL.LONG''
   (e.g. ''BROWN.J.LONG'') and your 200-word abstract file as

Visit the SALSA web page for submission guidelines and conference
details: http://www.utexas.edu/students/salsa/index.shtml

Deadline for receipt of abstracts is JANUARY 15, 2004. Late
submissions will not be accepted, and we cannot accept papers that are
to be published elsewhere. Notification of acceptance or rejection
will be sent in mid-February 2004.

Pre-registration fees will be $20 for students and $40 for
non-students, and on-site registration fees will be $25 for students
and $45 for non-students. Completed papers must be brought to the
conference to be included in the published proceedings.

