Corpora: Workshop: XML Technologies for Linguistic Data

Thu Feb 22 14:59:25 UTC 2001

Research Dissemination Workshop:  XML Markup Technologies
for working with Linguistic Data

Hosted by:
The Language Technology Group
University of Edinburgh
10 and 11 May 2001

The Language Technology Group, with support from EPSRC,
ESRC, the EU, and other sources, has invested substantial
effort over the last six years in building up an inventory
of tools and technologies for the markup of language data,
including complex, non-hierarchical structures using
stand-off annotation.  Our markup-based architecture for
NLP systems has been used for applications as diverse as
language corpus annotation, named entity recognition and
tokenization.  The goal of this workshop is to introduce
our work to a larger audience.  It will include:

+++ an introduction to the World Wide Web Consortium (W3C)
XML-related standards at the heart of our work;

+++ details of current markup technologies developed here
and elsewhere, including both automatic, rule-based data
transduction and hand authoring;

+++ hands-on tutorials using all of the major tools needed
to run a language data project from start to finish.

This workshop is aimed at language data users of every
variety, from computational linguists to corpus linguists
and text analysts.  The material presented will assume
some facility with computers, but will introduce all of
the necessary XML and data processing concepts.

The workshop will be free of charge, but participants must
register in advance.  For more information and registration,
see http://www.ltg.ed.ac.uk/xml2001