Call: COLING-2000 Workshop

Philippe Blache pb at
Sat Apr 8 09:57:45 UTC 2000

From: Remi Zajac <rzajac at>

Call for submissions for the

COLING-2000 Workshop on Using Toolsets and Architectures To Build NLP

Centre Universitaire, Luxembourg, 5 August 2000

(see also this call at


The purpose of the workshop is to present the state of the art in NLP
toolsets and workbenches that can be used to develop multilingual
and/or multi-application NLP components and systems. Although
technical presentations of particular toolsets are of interest, we
would like to emphasize methodologies and practical experiences in
building components or full applications using an NLP
toolset. Combined demonstrations and paper presentations are strongly
encouraged.

Many toolsets have been developed to support the implementation of
single NLP components (taggers, parsers, generators, dictionaries) or
complete Natural Language Processing applications (Information
Extraction systems, Machine Translation systems).  These tools aim at
facilitating and lowering the cost of building NLP systems. Since the
tools themselves are often complex pieces of software, they require a
significant amount of effort to be developed and maintained in the
first place. Is this effort worth the trouble? Note that NLP toolsets
have often been developed originally to implement a single component
or application. In this case, why not build the NLP system using a
general programming language such as Lisp or Prolog? There are at
least two answers. First, for reasons of pure efficiency
(speed and space), it is often preferable to build a parameterized
algorithm operating on a uniform data structure (e.g., a
phrase-structure parser). Second, it is harder, and often impossible,
to develop, debug and maintain a large NLP system directly written in
a general programming language.

It has been the experience of many users that a given toolset is quite
often unusable outside its environment: the toolset can be too
restricted in its purpose (e.g. an MT toolset that cannot be used for
building a grammar checker), too complex to use, or even too difficult
to install. There have been, in particular in the US under the Tipster
program, efforts to promote instead common architectures for a given
set of applications (primarily IR and IE in Tipster; see also the
Galaxy architecture of the DARPA Communicator project). Several
software environments have been built around this flexible concept,
which is closer to current trends in mainstream software engineering.

The workshop aims at providing a picture of the current problems faced
by developers and users of toolsets, and future directions for the
development and use of NLP toolsets. We encourage reports of actual
experiences in the use of toolsets (complexity, training, learning
curve, cost, benefits, user profiles) as well as presentation of
toolsets concentrating on user issues (GUIs, methodologies, on-line
help, etc.) and application development. Demonstrations are also
welcome.


Researchers and practitioners in Language Engineering, users and
developers of tools and toolsets.


Although individual tools (such as POS taggers) have their use, they
typically need to be integrated in a complete application (e.g. an IR
system). Language Engineering issues in toolsets and architectures
include (in no particular order):

  Practical experience in the use of a toolset;
  Methodological issues associated with the use of a toolset;
  Benefits and deficiencies of toolsets;
  User (linguist/programmer) training and support;
  Adaptation of a tool (or toolset) to a new kind of application;
  Adaptation of a tool to a new language;
  Integration of a tool in an application;
  Architectures and support software;
  Reuse of data resources vs. processing components;
  NLP algorithmic libraries.

Format of the Workshop

The one-day workshop will include twelve presentation periods, each
divided into a 20-minute presentation followed by 10 minutes
reserved for exchanges. We encourage the authors to focus on the
salient points of their presentation and identify possible
controversial positions.  There will be ample time set aside for
informal and panel discussions and audience participation. Please note
that workshop participants are required to register at


   21 May 2000: Submission deadline.
   11 June 2000: Notification to authors.
   24 June 2000: Final camera-ready copy.
   5 August 2000: COLING-2000 Workshop.

Submission Format

Send submissions of no more than 6 pages conforming to the COLING
format ( to zajac at
We prefer electronic submissions in either PDF or PostScript.
Final submissions can extend to 10 pages.

Organizing Committee

  Rémi Zajac (Chair), CRL, New Mexico State University, USA:
       zajac at
  Jan Amtrup, CRL, New Mexico State University, USA:
      jamtrup at
  Stephan Busemann, DFKI, Saarbrücken:
       busemann at
  Hamish Cunningham, University of Sheffield:
      hamish at
  Guenther Goerz, IMMD VIII, University of Erlangen:
      goerz at
  Gertjan van Noord, University of Groningen:
      vannoord at
  Fabio Pianesi, IRST, Trento:
      pianesi at

Of Related Interest

  The Natural Language Software Registry at
  The COLING-2000 Web Site at
