11.1648, Calls: Robust Methods/NLP, Web-Based Lang Description
The LINGUIST Network
linguist at linguistlist.org
Sat Jul 29 05:46:51 UTC 2000
LINGUIST List: Vol-11-1648. Sat Jul 29 2000. ISSN: 1068-4875.
Subject: 11.1648, Calls: Robust Methods/NLP, Web-Based Lang Description
Moderators: Anthony Rodrigues Aristar, Wayne State U. <aristar at linguistlist.org>
Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
Andrew Carnie, U. of Arizona <carnie at linguistlist.org>
Reviews: Andrew Carnie: U. of Arizona <carnie at linguistlist.org>
Associate Editors: Ljuba Veselinova, Stockholm U. <ljuba at linguistlist.org>
Scott Fults, E. Michigan U. <scott at linguistlist.org>
Jody Huellmantel, Wayne State U. <jody at linguistlist.org>
Karen Milligan, Wayne State U. <karen at linguistlist.org>
Assistant Editors: Lydia Grebenyova, E. Michigan U. <lydia at linguistlist.org>
Naomi Ogasawara, E. Michigan U. <naomi at linguistlist.org>
James Yuells, Wayne State U. <james at linguistlist.org>
Software development: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
Sudheendra Adiga, Wayne State U. <sudhi at linguistlist.org>
Qian Liao, E. Michigan U. <qian at linguistlist.org>
Home Page: http://linguistlist.org/
The LINGUIST List is funded jointly by Eastern Michigan University,
Wayne State University, and donations from subscribers and publishers.
Editor for this issue: Jody Huellmantel <jody at linguistlist.org>
==========================================================================
As a matter of policy, LINGUIST discourages the use of abbreviations
or acronyms in conference announcements unless they are explained in
the text.
=================================Directory=================================
1)
Date: Thu, 27 Jul 2000 14:37:22 +0200
From: TALN2000 <taln2000 at lia.di.epfl.ch>
Subject: RObust Methods in Analysis of Natural language Data (ROMAND2000)
2)
Date: Thu, 27 Jul 2000 17:48:05 +0100
From: Steven Bird <sb at UNAGI.CIS.UPENN.EDU> (by way of Nicholas Ostler)
Subject: Web-Based Language Documentation and Description
-------------------------------- Message 1 -------------------------------
Date: Thu, 27 Jul 2000 14:37:22 +0200
From: TALN2000 <taln2000 at lia.di.epfl.ch>
Subject: RObust Methods in Analysis of Natural language Data (ROMAND2000)
PLEASE NOTE THAT THE DEADLINE FOR ROMAND2000 HAS BEEN EXTENDED TO AUGUST 12th
First Call for Papers
ROMAND 2000
1st workshop on RObust Methods in Analysis of Natural language Data
*** EXTENDED DEADLINE ***
Department of Computer Science
Swiss Federal Institute of Technology - Lausanne
October 19-20 2000
http://lithwww.epfl.ch/romand2000/
ROMAND 2000 is the first in a series of workshops that aim to bring
together researchers working on robust methods in natural language
processing. The term "natural language" is intended to cover all
possible modalities of human communication and is not restricted to
written or spoken language. The main goal of the workshop is to bring
together researchers in fields such as artificial intelligence,
computational linguistics, human-computer interaction, and cognitive
science who face the problem of implementing feasible and reliable
systems. Contributions on theoretical aspects of robustness in NLP are
welcome, as are reports of engineering and industrial experience.
The workshop will be held in collaboration with the TALN 2000
conference (le Traitement Automatique des Langues Naturelles -
Automatic Natural Language Processing) which will be held in Lausanne
from October 16th to 18th. The ROMAND workshop will be held just
afterwards, from the 19th to the 20th.
We invite abstracts on all topics related to robustness in natural
language processing, including, but not limited to:
Robust Text Analysis
Information Extraction
Spoken Dialogue systems
Multimodal human-computer interfaces
Natural Language Architectures
NLP and Soft Computing
Robust Semantics
Underspecification
Multimedia document analysis
Robust Parsing
Complexity of linguistic analysis
Hybrid methods in computational linguistics
Text Mining
SUBMISSION PROCEDURE:
Authors should submit an anonymous extended abstract of at most 6
single-column pages (including references) in a 10-point body font (for
talks of 20 minutes plus 10 minutes of discussion), together with a separate
page specifying the authors' names, affiliation, address, and e-mail address.
The abstracts should be submitted electronically (in PostScript or PDF
format) to: romand at epfl.ch.
IMPORTANT DATES:
Papers due: August 12th
Acceptance notice: August 28th
Final version due: September 28th
Conference: October 19-20
WORKSHOP COMMITTEE:
Program chairs are
Afzal Ballim Afzal.Ballim at epfl.ch
Vincenzo Pallotta Vincenzo.Pallotta at epfl.ch
Hatem Ghorbel Hatem.Ghorbel at epfl.ch
Program committee
Steve Abney
Wolfgang Menzel
Jean-Pierre Chanod
Alberto Lavelli
Rens Bod
Giorgio Satta
Joachim Niehren
Roberto Basili
Maria Teresa Pazienza
Manuela Boros
Diego Mollá Aliod
Hervé Bourlard
B. Srinivas
C.J. Rupp
Peter Asveld
ORGANIZATION:
This year's workshop is organized in collaboration with the TALN 7eme
conférence annuelle sur LE TRAITEMENT AUTOMATIQUE DES LANGUES
NATURELLES ( http://liawww.epfl.ch/taln2000/ ). The workshop will
take place at the Swiss Federal Institute of Technology, Lausanne. The
workshop is endorsed by ATALA (Association pour le Traitement
Automatique des LAngues).
REGISTRATION:
Details about the registration procedure will be posted later at the
official web site. The registration fee will be:
Normal registration: 150.- CHF
For registered TALN attendees: 100.- CHF
FURTHER INFORMATION:
For any information related to the organization, please contact:
Vincenzo Pallotta
DI-LITH EPFL
IN F Ecublens
1015 Lausanne
Switzerland
tel. +41-21-693 52 97
fax. +41-21-693 52 78
Vincenzo.Pallotta at epfl.ch
News about the conference will be posted on the workshop's Web page at
http://lithwww.epfl.ch/romand2000/
-
Pour le comité d'organisation de TALN 2000,
For the organising committee of TALN 2000,
Cristian Ciressan
-------------------------------- Message 2 -------------------------------
Date: Thu, 27 Jul 2000 17:48:05 +0100
From: Steven Bird <sb at UNAGI.CIS.UPENN.EDU> (by way of Nicholas Ostler)
Subject: Web-Based Language Documentation and Description
CALL FOR PARTICIPATION
Web-Based Language Documentation and Description
Philadelphia USA, 12-15 December 2000
http://www.ldc.upenn.edu/exploration/
Institute for Research in Cognitive Science
University of Pennsylvania
Organizers: Steven Bird (U Penn) and Gary Simons (SIL International)
[The full version of this abridged CFP is available from the above page.]
This workshop will lay the foundation of an open, web-based
infrastructure for collecting, storing and disseminating the primary
materials which document and describe human languages, including
wordlists, lexicons, annotated signals, interlinear texts, paradigms,
field notes, and linguistic descriptions, as well as the metadata
which indexes and classifies these materials. The infrastructure will
support the modeling, creation, archiving and access of these
materials, using centralized repositories of metadata, data, best
practice guidelines, and open software tools.
BACKGROUND
Recent years have witnessed dramatic advances in mass storage and
web delivery technologies, making it possible to house virtually
unlimited quantities of speech data online, and to disseminate this
data over the web. The development of XML and Unicode greatly
facilitates the interchange and reuse of structured multimodal and
multilingual data and the development of interoperating software
tools. These developments are having a pervasive influence on the way
primary linguistic data are gathered, stored, analyzed and
disseminated, as demonstrated by the initiatives surveyed on the
linguistic exploration page (http://www.ldc.upenn.edu/exploration/ ),
and the papers presented at the Linguistic Exploration Workshop at
the Chicago LSA Meeting (http://www.ldc.upenn.edu/exploration/LSA/ ).
CHALLENGES
With these new technological opportunities come concomitant needs
and challenges for modeling, creating, archiving and accessing data:
I Data Models. A diverse range of data types are required in language
documentation and linguistic fieldwork, including word lists,
lexicons, annotated signals, writing system documentation,
interlinear texts, paradigms, field notes, and linguistic
descriptions. We need flexible and general models for these data
types (including links between them), and good ways to represent
information which is either partial, uncertain, evolving, or
disputed. We need to develop a consensus in the community
regarding best practice for modeling these kinds of data, to
ensure maximal reusability of data and software.
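As a rough illustration of the kind of model asked for here, the
following Python sketch shows one way a lexicon entry might record
partial and disputed information explicitly rather than silently
omitting it. All class names, field names, and the sample entry are
hypothetical and invented for this example; they are not drawn from any
existing standard or from the workshop's own proposals.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    """How settled a piece of information is (hypothetical categories)."""
    CONFIRMED = "confirmed"
    UNCERTAIN = "uncertain"
    DISPUTED = "disputed"


@dataclass
class Gloss:
    text: str
    status: Status = Status.CONFIRMED
    source: Optional[str] = None  # who proposed this gloss, if known


@dataclass
class LexiconEntry:
    headword: str
    # None means "not yet determined", i.e. the entry is partial.
    part_of_speech: Optional[str] = None
    glosses: List[Gloss] = field(default_factory=list)
    # Identifiers of interlinear texts that attest the word (links between data types).
    linked_texts: List[str] = field(default_factory=list)

    def settled_glosses(self) -> List[str]:
        """Return only the glosses marked as confirmed."""
        return [g.text for g in self.glosses if g.status is Status.CONFIRMED]


# Invented sample data, for illustration only.
entry = LexiconEntry(headword="example")
entry.glosses.append(Gloss("water", Status.CONFIRMED, "fieldworker A"))
entry.glosses.append(Gloss("river", Status.DISPUTED, "fieldworker B"))
entry.linked_texts.append("text-001")
```

Keeping the status on each gloss, rather than on the entry as a whole,
lets an entry evolve piece by piece: a disputed gloss can later be
confirmed without touching the rest of the record.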
II Data Archives. Whether just the private collection of a single
researcher or a large and centralized repository, language data
needs to be stored and reused. To support this, we need durable
and open storage and interchange formats that embody the best
practice consensus. We need to convert (parochial) 8-bit
character codings to Unicode, using a general tool for character
conversion along with a host of conversion tables for specific
character sets. We also need to convert markup into the best
practice formats we have defined. We need a mechanism to support
durable citation of data, so that document authors do not need to
duplicate all the data they reference just to be sure that the
links will not break. More generally, we need a metadata standard
for indexing the resources, regardless of format and availability,
and a wide-coverage index conforming to the standard, so that
someone interested in a particular language or region can find all
the electronic resources that are pertinent to it, without having
to determine how each of several different archives has named and
classified their holdings.
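The character conversion step mentioned above (a general tool driven by
per-character-set conversion tables) might be sketched in Python as
follows. The table shown is hypothetical: it stands for an imaginary
legacy 8-bit font encoding and does not describe any real character
set. For standardized 8-bit character sets, Python's built-in codecs
(for example bytes.decode("latin-1")) already provide such tables.

```python
from typing import Mapping

# Hypothetical conversion table for an imaginary legacy 8-bit encoding.
# Keys are byte values; values are the Unicode strings they stand for.
LEGACY_TABLE = {
    0x80: "\u014b",  # LATIN SMALL LETTER ENG
    0x81: "\u0254",  # LATIN SMALL LETTER OPEN O
    0x82: "\u025b",  # LATIN SMALL LETTER OPEN E
}


def to_unicode(data: bytes, table: Mapping[int, str]) -> str:
    """Convert 8-bit data to a Unicode string.

    Bytes below 0x80 are treated as ASCII; all other bytes must have an
    entry in the conversion table, otherwise a ValueError is raised so
    that unmapped characters are never silently dropped.
    """
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))
        elif b in table:
            out.append(table[b])
        else:
            raise ValueError(f"byte 0x{b:02x} has no entry in the conversion table")
    return "".join(out)


converted = to_unicode(b"\x80\x81 t\x82", LEGACY_TABLE)
```

Separating the general routine from the table means that supporting a
further character set only requires supplying another table, not
another tool.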
III Data Creation. Now that mass storage is so inexpensive,
researchers are creating large amounts of digital data covering
the types listed above. Both the number and scale of these
collection efforts are growing rapidly. We need software tools
supporting data creation, conforming with best practice, and
covering primary collection of textual data (wordlists, texts) and
recordings (audio, video, physiological), along with transcription
and annotation of the primary materials conforming to a broad
range of descriptive and analytical practices.
IV Data Access. Once data has been created and archived, there is
a variety of access modes. A region of data is identified by
browsing, by launching a query, or by following a reference. The
selection is displayed according to appropriate conventions and
styles, or converted into some other form (e.g. for statistical
analysis and visualization). The selection may be corrected,
imported into a document, analyzed, and annotated, leading to the
creation of secondary data and/or the elicitation of new primary
data. We need to develop suitable delivery mechanisms including
stylesheets, conversion tools, indexing methods, and query
languages, which encompass the needs for security and privacy. We
need standard application programming interfaces and a library of
reusable components, to support the development of software for
new modes of access.
Many of the activities listed above are already underway; the lure of
the technology is great despite the lack of infrastructure. However,
it is beyond the capacity of any single individual or institution to
develop this infrastructure of standards and tools on their own. There
is a pressing need for close cooperation between these initiatives, so
that scarce human, software and data resources are used optimally.
WORKSHOP OBJECTIVES
This workshop will lay the foundation of an open, web-based
infrastructure for collecting, storing and disseminating the primary
materials which document and describe human languages. The
infrastructure will support the modeling, creation, archiving and
access of these materials, using centralized repositories of
metadata, data, best practice guidelines, and open software tools.
To meet this goal, we have identified three main objectives which can
be substantially achieved at the present time:
Objective 1: to develop a comprehensive framework which identifies all
the infrastructural needs, designates appropriate roles for
existing results as pieces of an overall solution, and sets out a
coordinated response to the remaining challenges.
Objective 2: to found centralized repositories (and nominate existing
ones) for housing components of the infrastructure, so that data,
tools, formats and standards can be collected, indexed, and made
available to the community.
Objective 3: to begin construction of the repositories, by identifying
the contribution of past and present activities by the
participants and by other individuals and institutions, and
by gathering the results and their documentation.
CALL FOR PARTICIPATION
The workshop will include paper presentations and working sessions to
develop the infrastructure. Interested members of the community are
invited to participate in the workshop. There is a limit on available
places, and participants will be identified on the basis of submitted
abstracts. Funding is available for authors of accepted papers.
Abstracts. One-page abstracts are invited which describe substantive
contributions to the repositories, or which discuss concrete problems
for web-based language documentation and description, and describe
possible solutions.
Papers. Authors of accepted abstracts will be asked to prepare a
2,000-3,000 word paper plus associated materials.
Address submissions to: Steven.Bird at ldc.upenn.edu, Gary_Simons at sil.org
Timetable.
Friday 1 September Abstract deadline
Friday 29 September Acceptance notification
Friday 24 November Paper deadline
12-15 December Workshop
IMPORTANT: FOR FURTHER INFORMATION
Intending authors should consult the EXTENDED CFP, available from the
linguistic exploration page (http://www.ldc.upenn.edu/exploration/ ).
To be sure of receiving future announcements, please subscribe to the
LINGUISTIC-EXPLORATION mailing list, referenced from that page.
-
Steven Bird Gary Simons
University of Pennsylvania SIL International
Steven.Bird at ldc.upenn.edu Gary_Simons at sil.org
http://www.ldc.upenn.edu/sb http://www.sil.org/SIL/roster/simons.htm
---------------------------------------------------------------------------
LINGUIST List: Vol-11-1648