CFP: First Steps for Language Documentation of Minority Languages
Steven Bird
sb at CS.MU.OZ.AU
Fri Dec 19 20:21:02 UTC 2003
CALL FOR PAPERS
4th International SALTMIL (ISCA SIG) LREC workshop on
First Steps for Language Documentation of Minority Languages:
Computational Linguistic Tools for
Morphology, Lexicon and Corpus Compilation
24 May 2004, Lisbon, Portugal
http://193.2.100.60/SALTMIL/
Motivation and Aims
The minority or lesser used languages of the world are under increasing
pressure from the major languages (especially English), and many of them lack
full political recognition. Some minority languages have been well researched
linguistically, but most have not, and the vast majority do not yet possess
basic speech and language resources (such as text and speech corpora) which
are sufficient to permit research or commercial development of products.
If this situation were to continue, the minority languages would fall a long
way behind the major languages as regards the availability of commercial
speech and language products. This in turn would accelerate the decline of
those languages that are already struggling to survive. To break this vicious
circle, it is important to encourage the development of basic language
resources as a first step.
The workshop is intended to continue the series of SALTMIL (ISCA SIG) LREC
workshops:
1) "Language Resources for European Minority Languages" (LREC1998) Granada,
Spain.
2) "Developing Language Resources for Minority Languages: Re-usability and
Strategic Priorities" (LREC2000) Athens, Greece.
3) "Portability Issues in Human Language Technologies" (LREC2002) Las
Palmas de Gran Canaria, Spain.
The proposed workshop aims to share information on tools and best practice, so
that isolated researchers will not need to start from scratch. An important
aspect will be the forming of personal contacts, which can minimise
duplication of effort. Information on sources of funding for minority
languages will also be presented, and there will be discussion on the
strategic priorities that need to be addressed in this area. There will be a
balance between presentations of existing language resources, and more general
presentations designed to give background information needed by all
researchers present.
One potential means of ameliorating this imbalance in technology resources is
through encouraging research in the portability of human language
technology for multilingual application.
Topics of Interest
The workshop will focus on the following topics and languages:
* Existing projects in the field, with the opportunity to share useful
information
* Presentations of existing speech and text databases for minority
languages, with particular emphasis on software tools that have been found
useful in their development
* Linguistic corpora
* Automatic Speech Recognition
* Acoustic modelling
* Dictionary development
* Language modelling
* Natural Language Processing:
* Computational lexicography
* Morphology
* Syntax
* Machine Translation
* Information retrieval
Agenda
The first session of the workshop will consist of invited talks focusing on
current methodologies for language documentation and computational
linguistic tools which are available for minority languages. Each invited
speaker will be asked to comment on the following:
* how current research relates to minority languages, perhaps indicating
how they would approach their work within this context
* which methodologies and tools they find most useful
* which of those methodologies can be considered portable across different
languages
* how these tools could extend the use of the language
* how these foundations could be used in further work on HLT
The second session will be an oral session focusing on programmes and
initiatives for supporting minority language documentation. The main aim of
this session is to provide a forum for fostering new contacts among
researchers working in this area.
Invited speakers
* Dafydd Gibbon, Univ. Bielefeld.
"First steps in corpus compilation"
* Xabier Artola, Ixa group, Univ. of the Basque Country.
"First steps in lexicon resources"
* Bojan Petek, University of Ljubljana, Slovenia.
"Experiences defining a Network of Excellence
on Portability of Human Language Technologies"
* Kenneth R. Beesley, Xerox (to be confirmed)
"First steps in morphology"
Workshop Organizing and Program Committee
Bojan Petek, University of Ljubljana, Slovenia
Julie Berndsen, University College Dublin, Ireland
Oliver Streiter, EURAC; European Academy, Bolzano/Bozen, Italy
Atelach Alemu, Addis Ababa University, Ethiopia
Kepa Sarasola, University of the Basque Country, Donostia
Submission
Papers are invited that describe research and development in the area of
Human Language Technology portability. All contributed papers will be
presented in poster format. Each submission should include: title;
author(s); affiliation(s); and contact author's e-mail address, postal
address, telephone and fax numbers. Abstracts (maximum 500 words,
plain-text format) should be sent via email to:
Julie Berndsen Julie.Berndsen at ucd.ie
All contributions (including invited papers) will be printed in the
workshop proceedings (CD). They will also be published on the SALTMIL website.
Submissions of papers for poster presentations should follow
the same style as regular LREC papers and be no longer than
6000 words. The final details will be published as soon as they become
available.
We allow simultaneous paper submission to the workshop and the LREC
main conference. If a paper is accepted by both the conference and the
workshop, the paper will be presented at the conference, rather than at
the workshop. In that case, the author(s) should notify the workshop chair.
Important Dates:
Deadline for workshop abstract submission 11th February 2004
Notification of acceptance 25th February 2004
Final version of the paper for the workshop proceedings 1st April 2004
Workshop 24 May 2004, morning
Workshop Registration Fees
The registration fees for the workshop are:
* If you are not attending LREC: 85 Euro
* If you are attending LREC: 50 Euro
These fees will include a coffee break and the Proceedings of the Workshop.
Registration will be handled by the LREC Secretariat.