13.1870, FYI: Update: Translation, Endangered Langs

LINGUIST List linguist at linguistlist.org
Mon Jul 8 19:35:26 UTC 2002

LINGUIST List:  Vol-13-1870. Mon Jul 8 2002. ISSN: 1068-4875.

Subject: 13.1870, FYI: Update: Translation, Endangered Langs

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Consulting Editor:
        Andrew Carnie, U. of Arizona <carnie at linguistlist.org>

Editors (linguist at linguistlist.org):
	Karen Milligan, WSU 		Naomi Ogasawara, EMU
	James Yuells, EMU		Marie Klopfenstein, WSU
	Michael Appleby, EMU		Heather Taylor, EMU
	Ljuba Veselinova, Stockholm U.	Richard John Harvey, EMU
	Dina Kapetangianni, EMU		Renee Galvis, WSU
	Karolina Owczarzak, EMU		Anita Wang, EMU

Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
          Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>
          Zhenwei Chen, E. Michigan U. <zhenwei at linguistlist.org>

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Marie Klopfenstein <marie at linguistlist.org>


Date:  Mon, 8 Jul 2002 09:38:58 +0100
From:  andrius at ccl.bham.ac.uk
Subject:  Expert Training Course: Fasttrack to Translation

Date:  Mon, 8 Jul 2002 15:03:51 +0100
From:  <info at eldp.soas.ac.uk>
Subject:  The Endangered Languages Documentation Programme

-------------------------------- Message 1 -------------------------------

Date:  Mon, 8 Jul 2002 09:38:58 +0100
From:  andrius at ccl.bham.ac.uk
Subject:  Expert Training Course: Fasttrack to Translation

Expert Training Package



*** NEW DATES: September 17-20 ***

Extracting Translation Equivalents from Parallel Corpora

The Birmingham Centre for Corpus Linguistics (CCL) is pleased to
announce its 4-day Expert Training Package "FastTrack to Translation:
Extracting Translation Equivalents from Parallel Corpora", September
17-20, 2002. This package, aimed at professionals in translation and
other multilingual areas, builds on the success and experience of the
first one, which took place in September last year.

Our expert training package uses parallel texts (original texts with
their translations) to help translators find suitable translation
equivalents. We present cutting edge research in the use of parallel
corpora for detecting translation equivalents. Methods are introduced,
using both monolingual and multilingual corpora, for exploring units of
meaning in texts. These units of meaning are often larger and more
complex than the simple word. Most units of translation are compounds,
collocations or even phrases. As for single words, most of them are
ambiguous. Participants will be shown how context can be used to
disambiguate words by investigating their contextual
profiles. The expert training package will also focus on retrieving the
translation equivalents and learning how the corpus data can help us
produce translated texts that display the "naturalness" of the target
language.

One application of the translation units is to create new translation
databases which, for the first time, enable their users to translate
correctly into a foreign language of which they have only limited
command. Implemented into translation platforms, the databases will
facilitate translations more than customary translation memories. The
results and problems from current research will be presented and
discussed. The corpus approach is also relevant for terminology. A large
proportion of terminological material in new texts is neither
standardised nor even recorded in a termbank. Parallel texts taken from
the Internet are often the only source for finding translation

The expert training package will demonstrate how existing software can
be adapted and combined for the extraction of translation units and
their equivalents. Parallel corpora are available for a number of
European languages paired with English and other European languages.
There is also a Chinese-English parallel corpus. The participants will
also be invited to use monolingual corpora, such as the Bank of English,
during the hands-on sessions. The issues presented include segmentation,
lemmatisation, POS-tagging, sentence alignment, lexical alignment, and
the detection of units of meaning. We will demonstrate how context
profiles can be used to select the proper translation equivalents. We
will suggest various ways to integrate corpus findings into bilingual
dictionaries, multilingual termbanks and databases of translation
equivalents.
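
As a rough illustration of the lexical alignment step mentioned above
(not the software used on the course), the Python sketch below scores
candidate word pairs in a sentence-aligned toy corpus with the Dice
coefficient. The three English-German sentence pairs, the function
names and the 0.5 threshold are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus: sentence-aligned English/German pairs.
PAIRS = [
    ("the house is small", "das Haus ist klein"),
    ("the house is old", "das Haus ist alt"),
    ("the garden is small", "der Garten ist klein"),
]


def dice_equivalents(pairs, min_score=0.5):
    """Score source/target word pairs by the Dice coefficient.

    Counts are per aligned sentence pair: a word is counted once for
    each sentence it occurs in, however often it is repeated there.
    """
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Every source word co-occurs with every target word of the pair.
        joint.update(product(src_words, tgt_words))

    scores = {}
    for (s, t), n in joint.items():
        score = 2 * n / (src_count[s] + tgt_count[t])
        if score >= min_score:
            scores[(s, t)] = score
    return scores


def best_equivalent(scores, word):
    """Return the highest-scoring target candidate for a source word."""
    candidates = {t: sc for (s, t), sc in scores.items() if s == word}
    return max(candidates, key=candidates.get) if candidates else None


if __name__ == "__main__":
    scores = dice_equivalents(PAIRS)
    print(best_equivalent(scores, "small"))  # klein
```

With so little data some words tie (here "house" scores equally with
"das" and "haus"), which is one reason real systems add lemmatisation,
larger corpora and contextual profiles before proposing equivalents.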

Our expert training package includes lectures which focus on theoretical
approaches, instructions in the methodology and practice of multilingual
corpus linguistics, and software presentations. Experience gained from
our last course has encouraged us to shift the emphasis to supervised
hands-on sessions which take place in a modern computer lab.
Participants are invited to address specific topics and time will be set
aside in the timetable for those who want to take the opportunity to
present their own work.

We also put an emphasis on the social side of our packages. Every
evening, the participants are invited to explore the culinary diversity
our city has to offer. We will be feasting at restaurants in the
vicinity, visiting venues varying from a traditional carvery to
Birmingham's home-grown Indian Balti cuisine (vegetarian options
available).

Staff teaching this expert training package include: Professor John
Sinclair (The Tuscan Word Centre, Italy), Professor Michael Barlow (Rice
University, US), Professor Wolfgang Teubert (CCL, Birmingham), Dr
Pernilla Danielsson (CCL, Birmingham), Dr Maeve Olohan (UMIST), Jörg
Tiedemann (Uppsala); members of CCL and of Collins Dictionary Division,
Glasgow. More staff to be announced. The expert training package is
organised in conjunction with the Concerted Action TELRI (Trans-European
Language Resources Infrastructure) and with the Tuscan Word Centre.

Audience: The expert training package is targeted at professionals in
the language industry, for example dictionary publishing, multilingual
language technology, and translation services. Minimum number of
participants: 8, maximum number: 15.

Date and duration: The expert training package begins at 9.30 on Tuesday
September 17 and finishes at 22.00 on Friday September 20, 2002.

Fee: Participation per person: GBP 950 including coffee breaks, lunches
and dinners. Reduction in fee to GBP 850 for early registration before
May 13.

Accommodation: Participants are requested to make their own
reservations. We recommend Lucas House (University Guest House situated
5 minutes' walk from the course venue) for accommodation (cost per
person, per night, single occupancy: GBP 51.97). Tel no: +44 (0)121 625
33 83 Fax no: +44 (0)121 414 6339

Preliminary Schedule:

Place: CETADL, University of Birmingham

Tuesday, September 17

09.30 - 10.00: programme presentation
10.00 - 11.00: lecture (1): Corpus linguistics and lexicography
11.00 - 11.20: coffee break
11.20 - 12.20: resources and tools (1): Methods in monolingual concordancing
12.20 - 13.00: hands-on session (1): Using WordSmith with exercises
13.00 - 14.30: lunch
14.30 - 15.30: lecture (2): Corpus linguistics and semantics
15.30 - 16.00: coffee break
16.00 - 16.45: resources and tools (2): Monolingual concordances 2
16.45 - 17.45: hands-on session (2): Using MonoConc with exercises
20.00 onwards: drinks/dinner


Wednesday, September 18

09.30 - 10.00: announcements/discussion
10.00 - 11.00: lecture (3): Translation and the corpus
11.00 - 11.30: coffee break
11.30 - 12.00: resources and tools (3): Working with parallel concordancer (ParaConc)
12.00 - 13.00: hands-on session (3): Using ParaConc with
13.00 - 14.30: lunch
14.30 - 15.30: lecture (4): How to extract units of meaning from
15.30 - 16.30: resources and tools (4): Working with translation
16.30 - 17.00: coffee break
17.00 - 17.30: hands-on session (4): Using Translation Software
17.30 - 18.30: participants' presentations
19.45 - 21.00: drinks/dinner


Thursday, September 19

09.30 - 10.00: announcements/discussion
10.00 - 11.00: lecture (5): Three approaches to translation
11.00 - 11.30: coffee break
11.30 - 12.00: resources and tools (5): Working with databases
12.00 - 13.00: hands-on session (5): Using translation databases
13.00 - 14.30: lunch
14.30 - 15.30: lecture (6): Corpus linguistics and terminology
15.30 - 16.30: resources and tools (6): Term extraction and terminological databases
16.30 - 17.00: coffee break
17.00 - 17.30: hands-on session (6)
17.30 - 18.30: participants' presentations
19.45 - 21.00: drinks/dinner


Friday, September 20

09.30 - 10.00: announcements/discussion
10.00 - 11.00: lecture (8): The unit of meaning in translation
11.00 - 11.30: coffee break
11.30 - 12.00: resources and tools (7): Linguistically annotated corpora: tagging and lemmatising
12.00 - 13.00: hands-on session (6): Working with taggers and
13.00 - 14.30: lunch
14.30 - 15.30: lecture (9): Parallel corpora and bilingual
15.30 - 16.30: individual tutoring
16.30 - 17.00: coffee break
17.00 - 18.30: final discussion / conclusion
20.00 - 22.00: farewell dinner

-------------------------------- Message 2 -------------------------------

Date:  Mon, 8 Jul 2002 15:03:51 +0100
From:  <info at eldp.soas.ac.uk>
Subject:  The Endangered Languages Documentation Programme

Please find below outline details of a new research programme for the
documentation of Endangered Languages.

Initial announcement

The Endangered Languages Documentation Programme

A.   A new research programme for the documentation of endangered languages.

There is a very strong prospect that a private foundation will initiate a
programme of grants to support the documentation of endangered languages,
and appoint the School of Oriental & African Studies, London University
[SOAS] to administer the scheme.  The prospective Invitation to Apply, which
is likely to be disseminated in late August, will contain full guidelines
and contact details for any further inquiries.  In the interim, no further
details will be made available and prospective applicants are requested to
avoid contacting SOAS with inquiries.

The purpose of this announcement is to indicate the rationale of the
putative programme and enable potential applicants to begin considering the
details of their possible proposals.

B.  Rationale.

The rationale of such a programme will be familiar to potential applicants:
the pace at which languages are becoming extinct is increasing throughout
the world.  Furthermore, since only about one-third of the world's languages
have literate traditions, the vast majority of languages which die will
leave no substantial record of themselves, or the cultural traditions that
they have sustained.  Quite apart from the loss of individual cultural
expressions, this process reflects a grave diminution in human and cultural
diversity and a loss of the knowledge on which they are based and which they

The objective of the proposed programme would be twofold:  to encourage the
development of linguistic fieldwork in endangered languages, especially by
younger scholars with a grounding in linguistic theory, who will thereby
also be provided with support between basic graduate work and the assumption
of university positions; and to support the documentation of as many
threatened languages as possible, focused on where the danger of extinction
is greatest, facilitating the preservation of culture and knowledge, and
creating repositories of data for the linguistic and social sciences, and of
course for indigenous communities. Such documentation should, therefore,
have regard not only to the formal content and structure of languages, but
also to the varied social and cultural contexts within which languages are
used.  In addition to the intellectual quality of applications, principal
grounds for support will be the degree of endangerment and the urgency of
the issues.

C.  Applications.

Applications will be invited from researchers - who might include suitably
qualified research students or postdoctoral candidates, as well as senior and
established academics - with qualifications in and, ideally, experience of
field linguistics.  It is anticipated that all applicants will have, or will
have developed in advance of funding, a formal link with (preferably an
established position in) a university or comparable research institution.

The core of the programme will probably be grants to support more or less
elaborate projects for the documentation of individual or closely related
endangered languages, involving one or more researchers and receiving
support for up to three or, in exceptional circumstances, four years.
However, individuals (including suitably qualified research students and
postdoctoral fellows) may apply for grants.

In the first instance, applicants will be expected to submit a relatively
brief Summary Proposal Form.  These will be assessed, and applicants whose
proposals appear to conform to the programme's expectations as to importance
and quality will be invited to submit a more detailed application.

It is anticipated that in this first 'round' the date for submission of
Summary Proposals will be mid-October 2002; invitations to submit detailed
applications will be despatched in late November 2002; and the closing date
for detailed applications will be early January 2003.

Detailed applications will have to conform to a variety of standards
(including ethical and technical standards), which will be specified in the
formal Invitation to Apply some time in late August.  Meanwhile, potential
applicants are requested not to contact SOAS.

LINGUIST List: Vol-13-1870
