FYI: Endangered languages documentation mailbot

T.Matthew Ciolek tmciolek at coombs.anu.edu.au
Wed Mar 27 23:53:19 UTC 1996


--------------------- forwarded message -------------------
Date: Wed, 27 Mar 1996 19:02:59 +0100
To: Endangered-Languages-L at coombs.anu.edu.au
From: ue303bh at sunmail.lrz-muenchen.de (Dietmar Zaefferer)
Subject: FYI: Endangered languages documentation mailbot



      Announcing LDUL: The Language Documentation Urgency List
      ========================================================

Abstract
========
LDUL is an automatic mailbox and database for the collection and retrieval
of information on how urgently the individual languages of this world are
in need of documentation. The aim is to help in the decision of where to
focus fieldwork and in the writing of proposals for fund raising purposes.

Overall documentation urgency (DU) is measured as the average of six special
documentation urgencies in phonology, morphology, lexicon, text corpus, syntax,
and semantics/pragmatics. These in turn are measured as overall degree of
endangerment times special degree of documentation need, where endangerment is
the inverse of estimated language vitality and special documentation need is the
inverse of the estimated sufficiency of existing special documentation.

The language vitality score is calculated from eight different factors
such as age of youngest speaker, number of speakers, percentage of
monolingual speakers etc. (for details cf. the comments to the
demoquestionnaire.)


Background
==========
Most linguists know that the number of human languages in use is rapidly
decreasing. Estimates of the rate of disappearance vary between 12 and 50 a
year. This means that up to 10% of the linguistic heritage of humankind
will be irretrievably lost within ten years from today and that by the end of
the coming century, fewer than 1000 of the current over 5000 languages will
still be alive. It seems obvious that where it is not possible to save
these languages, they should at least not die without leaving a trace, that
is without being documented in a satisfactory way.
 
This is not to deny that the task of saving the peoples who are sometimes
endangered together with their language is much more important; it's just
that this task is not specific to linguists! Those who wish to participate
in helping endangered peoples may contact organizations like 
- Survival International, 310 Edgware Road, London W2 1DY, UK;
  phone ++44-71-2421441, fax ++44-71-2421771, or 
- Gesellschaft fuer bedrohte Voelker, Postfach 2024, D-37010 Goettingen;
  phone ++49-551-499060, fax ++49-551-58028, 
  e-mail: gfbv-germany at oln.comlink.apc.org


Current activities
===================
The international community of linguists is not unaware of this situation.
To mention just a few activities: 
- in January 1991 a special symposium entitled "Endangered Languages and
  their Preservation" was held at the Annual Meeting of the LSA, 
- in 1991 a volume on "Endangered Languages", edited by R.H. Robins and
  E.M. Uhlenbeck, appeared (Berg, Oxford/New York),
- in March 1992 "Language" published a collection of essays in its vol. 68, and
- in August 1992 the XVth International Congress of Linguists devoted a
  plenary session to this topic, both with the same title as the
  Robins/Uhlenbeck volume, 
- in October 1992 the working group 'endangered languages' of the DGfS
  (German Linguistic Society) published an "Informationsbroschuere zur 
  Dokumentation von 'Bedrohten Sprachen'",
- in August/September 1993 the University of Cologne hosted a summer school
  on language description and fieldwork,  
- at its September 1993 meeting, the Linguistic Association of Great 
  Britain (LAGB) had a special session on Endangered Languages, 
- in July 1994, a workshop on Language shift and maintenance in the
  Asia-Pacific region was held at La Trobe University, Melbourne, 
- on 7 Sep 1994 the Endangered-Languages-L electronic forum at ANU was
  established (http://coombs.anu.edu.au/CoombsHome.html),
- at the January, 1995, LSA Meeting there was an organized session called
  'Field reports/Endangered Languages',
- in February 1995, there was a conference on Endangered Languages at
  Dartmouth College,
- on April 21, 1995, the University of Bristol held a seminar
  on the conservation of endangered languages, and
- on November 18-20, 1995, there was an International Symposium on Endangered
  Languages at the newly founded ICHEL (International Clearing House for
  Endangered Languages, U of Tokyo, http://www.tooyoo.L.u-tokyo.ac.jp/).


What else can we do?
====================
Still, there remains a lot to be done. One central thing to do is FUND RAISING: 
We have to talk politicians, institutions, and other decision makers into giving money
not only for the preservation and documentation of species of birds and
insects, but also of cultures and languages. Note that biologists (three
joint societies) are demanding $3 billion a year (sic!) for a documentation
of biodiversity called Systematics Agenda 2000 (Nature, vol. 368, 3 March 1994,
p.3). How much do we need for a documentation of glotto- and ethnodiversity?

Another thing is to MOTIVATE linguists to do the necessary field work once
the money is available. This has to include a change in hiring policies and
CV evaluation: time spent on field work should count in a candidate's
favor, and not against him.

And a third thing is to provide the motivated and funded linguist with the
necessary INFORMATION on where to go first, since he is in the situation of a
fire fighter when fire is all over the place. And this is where the LINGUIST
discussion list (with over 7200 subscribers, more than 1.4 per existing
language) and similar ones come into play.


The contribution of the Language Documentation Urgency List (LDUL)
==================================================================

The world-wide computer networks and especially the LINGUIST LIST have
turned the world's community of linguists (at least its electronically
accessible part, but via them a lot more) into a global village. And if the
inhabitants of this village join forces, it should be easy to solve the
third problem mentioned above using the pot luck party method: Everybody
who knows about a language in need of proper documentation or in the
process of disappearing throws his knowledge into a pool called LDUL.
This is an automatic electronic mailbox and database with the following address:

        ldul at cis.uni-muenchen.de

If you want to know more about LDUL, simply send a message to this address
with the following entry under "Subject" (the message body may be empty,
or, if your mail system doesn't tolerate this, contain anything; it will
be ignored):

        about LDUL

and you will receive the information you are reading right now. If you just
want a short overview of supported mail commands, send a message (again with
empty or trash body) with the following entry under "Subject":

        mail commands

If you think you have discovered a bug in the program, send a corresponding
message (no empty body!) with the following entry under "Subject":

        bug
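
For those who prefer to script their requests, the following is a minimal
sketch of how such a message could be assembled. It only builds the message
and does not send it. The function name build_request and the sender address
are illustrative assumptions; the recipient is the address given above with
" at " written as "@".

```python
from email.message import EmailMessage

# Assumption: the LDUL address printed above, with " at " restored to "@".
LDUL_ADDRESS = "ldul@cis.uni-muenchen.de"


def build_request(sender: str, subject: str, body: str = "") -> EmailMessage:
    """Build (but do not send) a message whose command is in the Subject line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LDUL_ADDRESS
    msg["Subject"] = subject
    # Some mail systems reject empty bodies, so a placeholder is used
    # when no body is given.
    msg.set_content(body or "(ignored)")
    return msg


# Hypothetical sender address, for illustration only.
about = build_request("someone@example.org", "about LDUL")
bug = build_request("someone@example.org", "bug", "Describe the problem here.")
```

A bug report needs a real body, so the third argument is filled in for that
command.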


Contributing to LDUL
====================
If you want to contribute, send a message to the LDUL address with the following
entry under "Subject":

        send demoquestionnaire

The system will mail you a copy of a completed questionnaire with annotations
that specify the different questions.

Once you have read it, you will hopefully want to complete a blank
questionnaire. You can get one by sending another message with the following
entry under "Subject":

        send questionnaire

The system will send you by return mail a copy of a blank questionnaire.

Once you have completed a questionnaire (please read the annotations to the
different questions that come with the demoquestionnaire carefully!), write 

        deposit questionnaire 

into the subject field and mail the completed questionnaire to the same address.

The way it is treated there is the following:

If the language code on the questionnaire you have completed is identical with
the language code in a questionnaire already on file, your contribution is
added to that file, else a new file is opened. So it's the language code that
counts for the identification of a language and not its name(s), since there
are too many ambiguous language names! If you deposit a questionnaire without
the language code, LDUL will add it for you, if your language name is 
unambiguous, else it will complain.
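
The filing rule can be pictured with the following sketch. It is not the
actual LDUL program: the dictionaries, the function names, and the
made-up language "Examplish" with code "exm" are assumptions for
illustration. Only the two codes for the name Lisu (lis and tkl) are taken
from this announcement. Treating an unknown name like an ambiguous one is
also an assumption.

```python
class LanguageCodeError(Exception):
    """Raised when no unique language code can be determined."""


# Illustrative name table. "lisu" maps to two codes (ambiguous);
# "examplish"/"exm" is a made-up, unambiguous entry.
NAME_TO_CODES = {"lisu": ["lis", "tkl"], "examplish": ["exm"]}


def resolve_code(code, name, name_to_codes):
    """Return the given code, or look it up from an unambiguous name."""
    if code:
        return code
    candidates = name_to_codes.get(name.lower(), [])
    if len(candidates) == 1:
        return candidates[0]
    # Assumption: unknown names are rejected in the same way as ambiguous ones.
    raise LanguageCodeError(f"cannot determine a unique code for {name!r}")


def deposit(files, questionnaire, name_to_codes=NAME_TO_CODES):
    """Add the questionnaire to the file for its code, opening one if needed."""
    code = resolve_code(questionnaire.get("code"), questionnaire["name"],
                        name_to_codes)
    files.setdefault(code, []).append(questionnaire)
    return code


files = {}
deposit(files, {"code": "lis", "name": "Lisu"})   # opens a new file
deposit(files, {"code": "lis", "name": "Lisu"})   # added to the same file
deposit(files, {"name": "Examplish"})             # code is filled in
```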

For each vitality factor you have either checked one of the five values
(there is a minimum, a maximum, and three intermediate degrees), or the
option 'unknown'.
  
The weights of the first vitality factor (age of youngest speaker) are 0, 4,
8, 12, and 16 points, the weights of the other seven factors are 0, 3, 6, 9, 
and 12 points each. The total vitality score therefore ranges from 0 (worst
case, lowest vitality) to 100 (best case, highest vitality).
 
If you are not in a position to give information on all the factors,
the vitality score will be an interval rather than a point value.
Suppose all other factors add up to 20, but the language attitude is unknown.
Then the vitality score will be the interval between 20 (worst case, very
negative attitude) and 32 (best case, very positive attitude). 
If another questionnaire on the same language contains information about
language attitude, your 'unknown' contribution is simply ignored.
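
As a sketch of this arithmetic (not the actual LDUL code), the score can be
computed as an interval whose two ends coincide when every factor is known.
The function name, the use of None for 'unknown', and the position of
language attitude as the last factor are assumptions made for the example.

```python
FIRST_FACTOR_WEIGHTS = (0, 4, 8, 12, 16)  # age of youngest speaker
OTHER_FACTOR_WEIGHTS = (0, 3, 6, 9, 12)   # each of the other seven factors


def vitality_interval(levels):
    """Return (lowest, highest) possible vitality score.

    levels holds eight entries, the first for age of youngest speaker.
    Each entry is 0..4 (minimum to maximum) or None for 'unknown'.
    """
    if len(levels) != 8:
        raise ValueError("expected eight vitality factors")
    if all(level is None for level in levels):
        raise ValueError("no vitality information specified")
    low = high = 0
    for index, level in enumerate(levels):
        weights = FIRST_FACTOR_WEIGHTS if index == 0 else OTHER_FACTOR_WEIGHTS
        if level is None:
            # Unknown factor: worst case for the lower end, best for the upper.
            low += weights[0]
            high += weights[-1]
        else:
            low += weights[level]
            high += weights[level]
    return low, high


# Known factors add up to 20 (8 + 12); the last factor is unknown.
example = vitality_interval([2, 4, 0, 0, 0, 0, 0, None])
```

Here example is the interval from 20 to 32 described above.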

The sufficiency (quantity and quality) of existing documentation in a given
domain such as phonology is given a score of 1, if it is very high (completely
sufficient), and 0, if it is completely insufficient, with the three obvious
intermediate values (.25, .5, and .75).

The special Documentation Urgency or DU scores are calculated as 
(100 - vitality score) * (1 - special documentation sufficiency score).
For instance the text corpus DU score will be 100 (the maximum) just in
case vitality is 0 and the text corpus documentation sufficiency score
is 0 as well. The overall DU score as the average of the special DU scores
will therefore be 100 (the maximum) just in case vitality is 0 and all
special documentation sufficiency scores are 0 as well.
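
The formula can be written out as the following sketch for point-valued
vitality scores. The function names and the dictionary layout are
assumptions; the six domains and the formula itself are the ones stated
above.

```python
DOMAINS = ("phonology", "morphology", "lexicon", "text corpus",
           "syntax", "semantics/pragmatics")


def special_du(vitality, sufficiency):
    """Special DU: endangerment (100 - vitality) times need (1 - sufficiency)."""
    return (100 - vitality) * (1 - sufficiency)


def overall_du(vitality, sufficiency_by_domain):
    """Overall DU: the average of the six special DU scores."""
    scores = [special_du(vitality, sufficiency_by_domain[d]) for d in DOMAINS]
    return sum(scores) / len(scores)


worst_case = overall_du(0, {domain: 0 for domain in DOMAINS})
medium_case = special_du(20, 0.5)
```

With vitality 20 and a sufficiency of .5, the special DU score is
80 * .5 = 40. When the vitality score is an interval, one plausible reading
is to apply the formula to both ends, which gives an interval of DU scores.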

If you check only 'unknown' in either the vitality or the documentation
section, you will receive a corresponding error message:
 
        The error occurred while parsing :
        No vitality information specified in Section 2 !

or 

        The error occurred while parsing :
        No documentation information specified in Section 3 !


Consulting LDUL
===============
If you want to consult the list, you have several choices.
 
1. If you are interested in a specific language, say Lisu, send a message
to the same address as above with the following entry under "Subject":

        info on Lisu

Then LDUL will mail you the set of questionnaires that have been completed
with Lisu in the list of names and aliases. The subject line of the message
will look as follows:

Subject: Re:info on lisu ( TLC=lis )

The (TLC=lis) information is important here, since there is another 
language named Lisu with a different three letter code (TLC=tkl), and if
information is on file on that language as well, it will also be in your
mail. So if you are in doubt about the identity of the language you are
inquiring about, please consult the "Ethnologue Database" to find out
among other things the three letter code of your language, e.g. via the
World Wide Web:
http://www.sil.org/ethnologue/ethnologue.html
    or
http://www-ala.doc.ic.ac.uk/~rap/Ethnologue/
    or via Gopher:
gopher://sil.org/11/gopher_root/ethnologue/

2. If you want to know which languages have been treated so far, send the
subject entry

        get languages

to LDUL, and it will mail you the current alphabetical list of names of
languages about which information is on file.

3. If you want to know the current overall documentation urgency (DU)
ranking of the languages that have been treated so far, send the subject
entry

        get overall DU ranking

and LDUL will send you a list of the languages on file, ranked according to
their overall DU scores.

4. If you want to know one of the current special documentation urgency
(DU) rankings of the languages that have been treated so far, send one of
the subject entries

        get phonology DU ranking
        get morphology DU ranking
        get lexicon DU ranking
        get text corpus DU ranking
        get syntax DU ranking
        get semantics/pragmatics DU ranking

to LDUL, and it will send you a list of the languages on file with their
appropriate special scores, ranked according to these scores.
 
5. If you want to know the complete current statistics, send the message

        get statistics

to LDUL, and it will mail you its complete statistics in its current state:
- the number of completed questionnaires, 
- the number of languages treated, 
- the average number of completed questionnaires per language, 
- the list of languages ranked according to their overall DU score,
  giving for each language 
  - the language code, 
  - all language names collected so far, 
  - the number of completed questionnaires for this language, 
  - the overall documentation urgency score for this language together with
  - the special urgency rankings of its aspects (phonology etc.),
    with the average value first, followed by a maximum and a minimum value
    (the latter two matter only for interval values and can be neglected
    if they coincide),
  - the language's vitality score,
  - its endangerment score,
  - its overall documentation score, and
  - its overall documentation need.

Since this file will soon be rather bulky, be careful with this command.
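
To summarize the five kinds of query, here is a sketch of how the subject
entries could be told apart. This is a guess at the logic, not the actual
LDUL program: the function name, the returned labels, and the lowercase
normalization (suggested by the "info on lisu" reply shown above) are
assumptions.

```python
import re

DOMAINS = ("phonology", "morphology", "lexicon", "text corpus",
           "syntax", "semantics/pragmatics")


def parse_subject(subject):
    """Map a subject entry to a (kind, argument) pair."""
    # Normalize case and collapse runs of whitespace.
    text = " ".join(subject.strip().lower().split())
    if text.startswith("info on "):
        return ("info", text[len("info on "):])
    if text == "get languages":
        return ("languages", None)
    if text == "get statistics":
        return ("statistics", None)
    match = re.fullmatch(r"get (.+) du ranking", text)
    if match and (match.group(1) == "overall" or match.group(1) in DOMAINS):
        return ("ranking", match.group(1))
    return ("unknown", text)


query = parse_subject("get text corpus DU ranking")
```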


Spreading the word
==================
Those who don't have access to email but might have relevant knowledge
should be addressed by snail mail, fax, phone or face-to-face and should be
encouraged to pass on the news in order to achieve some kind of snowball
effect. Information on paper questionnaires will have to be transferred to
the computer by somebody who does have access to email.


!!!!!!!!!!!!!!!!!!!!!! WARNING !!!!!!!!!!!!!!!!!!
=================================================
There is no such thing as a perfect questionnaire and it should go without
saying that no linguist wishing to do fieldwork should base the choice of
his language exclusively on the LDUL data. 

One factor that had to be neglected in the design of the questionnaire
is the degree of relatedness of the language in question to the 'next' well-
documented language: the lower this degree, the higher the documentation
urgency.

Another thing is that the program cannot resolve contradicting information
on the same language. It will rather compute the average scores, e.g. if
one contributor thinks the quantity and quality of the documentation on
Tsachila phonology is medium (.5) and another one thinks it is low (.25),
it will come up with a score of .375.
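
The averaging can be sketched as follows. The function name and the use of
None for 'unknown' are assumptions, as is the idea that 'unknown' entries are
skipped for documentation scores in the same way as described earlier for
vitality factors.

```python
from statistics import mean


def combined_score(contributions):
    """Average the known values; None stands for 'unknown' and is skipped."""
    known = [value for value in contributions if value is not None]
    if not known:
        return None  # nobody supplied a value
    return mean(known)


# Two contributors: medium (.5) and low (.25).
phonology = combined_score([0.5, 0.25])
```

For the two values above this gives .375, the score mentioned in the text.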

The whole enterprise is a first and as such an experiment, but I think it
is worth trying. Let me conclude with a somewhat emotional appeal:

                     Linguists of all countries, unite
                     and try to save what can be saved
        of what you love enough to devote your professional life to:
                            contribute to LDUL!


Dietmar Zaefferer                                                 
Institut fuer Deutsche Philologie    Phone: +49 89 2180 2060 (office) or
Universitaet Muenchen                       +49 89 2180 3819 (office)
Schellingstr. 3                             +49 89 36 66 75  (home)        
D-80799 Muenchen                     Fax:   +49 89 2180 3871 (office)
Germany                              Email: ue303bh at sunmail.lrz-muenchen.de


Acknowledgements
================
I am indebted first of all to "jolly at cis.uni-muenchen.de" alias Patrick
Stein, who did the programming, then to Franz Guenthner for organizational
support, and last but not least all those who commented on earlier, more
naive versions of the questionnaire (none of whom may be held responsible 
for remaining shortcomings of it):

- Leila Behrens
- Bernard Comrie
- Bill Croft
- Christian Lehmann
- Hans-Juergen Sasse
- Christel Goldap
- Gunter Senft
- Vladimir Tourovski
- Nancy Dorian
- Martin Haspelmath
- Barbara F. Grimes


---------------------end of forwarded message -------------------


- regards -




