11.294, Sum: Reactions to Languages Listed in ISO 639

Sun Feb 13 02:23:33 UTC 2000

LINGUIST List:  Vol-11-294. Sat Feb 12 2000. ISSN: 1068-4875.

Subject: 11.294, Sum: Feedback/Languages Listed in ISO 639

Moderators: Anthony Rodrigues Aristar: Wayne State U.<aristar at linguistlist.org>
            Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>
            Andrew Carnie: U. of Arizona <carnie at linguistlist.org>

Reviews: Andrew Carnie: U. of Arizona <carnie at linguistlist.org>

Associate Editors:  Martin Jacobsen <marty at linguistlist.org>
                    Ljuba Veselinova <ljuba at linguistlist.org>
                    Scott Fults <scott at linguistlist.org>
                    Jody Huellmantel <jody at linguistlist.org>
                    Karen Milligan <karen at linguistlist.org>

Assistant Editors:  Lydia Grebenyova <lydia at linguistlist.org>
                    Naomi Ogasawara <naomi at linguistlist.org>
                    James Yuells <james at linguistlist.org>

Software development: John H. Remmers <remmers at emunix.emich.edu>
                      Sudheendra Adiga <sudhi at linguistlist.org>
                      Qian Liao <qian at linguistlist.org>

Home Page:  http://linguistlist.org/

Editor for this issue: Karen Milligan <karen at linguistlist.org>

=================================Directory=================================

1)
Date:  Sun, 6 Feb 2000 22:43:32 +0000
From:  Nicholas Ostler <nostler at chibcha.demon.co.uk>
Subject:  ISO 639: reactions from users/potential users

-------------------------------- Message 1 -------------------------------

Date:  Sun, 6 Feb 2000 22:43:32 +0000
From:  Nicholas Ostler <nostler at chibcha.demon.co.uk>
Subject:  ISO 639: reactions from users/potential users

[
Note from Nicholas Ostler:
As before, this is John Clews' message (sent Sun, 06 Feb 2000 12:16:57 GMT),
and replies or comments should go to him direct at
Endanger at sesame.demon.co.uk (John Clews)
]
Query Issue LINGUIST 11.230

This week I posted a questionnaire to various email lists, based on
the comparative list of LC-MARC codes, ISO 639-2 codes and ISO 639-1
codes, that you have seen before.

There was a great deal of interest in this from various linguists in
several parts of the world. Some of the comments may be useful, and
some less so, and we may or may not want to deal with all the
languages discussed.

I would like to thank all who sent replies about the codes or
languages, and would like to feed this digest, without any comment at
this stage, back to the lists whose members responded so well, and so
quickly.

The responses cover several different groups of languages, and are
arranged just in the order they were received, not by any language
order. However, as it happens several responses on South Asian
languages are all together, further down, as several responded from
the South Asian Linguists list <VYAKARAN at LISTSERV.SYR.EDU>, and my
apologies to subscribers to both lists who have therefore seen this
twice. Text below this point is identical in the postings to both
lists.

Some of the replies may suggest additional codes, or additional
strategies, for what we do in the ISO 639 JAC, and/or in other codes
lists.

If I receive any other replies after this, I shall also consider any
further information which adds to what has already been sent.

Best regards

John Clews

NB: Replies are separated by dashes, thus:

- ----------------------------------------------------------

> Date: Thu, 03 Feb 2000 20:03:20 -0800
> To: John Clews <Endanger at sesame.demon.co.uk>
> From: "John A. Halloran" <seagoat at primenet.com>
> Subject: Sumerian language code
> Mime-Version: 1.0
> Content-Type: text/plain; charset="us-ascii"
> Status: R

How did Sumerian get to be sux, when sum is not in use? That doesn't
make any sense.

Regards,

John Halloran
http://www.sumerian.org/
e-mail: seagoat at primenet.com

- ----------------------------------------------------------

> Date: Thu, 03 Feb 2000 23:09:06 -0500
> To: John Clews <Endanger at sesame.demon.co.uk>
> From: Claire Bowern <bowern at fas.harvard.edu>
> Subject: Language codes

Hi. I work on Australian languages. It's a shame you've only got two codes
for about 800 languages - one for "Australian" - aus - and one for
"Papuan-Australian (other)" - paa. No separate codes for the 180 odd
Australian languages and no codes for the 700 odd Papuan ones?  I'm not
sure what the latter one would refer to - there aren't any real connections
between Papuan and Australian languages. You've got codes for dead
languages, but nothing for Arrernte, Warlpiri, Tolai, Motu or
Pitjantjatjara? There's a lot more writing going on in Arrernte than there
is in Zuni, but . How about including some more languages from Oceania?

Regards,

Claire Bowern

Department of Linguistics
Harvard University
305 Boylston Hall
Cambridge, MA  02138
ph: 617-493-4230
http://www.fas.harvard.edu/~lingdept/

- ----------------------------------------------------------

> Date: Wed, 3 Feb 1999 20:59:26 -0800 (PST)
> From: David Robertson <drobert at tincan.tincan.org>
> To: endanger at sesame.demon.co.uk
> Subject: ISO 639 & ChInuk (Chinook Jargon)

Hello, John,

Thanks for your posting on LINGUISTLIST about ISO 639 codes.

It's good to see that there is a code, chn, for ChInuk (Chinook Jargon).

I want to let you know that if and when the ISO get down to the process of
establishing a standard character-set for this language, I and the CHINOOK
list stand ready to advise and help.

ChInuk is fortunate in that quite a few technically savvy people are
involved in the preservation and dissemination of the language.  In fact,
we've discussed ISO before on our list.  Between the linguists and
computer science people in our ranks, we can provide good feedback to ISO
if, as we hope, we are called upon.

I can be reached at the above address; CHINOOK list postings go to
chinook at listserv.linguistlist.org.

Lhush pulakli!
Dave

 *VISIT the archives of the CHINOOK jargon and the SALISHAN & neighboring*
                    <=== languages lists, on the Web! ===>
           http://listserv.linguistlist.org/archives/salishan.html
           http://listserv.linguistlist.org/archives/chinook.html
- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 23:06:25 -0600 (CST)
> From: "James L. Fidelholtz" <jfidel at siu.buap.mx>
> To: John Clews <Endanger at sesame.demon.co.uk>
> Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

[General comment:  there are various cases (eg, alg, sal) where lg
families are given, but not (all) the individual languages, which seems
to go against the stated purpose of the list, insofar as I understand
it. Specifically, there does not seem to be any reason NOT to be as
complete as possible in such a list.  In that sense, an appropriate
tactic might be to use the Ethnologue (eg) basically in its entirety.
If there are any criteria for excluding lgs., I missed them in your
explanations.  Being currently spoken seems a bad one, as does being
spoken by X number of people as a minimum.  Below I have interspersed
my comments within brackets in the list, more or less in the
appropriate place, although not always and only alphabetically
well-placed]

>- ----------------------------------------------------------
>  LC  ISO 639-2   ISO 639-1  Language name in English
>- ----------------------------------------------------------
>      alg                    Algonquian languages

[Note that there is also one of the Alg. lgs. called "Algonquin"
(sometimes); among the missing is Menominee ("men")]

[add]

[Beothuk "beo" -- a possibly Algonquian lg. not spoken since 1832]

>      lus                    Lushai
[See Salish]

>      mic                    Micmac

[Note: the PC name is now 'Mi'gmaq'; this could be "mi'" (if single
quotes are permitted), or "mig"; Algonquianists tend to use "mc"]

[add]

[missing: Mixe "mix"]

>      nah                    Nahuatl (LC listed earlier as Aztec)

[In some circles, this is written "Nauatl", without any accents]

>      nai                    North American Indian (Other)

[a rather diffuse category, esp. considering the ambiguity of "North
American" (eg, are Mexican lgs. included? Even many linguistic
classification schemes do not include them)]

>      sal                    Salishan languages

[Note: lots of individual lg names missing in this list for Salish lgs.;
specifically, the lg. formerly known as Skagit (?ska), now with the PC
name Lushootseed (?lus), in any case virtually no longer spoken (does
this matter?)]

[add]

[Totonac(o), presumably "tot" is missing here, as is Tepehua, presumably
"tep", the only two clear members of their family, possibly related to
Huasteco--also missing, "hua"?-- and Mayan--your "myn", but actually
consisting of many languages/varieties, eg Lacandon "lac"]
[missing: Zoque "zoq", possibly Mayan]
- -
>LINGUIST List: Vol-11-230
>

James L. Fidelholtz                     e-mail: jfidel at siu.buap.mx
Maestría en Ciencias del Lenguaje
Instituto de Ciencias Sociales y Humanidades
Benemérita Universidad Autónoma de Puebla, MÉXICO

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 11:01:02 +0000 (GMT)
> From: Dr Tony McEnery <mcenery at comp.lancs.ac.uk>
> To: John Clews <Endanger at sesame.demon.co.uk>
> Subject: Re: Languages listed in ISO 639: feedback sought

Hi John,

I think the work you are doing is splendid. The language I have been
working most closely with recently is Sylheti - Nick beat me to the
draw on getting that one added to your list. Though not an 'official'
language, there is a growing movement in the UK Bengali community at
least to identify it as such and revivify its writing system, Sylheti
Nagri.

Best,

T

Dr. Tony McEnery,
Reader in Multilingual Corpus Linguistics,
Dept. Linguistics,
Lancaster University,
Lancaster,LA1 4YT, UK.

Tel: +44 (0) 1524 593024
Fax: +44 (0) 1524 843085
email: mcenery at comp.lancs.ac.uk (JANET)

- ----------------------------------------------------------

> From: "A.F. GUPTA" <engafg at ARTS-01.NOVELL.LEEDS.AC.UK>
> Organization: University of Leeds
> To: Endanger at sesame.demon.co.uk
> Date: Fri, 4 Feb 2000 12:28:15 GMT

You seem to have only one code for CHINESE.  That's OK for the
written language, but I think you certainly need codes for at least
some of the major varieties of Chinese in speech, in line with normal
practice of using CHINESE mostly for the written language.

You certainly need:

Mandarin
Cantonese
Hokkien

and probably several others.

cpe, cpf, cpp should all be OTHER than named cpe/f/p s.  You have
Papiamento and Bislama -- but you need more named creoles.  Omissions
I noticed which should certainly be there are Haitian Creole,
Kristang, PNG Pidgin.

Hope this is useful.

Anthea

Anthea Fraser GUPTA : http://www.leeds.ac.uk/english/$staff/afg
School of English
University of Leeds
LEEDS LS2 9JT
UK

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 09:32:28 -0500
> To: <Endanger at sesame.demon.co.uk>
> From: dbeck at umich.edu (David Beck)
> Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

Below is an additional family, Totonacan-Tepehua (Mexico), which seems to
have been omitted. This is a large family with over 100,000 speakers of
eight or ten languages.

       tot                    Totonacan-Tepehuan

David Beck
Visiting Assistant Professor
Programme in Linguistics
University of Michigan
Room 1087 Frieze Building
105 South State St.
Ann Arbor, MI 48109-1285
office: (734) 647-2156
FAX:    (734) 936-3406
e-mail: dbeck at umich.edu
http://www-personal.umich.edu/~dbeck/

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 09:08:55 -0600 (CST)
> From: Gregory David Anderson <gdanders at midway.uchicago.edu>
> X-Sender: gdanders at harper.uchicago.edu
> To: Endanger at sesame.demon.co.uk
> Subject: Language list codes

Hi, I noticed your list of language codes and wanted to draw your
attention to some oversights on the list. In particular, the languages of
Siberia are very poorly represented in your list.

Namely, there seem to be entire families missing, including
Ket (Yeniseian)
Nivkh (aka Gilyak, language isolate)
Yukaghir (isolate, or sister to Uralic)
Chukotko-Kamchatkan (Chukchi/Chukchee, Koryak, and Itel'men (Kamchadal)).

At least mention of these entire families should be made.

Apropos of the Munda languages, I noticed you had Santali separate from the
rest. I would suggest having Mundari-Ho (ca. 2 million speakers, with a
literature) as well as the South Munda languages Kharia and Sora; the rest
could/should be an 'other' category.
These latter are not as important, I think, as the oversight of entire
language families which occupy a large area in central and eastern
Siberia.

I hope this is of some use to you.

Greg Anderson

Gregory D. S. Anderson
Department of Linguistics
University of Chicago
1010 E. 59th St.
Chicago, IL  60637  USA

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 09:12:31 -0600 (CST)
> From: Gregory David Anderson <gdanders at midway.uchicago.edu>
> X-Sender: gdanders at harper.uchicago.edu
> To: Endanger at sesame.demon.co.uk
> Subject: Further omissions from list

hi, one more omission I noted in your list

Burushaski--a language isolate from Pakistan

Greg Anderson

Gregory D. S. Anderson
Department of Linguistics
University of Chicago
1010 E. 59th St.
Chicago, IL 60637  USA

- ----------------------------------------------------------

> To: Endanger at sesame.demon.co.uk
> Subject: ISO language codes
> Date: Fri, 04 Feb 2000 11:34:04 -0500
> From: Taylor Roberts <troberts at MIT.EDU>

Hi, thanks for posting your list to LINGUIST!  I don't have anything
in particular to say about the codes themselves, but I thought I would
contribute an alternate spelling of 'Pushto'.  There are several ways
this has been rendered, but I think 'Pashto' must be used just as
often.  If this alternate spelling could be included, it might be
helpful--though I realize that the names are based on LOC, and that
the codes are your main point of interest.

Thanks and best wishes,
Taylor

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 22:59:11 -0600
> To: Endanger at sesame.demon.co.uk
> From: Rick Mc Callister <rmccalli at sunmuw1.MUW.Edu>
> Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

        This is a very good idea. However, if it's something for
international consumption, I strongly suggest that you use codes based on
the speakers' name for the language. If that name is unknown in the case of
extinct languages, you might wish to use the Latin name for Western
European languages and other local prestige languages for other
regions/continents. I think the use of English plays into the widespread
notion that ascii and the computing infrastructure in general is
Anglocentric at best and possibly racist.
        e.g. esk, eus for Basque, esp for Spanish, deu for German, nih for
Japanese, gai for Scots Gaelic/Gaidhlig, gae for Irish Gaelic/Gaeilge, cym
for Welsh, etc.
        I have read that Wendish is a somewhat deprecative name for
Sorbian/Lusatian

- ----------------------------------------------------------

> Date: Fri, 04 Feb 2000 10:05:37 -0700
> To: John Clews <Endanger at sesame.demon.co.uk>
> From: Caroline L Rieger <crieger at gpu.srv.ualberta.ca>
> Subject: languages listed in ISO 639
> Mime-Version: 1.0
> Content-Type: text/plain; charset="us-ascii"

Dear John Clews,

I missed Luxembourgish (Letzeburgesch) in your list. It is a Germanic
language that does not have too rich a literary tradition, but one that
started in 1829. Recently, more and more authors from Luxembourg pride
themselves on publishing in Luxembourgish. The corpus is thus growing rapidly.

If you need more information, please feel free to contact me.

Yours sincerely,

Caroline L. Rieger
**********************
Caroline L. Rieger
Ph.D. Candidate
University of Alberta
Dept.of Modern Languages & Cultural Studies
200 Arts Bldg.
Edmonton, AB  T6G 2E6
Canada
Phone: 001 - 780 - 438 - 1062
Fax: 001 - 780 - 492 - 9106
E-mail: crieger at ualberta.ca

- ----------------------------------------------------------

> Date: Fri, 04 Feb 2000 11:12:24 -0600
> To: Endanger at sesame.demon.co.uk
> From: Jill Wagner <jmwagner at iastate.edu>
> Subject: ISO lang names

When you said "omissions" I'm assuming you meant of languages.

I work on the Interior Salish language spoken in northern Idaho, USA,
commonly referred to as Coeur d'Alene and commonly abbreviated CdA. In a
cursory check of the list, I did not see this included.  The indigenous
name for the language, Snchitsu'umshtsn, is rarely used even by tribal
members and speakers, but is worth noting.

- ----------------------------------------------------------

> From: Mark_Mandel at Dragonsys.com
> To: Endanger at sesame.demon.co.uk
> Message-ID: <8525687B.006F42D7.00 at notes-mta.dragonsys.com>
> Date: Fri, 4 Feb 2000 15:24:15 -0500
> Subject: Additions to ISO 639

With reference to the LINGUIST List announcement at
http://linguistlist.org/issues/11/11-230.html :

(1) I suggest adding
     kli  Klingon

This may seem a joke to you, but there is probably more material in Klingon
than in Volapuk (vol), and unlike Volapuk the amount is steadily growing. The
Klingon Language Institute (www.kli.org) and its quarterly journal _HolQeD_
(http://www.kli.org/study/HolQeD.html ; ISSN: 1061-2327; catalogued by MLA) are
the centers of study of this language, originally developed by Dr. Marc Okrand.

(2) "Sign languages (not expanded further)" is about as acceptable as would be
"Asian languages (not expanded further)". I will start with proposing
     asl  American Sign Language
and continue by posting a message about your request on SLLING-L, the Sign
Language Linguistics List, to elicit contributions from sign linguists familiar
with others of the dozens or hundred-plus of known sign languages in the world.

   Mark A. Mandel : Senior Linguist and Manager of Acoustic Data
         Mark_Mandel at dragonsys.com : Dragon Systems, Inc.
 320 Nevada St., Newton, MA 02460, USA : http://www.dragonsys.com/
                     (speaking for myself)
- ----------------------------------------------------------

> Date: Fri, 04 Feb 2000 18:14:43 +0000
> From: António Emiliano <a.emiliano at mail.telepac.pt>
> Reply-To: a.emiliano at mail.telepac.pt
> Organization: Universidade Nova de Lisboa / Dep. de Linguística
> To: John Clews <Endanger at sesame.demon.co.uk>
> Subject: Languages listed in ISO 639: feedback sought

Dear Sir

I would like to propose that "Mirandês" (with a circumflex over the E)
be included in ISO 639.

It is the sole minority language native to Portugal, and is spoken in
the North-West. It was originally a dialect of Leonese, the language
of the old kingdom of León.

An orthography has recently been developed, and Mirandês is now taught
in school (from grammar school level).

I would suggest that the 2 letter code "MD" be included in ISO 639-1,
and the 3 letter code "MIR" be included in ISO 639-2.

For more information on this language you can contact the Linguistics
Centre of the University of Lisbon (Centro de Linguística da
Universidade de Lisboa), where dialectologists can answer any query of
yours. Their URL is: http://www.clul.ul.pt.

Best regards

António Emiliano

NB: phone numbers in Portugal have changed as of 31 Oct 99

 Dr Antonio H A Emiliano,
 Asst. Professor of Linguistics
 UNIVERSIDADE NOVA DE LISBOA
 Faculdade de Ciencias Sociais e Humanas
 Departamento de Linguistica
 Avenida de Berna, 26 - C
 1069-061 LISBOA  PORTUGAL
 tel:    +351-21 793 35 19
 fax:    +351-21 797 77 59
 e-mail: a.emiliano at mail.telepac.pt

 Centro de Linguistica da Universidade Nova de Lisboa
   http://www.fcsh.unl.pt/hp/unidades/cecllm.htm

 Nucleo Cientifico de Estudos Medievais
   http://www.fcsh.unl.pt/hp/unidades/ncem/index.html

        "Ælc mann the wisdom lufath bith gesælig"
                  = Ælfric of Eynsham =

- --------------------------------------------------------

> Subject: Missing Language
> To: John Clews <Endanger at sesame.demon.co.uk>
> Bcc:
> From: damonj at unk.edu
> Date: Fri, 4 Feb 2000 13:10:55 -0600

A language missing on your list is O'odham (formerly known as
Papago), a language of southwestern Arizona. Its broader grouping is
Piman, a Uto-Aztecan branch.

John Damon
University of Nebraska at Kearney
Kearney, NE 68849-1320
damonj at unk.edu

- ----------------------------------------------------------

> To: SLLING-L at ADMIN.HUMBERC.ON.CA
> cc: Endanger at sesame.demon.co.uk
> Message-ID: <8525687B.0071D9EE.00 at notes-mta.dragonsys.com>
> Date: Fri, 4 Feb 2000 15:52:31 -0500
> Subject: Languages listed in ISO 639: feedback sought

From LINGUIST List #11-230
Please send replies not to me but to
   John Clews <Endanger at sesame.demon.co.uk>
Do not simply post them to SLLING-L; he does not read it.

I attach an extract from a request for comments on and contributions to a list
of codes for representing the names of languages, ISO [International
Organization for Standardization] Standard #639. It was posted on the
LINGUIST List. You can find
the full text on the Web at
     http://linguistlist.org/issues/11/11-230.html

The reason I am posting it here is that the only mention of sign languages in
the list is a single item, indicating that the Library of Congress uses the
3-letter code "sgn" for "Sign languages", without further distinction between
them, and that neither of the existing versions of the ISO standard has
anything at all for sign languages. (Please do not blame or flame Mr. [Dr.?]
Clews. He is not responsible for this list!)

I have written to him as follows:

     "Sign languages (not expanded further)" is about as acceptable as
     would be "Asian languages (not expanded further)". I will start
     with proposing asl American Sign Language and continue by
     posting a message about your request on SLLING-L, the Sign
     Language Linguistics List, to elicit contributions from sign
     linguists familiar with others of the dozens or hundred-plus of
     known sign languages in the world.

I strongly suggest to sign linguists who are specialists in or
familiar with other sign languages that you email your suggestions
for codes for those languages as soon as possible. The Joint Advisory
Committee on ISO 639: Codes for representation of names of languages
will be meeting in Washington, DC, February 17-18, and I suspect that
information should be fed to them well in advance of those dates.

I also suggest that you look at the web site listed above for the
existing list of language codes. It is possible that the usual sign
linguists' abbreviation for a particular SL is already established in
use for a spoken language, and some alternative code will have to be
found for the SL. This has already been done with many spoken
languages in the list, such as this set:

............................................................
  LC  ISO 639-2   ISO 639-1  Language name in English
............................................................
      ara          ar        Arabic
      arc                    Aramaic
      arp                    Arapaho
      arn                    Araucanian (Mapuche)
      arw                    Arawak

A number of entries on the list refer to sets of languages, but almost all of
these entries are for sets of *related* languages, such as

      apa                    Apache languages

While it might make sense to have a listing for "French Sign Language
and related SLs", each of those languages should also be listed; and
other SLs, such as Japanese and Hong Kong SLs, could not be included
in it.

The main purpose of this code is for use in computer systems. While
most sign languages have no written form, that situation is changing
rapidly with systems like SignWriting that are intended for signers
to use. There are also systems like Stokoe notation and HamNoSys that
are used by sign linguists.

Sincerely,
Mark Mandel

   Mark A. Mandel : Senior Linguist and Manager of Acoustic Data
         Mark_Mandel at dragonsys.com : Dragon Systems, Inc.
 320 Nevada St., Newton, MA 02460, USA : http://www.dragonsys.com/
                     (speaking for myself)

- ----------------------------------------------------------

> From: Rna8arnold at aol.com
> Message-ID: <cc.12d30f9.25cca335 at aol.com>
> Date: Fri, 4 Feb 2000 16:48:37 EST
> Subject: Re: Languages listed in ISO 639:
> To: SLLING-L at admin.humberc.on.ca
> CC: Endanger at sesame.demon.co.uk

There is a problem with using asl for American Sign Language in the
ISO 639 that leaves Australian Sign Language in limbo.... can't use
"asl", can't use "aus" (it is for native Australian languages).

How are we going to resolve this??    "ams" would be OK for ASL (I hope...).

Another solution would be to start with an "s" to denote a signed language,
then add two other letters to make it unique.

Thus:     sam         - for ASL (um... gives a new twist on Uncle Sam)
          sau         - for Australian SL
          sot/soe/sor - for Austrian SL (Osterreich GS)
          snz         - for NZSL
          snd         - for Nederland SL (Holland)
          sdg         - DGS (German Sign Language)

Remember these are just codes for computer/internet usage and not
necessarily for academic usage .... These codes would be used as a way to
detect font types for the languages concerned.
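
[Illustration added in editing: a minimal Python sketch of checking
proposed three-letter codes such as those above against codes already
taken. The set CODES_IN_USE and the function check_candidates are
hypothetical; the set holds only codes quoted elsewhere in this digest,
so "free" means "absent from this small sample", and a code reported as
free may well be assigned in the full table.]

```python
# Codes quoted elsewhere in this digest; a real check would need the full
# ISO 639-2 / LC-MARC table, which is not reproduced here (assumption).
CODES_IN_USE = {"alg", "apa", "ara", "arc", "arn", "arp", "arw", "aus", "chn",
                "lus", "mic", "nah", "nai", "paa", "sal", "sgn", "sux", "vol"}


def check_candidates(candidates, in_use=CODES_IN_USE):
    """Return {code: 'taken' | 'free'} for each proposed three-letter code."""
    report = {}
    for code in candidates:
        code = code.lower()
        if len(code) != 3 or not code.isalpha():
            raise ValueError(f"expected a three-letter code, got {code!r}")
        report[code] = "taken" if code in in_use else "free"
    return report


# "aus" is the clash mentioned in the message; the rest are the "s" + two
# letters proposals.
proposed = ["aus", "sam", "sau", "snz", "snd", "sdg"]
for code, status in check_candidates(proposed).items():
    print(code, status)
```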

My proposal is for a panel of SL experts (who are familiar with computer
systems) to consult with ISO over this.

Richard Arnold

PS I have forwarded this to John Clews as well, but I thought to let you guys
know and get us discussing this issue.

- ----------------------------------------------------------

> From: Rna8arnold at aol.com
> Message-ID: <dd.ecbc6f.25cc9ee8 at aol.com>
> Date: Fri, 4 Feb 2000 16:30:16 EST
> Subject: re: ISO Standard codes for Sign Languages
> To: Endanger at sesame.demon.co.uk

This is in response to the upcoming ISO standard codes for languages.

From the SLLING-L list email:

<<The main purpose of this code is for use in computer systems. While
most sign languages have no written form, that situation is changing
rapidly with systems like SignWriting that are intended for signers
to use. There are also systems like Stokoe notation and HamNoSys that
are used by sign linguists.>>

They could incorporate the following codes for these three notation (written)
systems:

SGW           SignWriting
STK/SKE       Stokoe notation for sign languages
HNS           Hamburg Notation System for Sign Languages

For the sign languages themselves:

NZL          New Zealand Sign Language

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 14:23:30 -0800
> To: John Clews <Endanger at sesame.demon.co.uk>
> From: Valerie Sutton <Sutton at SignWriting.org>
> Subject: sign language codes
> Cc: everson at INDIGO.IE

February 4, 2000

Dear John -

I have received a whole bunch of messages in the past half hour about
a message you posted to the Linguist's List (smile...isn't the
internet something?!)

I am not a member of the Linguists List...so I could not post this
response myself, although a friend may do it for me later...and I did
write to the Sign Language Linguists List (SLLING) a few minutes ago,
and I sent you a copy of that message....

Meanwhile, I wanted to write to you personally. I am sending a copy
of this message to Michael Everson, because Michael wrote and
submitted an application to the Registration Authority on our behalf
last September.

You can read about the application on this web page:

International Organization for Standardization (ISO)
...application for language codes for Sign Languages...
http://www.indigo.ie/egt/standards/iso639/sign-language.html

We are already using these codes for signed languages in the
SignWriter 5.0 computer program, typing signed languages from 18
countries in SignWriting, and they are working well. They are easy to
recognize in the java source code...and our programmer likes them
very much....

There were many arguments as to what "three letter codes" to use for
signed languages....we discussed it for weeks on the SignWriting
List...since obviously ASL could also be Austrian Sign Language and
so forth...and of course German Sign Language is not GSL, because it
is DGS in Germany and so it should be - since that is the terminology
they use in Germany!

So finally, we placed "sgn" connected with the country code plus the
region code of the country - so in other words:

sgn-DK

....means the signed language used in Denmark....

And if there are dialects...

sgn-ES-CT

stands for the signed language used in Espana (Spain) in the
Catalonian region...etc...so that differentiates it from the signed
language used in Madrid.

In other words it is pretty neutral, since the country or region code
already established for the country or region is attached to the
general three letter code "sgn" for sign language.

The reason this works for computer programmers is that they already
know the code "DK" for Denmark, so attaching "sgn" to "dk" makes
sense that it is the signed language used in Denmark...

And there is much more detail to this..Michael Everson hit upon an
excellent way to determine "Signed Danish" versus Danish Sign
Language:

sgn-dan-DK

means sign-Danish-Denmark

The three letter code "dan" is known for the spoken language of
Denmark...so that is Signed Danish, since it is connected with spoken
Danish...
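
[Illustration added in editing: a minimal Python sketch of how a program
might split tags of the shape described above. The function
parse_sign_tag and its field names are hypothetical and are not taken
from SignWriter; the sketch assumes hyphen-separated tags and treats a
three-letter subtag directly after "sgn" as a spoken-language code.]

```python
def parse_sign_tag(tag):
    """Split a tag such as 'sgn-ES-CT' or 'sgn-dan-DK' into named parts."""
    parts = tag.split("-")
    if parts[0].lower() != "sgn":
        raise ValueError(f"not a sign-language tag: {tag!r}")
    rest = parts[1:]
    fields = {"spoken_language": None, "country": None, "region": None}
    # Assumption: a three-letter subtag marks a signed form of a spoken
    # language (e.g. Signed Danish), as in the sgn-dan-DK example.
    if rest and len(rest[0]) == 3:
        fields["spoken_language"] = rest.pop(0).lower()
    if rest:
        fields["country"] = rest.pop(0).upper()
    if rest:
        fields["region"] = rest.pop(0).upper()
    return fields


for tag in ("sgn-DK", "sgn-ES-CT", "sgn-dan-DK"):
    print(tag, parse_sign_tag(tag))
```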

Just wanted to let you know, John, and thanks so much for caring
about signed languages!!

Valerie Sutton
mailto:Sutton at SignWriting.org

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 15:56:51 -0700 (MST)
> From: "Angus B. Grieve-Smith" <grvsmth at unm.edu>
> To: Endanger at sesame.demon.co.uk
> Subject: Obvious omissions from ISO 639

I noticed that there are no signed languages in your list. As many
signed languages are now being written on paper and by computer, this
is an important issue. You can get a sense of the number of signed
languages currently being written from <http://www.signwriting.org>.

        Thanks for asking.

                                -Angus B. Grieve-Smith
                                Linguistics Department
                                The University of New Mexico
                                grvsmth at unm.edu

- ----------------------------------------------------------

> From: Sebastià Pla <sastia at retemail.es>
> Organization: Blackadder & Co.
> To: Endanger at sesame.demon.co.uk
> Subject: Valencian = Catalan
> Date: Sat, 5 Feb 2000 01:11:05 +0100

Hi John

I've read the list of languages for the ISO 639 standard. You
include Valencian, marked with --- --- --- (??). There is
no Valencian language. It is the name which the
people in the Valencian country give to the variant of the
Catalan language they speak. There is a movement claiming it
is a different language, but this movement is inspired by
political reasons, without any linguistic basis.

By the way, I'm Valencian, and of course I speak Catalan.

If you want further information, feel free to contact me.

Best regards. Sebastià Pla.

-
History shows that people who don't value freedom
enough to defend it will tend to lose it.
                        Richard M. Stallman

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 16:55:30 -0500 (EST)
> From: Martin Jansche <jansche at ling.ohio-state.edu>
> Reply-To: Martin Jansche <jansche at ling.ohio-state.edu>
> To: John Clews <Endanger at sesame.demon.co.uk>
> Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

Dear John Clews,

This is in response to a posting on the LINGUIST mailing list.

On 3 Feb 2000, The LINGUIST Network wrote:

> LINGUIST List:  Vol-11-230. Thu Feb 3 2000. ISSN: 1068-4875.

> If possible could you embed your comments within my quoted table,
> unless your comment is very simple on a few lines: that will enable
> me to align comments.

I'm repeating the entire list, with some short comments appended to
the appropriate lines, so you'd have to do a diff to get at them.
Others are interspersed on separate lines.  The usual disclaimers
apply.
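
The diff mentioned above could be done along these lines. This is only a
minimal sketch using Python's standard difflib module; the two row lists
and the helper name changed_lines are hypothetical stand-ins for the
original table and a commented copy, not data taken from the reply.

```python
import difflib

# Hypothetical excerpt of the original table (without the quote marks).
original = [
    "      ara          ar        Arabic",
    "      aus                    Australian languages",
]

# Hypothetical commented copy: one row has a comment appended.
annotated = [
    "      ara          ar        Arabic  [appended comment]",
    "      aus                    Australian languages",
]


def changed_lines(before, after):
    """Return the lines that appear only in the commented copy."""
    return [
        line[2:]  # strip the two-character "+ " marker added by ndiff
        for line in difflib.ndiff(before, after)
        if line.startswith("+ ")
    ]


for row in changed_lines(original, annotated):
    print(row)
```

Rows that are identical in both copies are skipped, so only the commented
rows and any interspersed comment lines are left to read.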

A huge problem is getting the level of granularity right.  The genetic
codes range from Indo-European down to Germanic, and similar
distinctions could be made in other families (Malayo-Polynesian under
Austronesian, etc.).  A hierarchical, extensible standard seems to be
more appropriate in the long run, but I realize that this is not the
right place and time now.

I'd very much like to see codes for the various Chinese
languages/dialects in place.  I realize this is a politically
sensitive issue, but politics just has to come to grips with reality
sometimes.

Thank you very much.
Sincerely,

- martin jansche

>       ara          ar        Arabic

Moroccan, Egyptian, Classical, ... Arabic

>       aus                    Australian languages

Perhaps "Pama-Nyungan" instead, depending on what specific languages
it is referring to.

[add]

        can/yue                Cantonese (Chinese)

>       chi/zho *    zh        Chinese

replace with: Chinese languages

[add]

        zht                    Middle Chinese (Sui, Tang Dynasties)
        zhz                    Old Chinese (Zhou Dynasty)

[add]

Dyirbal, Jirrbal (Pama-Nyungan)

[add]

        flm                    Flemish

[add]

        gan                    Gan (Chinese)

[add]

        hac                    Haitian Creole

[add]

        hak                    Hakka, Kejia-Hua (Chinese)

[add]

Kartvelian languages (Other)

[add]

        bfh                    Mandarin, Beifang-Hua (Chinese)

[add]

Miao (or is that subsumed by Yao?)

[add]

        mnb/fzh                Min-Bei, Fuzhou-hua (Chinese)
        mnn/twn                Min-Nan, Taiwanese (Chinese)

[add]

        nad/den                Na-Dene, Dene

[add]

Okinawan

[add]

        pth                    Putonghua (Chinese) [NB separate from Mandarin]

[add]

Sundanese (Austronesian)

[add]

Vulgar Latin

[add]

Warlpiri (Walpiri, Walbiri), Australia

[add]

        wuc                    Wu, "Shanghainese" (Chinese)

[add]

        xia                    Xiang (Chinese)

- ----------------------------------------------------------

> From: Claudiu Costin <claudiuc at interplus.ro>
> Reply-To: claudiuc at interplus.ro
> To: Endanger at sesame.demon.co.uk
> Subject: [ISO639 issue] Romanian language is good for "rom"
> Date: Sat, 5 Feb 2000 02:26:54 +0200

               Dear John,

>     rum/ron *    ro        Romanian
> --- --- ---     (ry)       Romany; Romani
>     rom                    Romany

   Please note:
   1) "rom" is more appropriate for the Romanian language.
   2) It is unfair to assign "rom" to "romany" (or "romano"), which is
   the gipsy language.

   If you insist on keeping the "romany" denomination, please note
   that the _REAL_ name would be something like "rommany", because in the
   Romanian Academy, the mass media and gipsy books, gipsy people ask to be
   called "romm" and not "ţigan" (gitanos in Spanish), because in Romania
   it is almost shameful to be called the latter (due to very high
   unsocial acts and behaviour).

   I don't like to say this, but that is the reality.
   That's why after 1989 (after the 17 December Romanian Revolution)
   the term "romm" appeared, along with the language name "romany".

   This was born from intellectual Gipsy people (who have a good
   reputation), who are trying to "kill" the pejorative denomination "ţigan"
   (this quoted text is in ISO 8859-2. You may read it as "tzeegun").

  CONCLUSIONS:

  1) Make "rom" the code for the Romanian language.
  2) A change is needed for Romany. It would be nice
     for it to be "rmy". You have more experience and
     can choose a better option.
  3) Please contact me, tell me your opinion,
     and tell me whether Romania has representatives on the
     ISO 639 commission. If not, I will try to make some
     waves to change this.

regards,
-
o-------------------------------------------o
Claudiu COSTIN       claudiuc at interplus.ro
sysad ADComm.pub.ro  claudiuc at geocities.com
Linux-KDE Romania    http://www.ro.kde.org
Home page http://lion.ADComm.pub.ro/~claudiuc

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 14:10:08 -0800
> To: "For the discussion of linguistics and signed languages."
>  <SLLING-L at ADMIN.HUMBERC.ON.CA>
> From: SignWriting <DAC at SignWriting.org>
> Subject: Re: Languages listed in ISO 639: feedback sought
> Cc: Endanger at sesame.demon.co.uk

February 4, 2000

Dear SLLING Members....
I noticed Mark's excellent message about recognition of signed
languages in computer codes...

Thank you, Mark, for bringing this issue to everyone's attention..

As you all know, computer programmers refer to the "ISO 639-2
Registration Authority" for the standard codes used to represent the
world's languages. This helps standardize software development.

In September, 1999, the Deaf Action Committee for SignWriting (the
DAC), and the Irish National Body applied to the Registration
Authority, with the help of Unicode specialist Michael Everson,
requesting that the world's Sign Languages be included. The
application is currently waiting for approval. It was supposed to be
decided upon last November, and then the meeting was postponed until
this month. I believe it will be voted on in the next two weeks.

You can read about the application to the ISO on these web pages:

International Organization for Standardization (ISO)
...application for language codes for Sign Languages...
http://www.indigo.ie/egt/standards/iso639/sign-language.html

Recognition of Signed Languages
http://www.SignWriting.org/unicod01.html

We are already using these codes for signed languages in the
SignWriter 5.0 computer program, typing signed languages from 18
countries in SignWriting, and they are working well. They are easy to
recognize in the java source code...and our programmer likes them
very much....

There were many arguments as to what "three letter codes" to use for
signed languages....we discussed it for weeks on the SignWriting
List...since obviously ASL could also be Austrian Sign Language and
so forth...and of course German Sign Language is not GSL, because it
is DGS in Germany and so it should be - since that is the terminology
they use in Germany!

So finally, we placed "sgn" connected with the country code plus the
region code of the country - so in other words:

sgn-DK

....means the signed language used in Denmark....

And if there are dialects...

sgn-ES-CT

stands for the signed language used in Espana (Spain) in the
Catalonian region...etc...so that differentiates it from the signed
language used in Madrid.

In other words it is pretty neutral, since the country or region code
already established for the country or region is attached to the
general three letter code "sgn" for sign language.

The reason this works for computer programmers is that they already
know the code "DK" for Denmark, so attaching "sgn" to "dk" makes
sense that it is the signed language used in Denmark...

And there is much more detail to this..Michael Everson hit upon an
excellent way to determine "Signed Danish" versus Danish Sign
Language:

sgn-dan-DK

means sign-Danish-Denmark

The three letter code "dan" is known for the spoken language of
Denmark...so that is Signed Danish, since it is connected with spoken
Danish...
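
As a rough illustration of the scheme described above, the pieces can be
joined mechanically. This is only a sketch: the function name and its
parameters are hypothetical, it assumes the parts are joined with hyphens
as in the sgn-ES-CT and sgn-dan-DK examples, and it is not the code used
in SignWriter.

```python
def sign_language_tag(country, region=None, spoken=None):
    """Compose a signed-language tag from its parts.

    country -- two-letter country code, e.g. "DK"
    region  -- optional region code within the country, e.g. "CT"
    spoken  -- optional three-letter code of a spoken language, used for
               a signed form of that language (e.g. "dan" for Signed Danish)
    """
    parts = ["sgn"]
    if spoken:
        parts.append(spoken.lower())
    parts.append(country.upper())
    if region:
        parts.append(region.upper())
    return "-".join(parts)


print(sign_language_tag("DK"))                # sgn-DK
print(sign_language_tag("ES", region="CT"))   # sgn-ES-CT
print(sign_language_tag("DK", spoken="dan"))  # sgn-dan-DK
```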

Hope this helps a little!

Valerie Sutton
mailto:Sutton at SignWriting.org

- ----------------------------------------------------------

> From: Claudiu Costin <claudiuc at interplus.ro>
> Reply-To: claudiuc at interplus.ro
> To: Endanger at sesame.demon.co.uk
> Subject: [ISO639 issue] Moldavia have language Romanian; Moldavian is a
>dialect
> Date: Sat, 5 Feb 2000 01:03:36 +0200

          Dear John,

   Please note that there is no Moldavian language. It is a
   regional dialect. Moldavia is an old Romanian region which was
   hijacked by communist Russia 50 years ago.

**************************************************************
  The people's language is Romanian, with the Moldavian dialect
**************************************************************

  We have many dialects in Romania (I don't know the English translations):

  - muntean
  - ardelean
  - moldovenesc   <--- ("Moldavian" in English)
  - aromân

  All of these are the Romanian language. There is no Moldavian language.
  Just a silly historical & political situation that caused my
  country to be divided in two parts.

  The Moldavian education ministry has directed all official and
  educational resources to move from the very heavy russification led by
  the KGB to the Romanian language (note this! Romanian, not Moldavian).
  This has been so since 1994, if I recall correctly.

  My current personal observation (made from National Moldavia TV) is
  that the official Romanian language (aka ~90% "muntean") is spoken and
  written everywhere.

 Conclusion:
 1) Correct the language of Moldavia to "rom".
 2) Please inform me if you have any doubts

regards,

o-------------------------------------------o
Claudiu COSTIN       claudiuc at interplus.ro
sysad ADComm.pub.ro  claudiuc at geocities.com
Linux-KDE Romania    http://www.ro.kde.org
Home page http://lion.ADComm.pub.ro/~claudiuc

- ----------------------------------------------------------

Note: several of the replies which follow are a group mainly relating
to South Asian languages

- ----------------------------------------------------------

> Date:         Thu, 3 Feb 2000 05:29:53 -0800
> Reply-To:     South Asian Linguists <VYAKARAN at LISTSERV.SYR.EDU>
> Sender:       South Asian Linguists <VYAKARAN at LISTSERV.SYR.EDU>
> From:         Peter Claus <pclaus at CSUHAYWARD.EDU>
> Organization: California State University, Hayward
> Subject:      Re: Indian and other Asian languages listed in ISO 639:
>               feedback sought
> To:           VYAKARAN at LISTSERV.SYR.EDU

VYAKARAN: South Asian Languages and Linguistics Net
Editors:  Tej K. Bhatia, Syracuse University, New York
          John Peterson, University of Munich, Germany
Details:  Send email to listserv at listserv.syr.edu and say: INFO VYAKARAN
Subscribe:Send email to listserv at listserv.syr.edu and say:
          SUBSCRIBE VYAKARAN FIRST_NAME LAST_NAME
          (Substitute your real name for first_name last_name)
Archives: http://listserv.syr.edu

Dear John,

For the state of Karnataka, India, there are at least three major
languages left off. All have literatures and scholars working in and on
them: Tulu, Badaga, and Kodagu.

Tulu, a Dravidian language, in particular, has a large number of
speakers and a large amount of scholarship devoted to it. It also has at
least two phonemes which are not shared with other Dravidian languages.
There is a great need for a standard transliteration scheme since much
of the scholarship includes a large amount of transcribed oral textual
material and many people (myself included) would like to put translated
text on the internet.

Kodagu, at present, has a smaller literature and a smaller number of
scholars working on it, but enough for consideration as a significant
Indian language.

Badaga may not have its own literature, but there are scholars working
on this language and there are oral texts which have been collected and
transliterated.

Toda, Kota, and Kuruba (maybe several separate languages), found along
the border of Karnataka and Tamil Nadu, should also be included, since
their phonemic systems are distinctive and there is a fair amount of
scholarship on them, both past and present.

Please contact Ulrich Demmer (t45 at ix.urz.uni-heidelberg.de) or Gail
Coelho (gail at utxvms.cc.utexas.edu) for the internal differentiation
within the Kuruba group of languages.

Peter Claus

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 08:49:25 -0600
> To: John Clews <Emeet at SESAME.DEMON.CO.UK>
> From: John Clews <Emeet at SESAME.DEMON.CO.UK> (by way of Hans Henrich Hock)
> Subject: Indian and other Asian languages listed in ISO 639: feedback
>           sought

Dear Colleague,

The usual abbreviation for _Sanskrit_ in the fields of Linguistics
and Indology is _Skt_.

Best wishes,

Hans Henrich Hock
Professor of Linguistics and Sanskrit
Linguistics, 4088 FLB MC-168, University of Illinois
707 S. Mathews, Urbana IL 61801-3652
telephone: (217) 333-0357 or 333-3563 (messages)
e-mail: hhhock at staff.uiuc.edu
fax: (217) 333-3466

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 09:38:25 -0600
> To: Emeet at sesame.demon.co.uk
> From: Mark Southern <m.southern at mail.utexas.edu>

Dear John,

re your recent VYAKARAN appeal for info, esp. omissions / language names:

A lot of 1. North-Cent. Amer. / 2. Australian / 3. African languages seem
to be left out - possibly caught by the SIL codes?

e.g.:
1. Mixtec, Tarascan, Tuscarora, Hopi
2. Warlpiri, Dyirbal
3. Fe?Fe? Bamileke, Damara

and in Austronesian: Mentawai
(btw the Polynesian lg. Truk, which you have ??? beside, is usually called
Trukese)

Asia: Samoyed, Ainu, Chukchi, Yukaghir, Burushaski

Mark S.

Mark Southern
Dept. of Germanic Studies
EPS 3.102
University of Texas
Austin, TX 78712
512-232-6371

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 11:07:16 -0500 (EST)
> From: "E. Bashir" <ebashir at umich.edu>
> X-Sender: ebashir at seawolf.gpcc.itd.umich.edu
> To: John Clews <Emeet at SESAME.DEMON.CO.UK>
> Subject: Re: Indian and other Asian languages listed in ISO 639: feedback
>             sought

Dear John,

There follow names of some languages which are not included in your list.
Most of these are spoken in Pakistan; two of them are (probably) extinct
by now (Tirahi, Wotapur-Qatarqalai).  Most of these are names of important
languages, having numerous sub-dialects (usually named for the village or
region where they are spoken).  Siraiki and Hindko are two names for
important variants of western Panjabi which are often discussed under the
names given, particularly in the context of sociolinguistic or language
planning issues.

Just a thought:  instead of relying on input from list-members, which may
be spotty and miss many things, why not consult a standard work on the
languages of the world, for example Ruhlen's Guide to the Languages of the
World (Ruhlen, Merritt. 1987.  Guide to the Languages of the World.
Stanford:  Stanford University Press)?

Language                Genetic grouping        Suggested code (by me,
                                                have not checked
                                                Ethnologue codes)

Balti                   Tibeto-Burman           blt
Brahui                  Dravidian               brh
Brokskat                Indo-Aryan (Dardic)     bro
Burushaski              Isolate                 brs
Dameli                  Indo-Aryan (Dardic)     dam
Domaki                  Indo-Aryan              dom
Gawarbati               Indo-Aryan (Dardic)     gaw
Gojri                   Indo-Aryan              goj
Grangali                Indo-Aryan (Dardic)     gra
Hindko                  Indo-Aryan              hnk
Ishkashmi               Iranian                 ish
Kalasha                 Indo-Aryan (Dardic)     kls
Kanyawali               Indo-Aryan (Dardic)     kan
Khowar                  Indo-Aryan (Dardic)     khw
Kohistani               Indo-Aryan (Dardic)     koh
Palula                  Indo-Aryan (Dardic)     pll
Pashai                  Indo-Aryan (Dardic)     psh
Sawi                    Indo-Aryan (Dardic)     saw
Shina                   Indo-Aryan (Dardic)     shi
Siraiki                 Indo-Aryan              sir
Shumashti               Indo-Aryan (Dardic)     shu
Tirahi                  Indo-Aryan (Dardic)     trh
Torwali                 Indo-Aryan (Dardic)     tor
Wakhi                   Iranian                 wkh
Wotapur-Qatarqalai      Indo-Aryan (Dardic)     wot
Yazghulami              Iranian                 yaz
Yidgah                  Iranian                 ydg
Zebaki                  Iranian                 zeb

Regards,

Elena Bashir

**************************************************************************
Elena Bashir, Ph.D.                     3070 Frieze Bldg.
Lecturer in Urdu and Hindi              The University of Michigan
Dept. of Asian Languages and Cultures   Ann Arbor, MI 48109
                        Phone:  734-763-9178
                        Dept. Phone:  734-764-8286 (messages only)
                        Fax:  734-647-0157
**************************************************************************

- ----------------------------------------------------------

> Date:         Fri, 4 Feb 2000 03:13:56 -0500
> Reply-To:     South Asian Linguists <VYAKARAN at LISTSERV.SYR.EDU>
> Sender:       South Asian Linguists <VYAKARAN at LISTSERV.SYR.EDU>
> From:         Peter Hook <pehook at UMICH.EDU>
> Subject:      Indian languages to be listed
> X-cc:         Emeet at sesame.demon.co.uk
> To:           VYAKARAN at LISTSERV.SYR.EDU

Dear John Clews,

        There are at least 5000 named languages in the world.  Will a
3-letter code be able to cover them all?  Mathematically possible, yes,
but many of the 3 letter sequences (like QQQ) will not be helpful if there
has to be a relationship between the language name and the abbreviation.
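
The arithmetic behind "mathematically possible" can be sketched as
follows. This assumes codes are drawn from the 26 unaccented Latin
letters; the helper mnemonic_candidates and its rule (keep the first
letter of the name, then pick two later letters in order) are a
hypothetical illustration of a name-related code, not a rule taken from
ISO 639.

```python
import string
from itertools import combinations

LETTERS = string.ascii_lowercase  # assumed 26-letter alphabet


def code_space(length=3, alphabet=LETTERS):
    """Number of distinct codes of the given length."""
    return len(alphabet) ** length


def mnemonic_candidates(name, length=3):
    """Codes that keep the name's first letter plus later letters in order."""
    letters = [c for c in name.lower() if c in LETTERS]
    if len(letters) < length:
        return []
    first, rest = letters[0], letters[1:]
    codes = (first + "".join(tail) for tail in combinations(rest, length - 1))
    return list(dict.fromkeys(codes))  # drop duplicates, keep order


print(code_space())   # 26 ** 3 = 17576, well above 5000
print(code_space(2))  # 676 codes share any one initial letter
print(mnemonic_candidates("Poguli"))
```

Under this rule each name has only a handful of candidates, and at most
676 codes begin with any given letter, which is where the pressure on
name-related abbreviations comes from.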

        In any case, I would like to suggest adding a few more from India
and Pakistan:

        1. Poguli (POG?): spoken in Kashmir.  (See my Webpage
http://www-personal.umich.edu/~pehook/index.html for a link to more
information on Poguli.)

        2. Bangani (BAN?): spoken in Uttar Pradesh.  Bangani is much in
the South Asian linguistics news lately, as it is purported to have kentum
vocabulary in it.  (See my Webpage for a link to a Bangani page.)

        3. Garhwali (GAR?) is spoken by about 2 million people in Uttar
Pradesh.

        4. Shina (SHN?) is spoken all over the Northern Areas of Pakistan
and in several places in western Kashmir.

        5. For many other languages spoken in Pakistan and Afghanistan
please see Richard Strand's elaborate Webpage on Nuristan.  A link to it
is available from my page.

        I won't continue because you may have some conditions in mind that
render these suggestions pointless.  But a glance at Grierson's Linguistic
Survey of India will illustrate the problem of trying to be exhaustive.

                Sincerely,

                        Peter Hook

                http://www-personal.umich.edu/~pehook/index.html

- ----------------------------------------------------------

-
John Clews, SESAME Computer Projects, 8 Avenue Rd, Harrogate, HG2 7PG
tel: 0171 412 7826 (day); 0171 272 8397 (evening); 01423 888 432 (w/e)
Email: Emeet at sesame.demon.co.uk

Committee Chair of  ISO/TC46/SC2: Conversion of Written Languages;
Committee Member of ISO/IEC/JTC1/SC22/WG20: Internationalization;
Committee Member of CEN/TC304: Information and Communications
 Technologies: European Localization Requirements
Committee Member of the Foundation for Endangered Languages;
Committee Member of ISO/IEC/JTC1/SC2: Coded Character Sets

---------------------------------------------------------------------------
LINGUIST List: Vol-11-294