ISO character coding
Mike Cleven
ironmtn at BIGFOOT.COM
Wed Feb 16 07:35:22 UTC 2000
Michael Everson wrote:
>
> >> From: David Robertson <drobert at tincan.tincan.org>
> >> Subject: ISO 639 & ChInuk (Chinook Jargon)
> >
> >It's good to see that there is a code, chn, for ChInuk (Chinook Jargon).
> >
> >I want to let you know that if and when the ISO get down to the process of
> >establishing a standard character-set for this language, I and the CHINOOK
> >list stand ready to advise and help.
>
> Your letter has reached me, who am the person most likely to be interested.
> I'm Irish national representative to ISO/IEC JTC1/SC2/WG2, and work closely
> with the Unicode Technical Committee on encoding scripts in the UCS. See my
> web page under "Standardization" for examples of my work on the Universal
> Character Set.
>
> I took a major part in finalizing the encoding of Canadian Syllabics and of
> Cherokee in the UCS.
Don't know if you've seen Marv Plunkett's Cherokee tutorials; if not
http://www.intertribal.net/Dloads/Downloads.htm
>
> >ChInuk is fortunate in that quite a few technically savvy people are
> >involved in the preservation and dissemination of the language. In fact,
> >we've discussed ISO before on our list. Between the linguists and
> >computer science people in our ranks, we can provide good feedback to ISO
> >if, as we hope, we are called upon.
>
> Chinook is allocated some space in the Roadmap for Plane 1 of the UCS,
> about 3 columns I think, but myself I have no reliable information on the
> writing requirements of Chinook, so....
>
> ...you are hereby called upon.
Well, a few other people have taken their stab at this since it arrived
in the list; I've been pondering it before rambling into it, and I note
that there are _three_ different forms to consider, and that
all of them must perhaps be considered as equally valid; can one
language have three separate ISO standards? Each of the three has its
own legitimacy as relevant both to the history and use of the Jargon,
although their development as _standards_ will change and formalize them
somewhat.
The three of course are the Grand Ronde ASCII rendering of the IPA, the
Kamloops Wawa Duployan (Pitman) shorthand, and the "traditional"
latinization found in the historic lexicons, which for
convenience I'll name the "ideomatic" form of the Jargon (for reasons
I'll try to make clear later) and which represents the once-most-widely spoken
form of the Jargon. The first two are somewhat "dialect"-specific,
representing particular ways the Jargon was spoken, and both
representing native communities for whom the Jargon remained vibrant;
the Kamloops Wawa version is (to me) actually a subset of the
"ideomatic" version embodied in the lexicons and other historic
documents (including the KW). The "ideomatic" form is the one best
known to non-natives both historically and in terms of modern awareness
of the Jargon; but it is only a subset of the broader vocabulary and
richer ideom of Grand Ronde. I think all three need to be represented
as valid standards, each with their own reasons and intertwined
relationships and issues.
1) Lower Columbia/Grand Ronde Jargon is much different from the ways
that Jargon was used and pronounced outside of that region/community.
This is the only truly surviving Jargon-using community, other than our
electronic wawa-tillikum illahee such as has lately emerged, and as such
it includes both words and sounds not found in other regions of the
Jargon's usage - including the Central Interior form embodied in the
Kamloops Wawa. I think in Grand Ronde Jargon's case, though, there's
also a regular latinization, isn't there, Tony/Dave? Other than regular
IPA, I mean....so GR Wawa would itself need two ISO standards, wouldn't
it?
2) the Kamloops Wawa Duployan script must have been something of a
standardization itself, given the variations to be expected between
Secwepemc, Nlaka'pamux, Nicola (Sce'emx/Spaxomin), Okanagan and
Stl'atl'imx readers of the Wawa; a standardization derived from the
Oblates, and probably representing (Zvjezdana's the expert on this,
though) the dominant Secwepemc of the region served by the school, and
of Kamloops area itself; still, there were a lot of Nlaka'pamux and
Stl'atl'imx at Kamloops; I'm not sure about the Okanagan. IIRC there
were also Ktunaxa, Sinixt, Sto:lo and Carrier students there. I'm of
the impression that the French background of the fathers also had a hand
in shaping the Kamloops Wawa writing system; I know Stl'atl'imx users of
the word given as "aias" said "hyash", for instance - I've always
suspected that the French habit of glottal-stopping an initial 'h' is
really what's responsible for this and other similar spellings, which
are often quite far from what's shown in Gibbs, Shaw, et al, as well as
being obviously different from the word-versions represented by the
often-wacky spellings that turn up in travellers' journals, company and
government records, etc. that were current in the same regions of BC as
rendered by non-Oblate non-natives (whew, now that's a negative
qualifier, enit?). Yet the vocabulary and syntax of the Kamloops Wawa
are pretty much the same as the other non-Grand Ronde forms of the
Jargon, despite the different pronunciation represented in the Kamloops
Wawa publications.
What I think might have to happen for the Duployan script to be useful
is some kind of reform: keeping the idea of the script, but redesigning
the word-glyphs that result so that they reflect the "broader" form
of the Jargon. There may be ways to adapt it to Grand Ronde
pronunciation as well, although I understand that attempts to use the
Duployan/Pitman sound-writing symbols for more complex phonologies like
Secwepemc and St'at'imcets haven't worked out very well, and natives
themselves remained unpersuaded of the utility of the system for their
traditional languages even when proposed by one of their own people.
Apparently in modern times a Stl'atl'imx national from Semahquam (now
In-SHUCK-ch, actually, since a governmental rearrangement) tried
proposing to that nation's language authorities the adoption of the
Duployan; but it just wasn't adequate to represent the subtle
sound-system of St'at'imcets; they have since adopted a Latin system
with complex diacriticals (can't remember whose, but it's an established
linguist's system). The Kamloops Wawa publication also has examples of
Secwepemc and Carrier and other languages, but I understand these are
somewhat hard to decipher, representing only what could be represented,
much as older Latin renderings of the same tongues also fail.
I actually think there's a case to be made for integration of the KW
script with the ideomatic system, in a way that can also be functional
for Grand Ronde in ways I'll get to later; what I mean is that standard
word forms get settled on even if they are pronounced differently
locally/personally. One essential reason this integration is possible
is because the version of the Jargon represented by the Kamloops Wawa
-is- a subset of the larger regional "ideomatic" form that was spoken in
varying ways across the region beyond the Lower Columbia/Grand Ronde.
But more importantly it has to do with the flexibility of the Duployan
system viewed not as a spelling-alphabet but as a visual symbol-mode to
create standard word-glyphs, i.e. using the symbol-system of the
shorthand to create the new glyphs, rather than adopting outright the
Kamloops Wawa glyph-forms. By "re-spelling" the script into
standardized word-symbols, the regional and personal variations between
tamahnous, tamanahwis, tamanass, etc. can be embraced by a single,
instantly recognizable glyph; for different words the chosen glyph could
come from any of the KW, GR or the range of ideomatic spellings found in
the lexicons. This can be extended to take in specifically Grand Ronde
forms, and can also help resolve the awkwardness of the relative
meanings of "munk" and "mamook" in GR vs. the use of "mamook" everywhere
else.....
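To make the word-glyph idea a bit more concrete, here is a minimal sketch in Python. It assumes a hand-built table in which each standard glyph covers a set of Latin spellings; the glyph IDs (G_TAMANASS, G_HYAS) are invented for illustration and are not part of any existing encoding. The groupings only use variants mentioned in this post.

```python
# Hypothetical table: each standard word-glyph gets an invented ID and
# lists the Latin spellings it is meant to cover.
GLYPH_VARIANTS = {
    "G_TAMANASS": ["tamahnous", "tamanahwis", "tamanass"],
    "G_HYAS": ["aias", "hyas", "hyash", "hayash"],
}

# Invert the table so any variant spelling looks up its glyph ID.
VARIANT_TO_GLYPH = {
    spelling: glyph_id
    for glyph_id, spellings in GLYPH_VARIANTS.items()
    for spelling in spellings
}


def glyph_for(word):
    """Return the glyph ID for a spelling, or None if it isn't in the table."""
    return VARIANT_TO_GLYPH.get(word.lower())
```

With a table like this, glyph_for("aias") and glyph_for("hayash") return the same ID, which is the whole point: the reader's own pronunciation stays outside the written form.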
Dave Robertson noted that the Duployan shorthand has so far resisted
attempts at wrastling it into a useful computer form; both Marv Plunkett
and I have taken stabs at the problem, and I've made some progress, I
think, in notes towards a workable system which I'll bring up when I
think I've got it down. There are technical issues to do with keyboard
layouts and UNICODE renderings and such that would need to be addressed
for what I'm thinking of to work at all, and anyone using it would have
to learn a new way of typing; "short-typing" in fact. The main problem
with approaching the Duployan as an alphabet-keyboard based font is that
the words do not come in a linear string, but rather arranged as
combinations of strokes and diacriticals - rather reminiscent of Arabic,
in fact; Arabic computerization was apparently even trickier than coming
up with coding for Chinese......so the alternate is to come up with
standard glyphs, and then have those glyphs accomplished by a series of
combo-keystrokes (this is how some Chinese and Japanese screen fonts are
written, from what I understand). Takes getting used to, but given that
you only need one or two combinations of keys to write one word, once
you get used to it you'd type at thinking speed. I think it's
interesting; I'm only proposing it, and I've been working on the
technical issues as much as I can address, but I think it's not only
workable it'll also be quite useful.....In that the glyphs would be
composed of phonological symbols, they wouldn't exactly be ideograms
although as a written script they would function in a similar fashion.
The Korean Hangul script is also glyph-character oriented, although it's
composed of a strict phonological sound system. Hmmmm. Might be fun to
try rendering the Jargon into Hangul someday..... ;-)
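As a rough illustration of the "short-typing" idea, here is a small Python sketch, assuming every word-glyph is reached by a fixed two-key combination. The chords, the glyph IDs and the fixed chord length are all made up for the example; a real layout would have to be worked out against an actual glyph inventory.

```python
# Hypothetical chord table: a short key sequence stands for a whole word-glyph.
# Both the chords and the glyph IDs are invented for illustration.
CHORDS = {
    ("h", "y"): "G_HYAS",
    ("t", "m"): "G_TAMANASS",
    ("m", "k"): "G_MAMOOK",
}
CHORD_LENGTH = 2  # every chord in this sketch is exactly two keys


def short_type(keys):
    """Turn a flat string of keystrokes into a list of glyph IDs."""
    glyphs = []
    for i in range(0, len(keys), CHORD_LENGTH):
        chord = tuple(keys[i:i + CHORD_LENGTH])
        # Unknown or incomplete chords are kept visible rather than dropped.
        glyphs.append(CHORDS.get(chord, "?" + "".join(chord)))
    return glyphs
```

Typing "hytm" would then produce two glyphs from four keystrokes; anything the table doesn't know comes back marked with a "?" so the typist can see what went wrong.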
3) Now, this "ideomatic" form I've given a name to, if it's to be
represented by any kind of ISO standard, most of all needs some
standardization, since there are many variations for several important
words, as well as a wide range of local words and unusual word-forms and
alternate usages. What I mean by choosing "ideomatic" as a term here is
that it is the _ideom_ of the Jargon that is important, not its
phonology or the user's choice of word-variations according to his
origin/preference. Despite their differences, Georgia Strait speakers
of the Wawa could readily communicate with Cariboo or Boundary Country
peoples (if they needed to), and clumsy-mouthed non-native speakers
could still communicate with the more complex-sounding native speakers
even though their pronunciations were wildly different. Aias, hyas,
hayash - all intelligible to each other in speech, if not so much
visually. Danes and Norwegians, for instance, are used to discerning
the differences between their tongues and adjusting (igge vs. ikke for
no/not) and also recognizing differing words and wordforms, and so
therefore easily (more or less) read each other's newspapers and novels.
Our group has adopted a casual convention of using whatever one's
preferred mode of spelling is (except of course we can't use Duployan in
email even if we were familiar with it!); this ranges from personal
preference to regional relevance; I do my best with the IPA-ASCII but I
basically only scan it looking for familiar word-forms; I know that's
sort of cheating, but in a way I'm just looking at it as (wait for
it.....) shorthand. I recognize the Grand Ronde forms of words vs. my
own usages (munk/mamook, ya/yaka, etc.) and don't pause to "hear" the
plosives and fricatives that I know are being represented; with what I
can read of the Kamloops Wawa script I do much the same thing; the glyph
for "aias" I read automatically now as "hyas" (GR hayash); yet it's
supposed to be a purely sound-based system. I think this is because, as
in Latin-written languages, we learn to scan for whole words as
concepts, and we hear _our_ pronunciation of those words as we read
them; the written standardization does not reflect the span of different
pronunciations and styles of speech. This is why I think the glyphic
system might be relevant as a twin or parallel standard to the
"ideomatic" spelling based in the historic latinizations that are the
stock-in-trade of the "historic Jargon". Development of the ideomatic
system as a STANDARD means that we'd have to standardize a modern form
of the Jargon by choosing one word-form over another (tamanass vs.
tamanahwis) even though individual users could still choose to "accent"
their speech the way Texans and Aussies vary from Londoners and New
Yawkers......but this standard form would be integral with the script
form, and would take in the "broader" (and to me less complicated)
non-GR form of the Jargon, which I think is most accessible to potential
users and readers.....
[OK, gang - open fire......]
> Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie
> 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
> Vox +353 1 478 2597 ** Fax +353 1 478 2597 ** Mob +353 86 807 9169
> 27 Páirc an Fhéithlinn; Baile an Bhóthair; Co. Átha Cliath; Éire
Now _there's_ a language I'd love to know, less because of my 1/8 Irish
heritage than because of my exposure to Irish poetry; indeed your tongue
is sung, and subtle. I'd love to be able to read the Book of the Dun
Cow and other ancient works in the original, and of course any of Yeats'
Irish works or those of his Irish Renaissance colleagues. I know
there's more to modern Irish than the poetic tradition and the mystique
attached to "Celticism" nowadays, but it seems a beautiful tongue even
when speaking of the mundane. I trust it shall thrive in times to come
as much as a scientific and popular language even as it has flowered in
recent times as a literary one....the sound of sung Irish, well, that's
another thing again......
Mike Cleven
http://members.home.net/skookum/
http://members.home.net/cayoosh/