[Lexicog] RE: Query re keyboard re-mapping

Koontz John E john.koontz at COLORADO.EDU
Fri Sep 24 17:51:14 UTC 2004


On Thu, 23 Sep 2004, Peter Kirk wrote:
> >A somewhat specialized query for LexList, but the answer is "Not yet."
> >However, it would clearly be desirable to be able to jettison specialized
> >solutions in favor of general ones used by everyone.
> >
> >
> >
> Why not? I assume we are talking about Latin script.
>
> Are there characters in the Siouan fonts which are not supported by
> Unicode? Are there complex combining rules which are not supported by
> widely used rendering mechanisms? Are there characters missing from
> existing Unicode fonts - even from e.g. Doulos SIL which is supposed to
> cover Latin script completely, and Code2000 which is supposed to cover
> all of the Unicode basic plane? Unless one of these applies, there is no
> good reason not to use the Unicode fonts now, at least in Windows, Linux
> and (with the latest applications) Mac OS.

The problematic characters in modern Siouan orthographies relative to
traditional typewriter and computer fonts are combinations of the standard
vowel characters with the ogonek or nasal hook and the acute accent,
individually and simultaneously.  These reflect the independence of
nasalization and accentuation in the languages.  Note that in most Siouan
languages, whose vowel sets are aeiou or aeio or aeiu, only a, i and o or
u can be nasalized.  There are also esh and the like, gamma, glottal stop,
and a few others that are less common.

I'm using Siouan languages as an example, because I work with them
directly, but a slightly more general problem of the same sort exists with
Tanoan languages where in addition to combining ogonek with all vowels
(usually a six vowel set) one must allow for three or four different
diacritics like acute, grave and circumflex, maybe macron, to handle tone
marking.  The specifics vary with the language and the orthography.

Note that Americanists tend to use ogonek for nasalization specifically to
prevent messy diacritic combinations above the vowel.

Unicode has from its inception allowed the encoding of all of these
situations, using base + diacritic sequences.  As compromises have
arisen in the course of the adoption of Unicode as an international
standard, some of the combinations required have also become available in
whole or in part as "precomposed" combinations assigned to particular
code points in the Unicode character set.  Presumably a base + diacritic
encoding remains simpler and is logically preferable.  The precomposed
characters are intended mainly to satisfy the needs of certain European
languages.
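
To make the distinction concrete, here is a minimal sketch in Python using
the standard unicodedata module.  It assumes nothing beyond the Unicode
character database shipped with Python; the variable names are only
illustrative.  It shows that a nasalized vowel has a precomposed form,
while a vowel that is both nasalized and accented can only be written
with at least one combining mark.

```python
import unicodedata

OGONEK = "\u0328"  # COMBINING OGONEK (the nasal hook)
ACUTE = "\u0301"   # COMBINING ACUTE ACCENT

# Pure base + diacritic encoding of a nasalized, accented "a".
nasal_accented_a = "a" + OGONEK + ACUTE

# NFC composes wherever a precomposed code point exists; NFD decomposes fully.
nfc = unicodedata.normalize("NFC", nasal_accented_a)
nfd = unicodedata.normalize("NFD", nasal_accented_a)

# "a" + ogonek alone has a precomposed form (U+0105).
nasal_a = unicodedata.normalize("NFC", "a" + OGONEK)

# The same marks typed in the other order normalize to the same string.
swapped = unicodedata.normalize("NFC", "a" + ACUTE + OGONEK)

for label, text in [("NFC", nfc), ("NFD", nfd), ("nasal a", nasal_a)]:
    print(label, [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text])
```

In NFC the result is U+0105 (a with ogonek) followed by a combining acute,
since there is no single code point for the full combination.  Rendering
that trailing combining mark correctly is exactly what depends on the
font and the rendering engine.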

However, the potential for encoding Siouan languages easily in Unicode has
not so far provided any very great temptation to do so for day to day use.
The technology needed to render and manipulate the base + diacritic
encoding in arbitrary text processing software in Windows and/or Unix has
not actually been available, so there has been no practical way of making
day to day use of the Unicode encoding scheme.

Clearly there has always been some long term, archival benefit to using
it.  Again, in the absence of any assurance that base + diacritic
representations would ever be actually supported, there has been very
little temptation to take this route, though it has been considered.

Matters have been gradually improving, of course, and it sounds as if
within a few years it will actually be possible to use the base +
diacritic representation with most software, and so preferable to use the
Unicode representation with the Siouan languages in day to day contexts.

In the meantime, Siouanists (and Tanoanists) make do with
discipline-specific solutions using customized TrueType character sets for
printing and display and customized keyboards (which would be needed in
any case) to enter them.  Clearly the ability to search with tools that
are aware of vowel + diacritic combinations is mostly lacking, though some
software provides ways to define abstract character matching entities
that allow this.
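
As an illustration of what diacritic-aware matching could look like once
text is in Unicode, here is a rough Python sketch.  The function names and
the sample strings are invented for the example (the strings are not real
Siouan words), and ignoring accent while keeping nasalization is just one
possible matching policy.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

def accent_blind_key(text):
    """Return a search key that ignores accent but keeps nasalization."""
    # Decompose so every diacritic is a separate combining character.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop only the acute; the ogonek (nasal hook) is kept.
    kept = "".join(ch for ch in decomposed if ch != ACUTE)
    # Recompose so a nasal vowel is one unit, distinct from the plain vowel.
    return unicodedata.normalize("NFC", kept).casefold()

def find_words(words, query):
    """Return the words whose key contains the key of the query."""
    key = accent_blind_key(query)
    return [w for w in words if key in accent_blind_key(w)]

# Invented sample strings: nasal + accented, accented only, nasal only.
samples = ["ka\u0328\u0301", "k\u00e1", "k\u0105"]

print(find_words(samples, "k\u0105"))  # matches the two nasal forms
print(find_words(samples, "ka"))       # matches only the oral form
```

Recomposing with NFC before comparing is what keeps a search for plain
"a" from matching nasal "a" by accident.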

> The only problem I see is with accent combinations which are not
> precomposed in Unicode.

Which occur one or two to a word in Siouan and Tanoan languages.

> Although these can be rendered properly in Windows, Microsoft is holding
> back from a more general release of the Uniscribe rendering engine
> (usp10.dll) required to do this properly.

Of course, computerized work with the Siouan languages goes back to a
point when Unicode didn't exist, and Unicode's failure to date to become a
practical solution for these languages hasn't unduly harmed either work
on these languages or Unicode.  At one point it wasn't clear to me that
Unicode would ever be any more than archivally useful with them.  But as
it has become an international standard in the face of initial ISO
opposition, and as the technology implementing "16 bit" characters and now
base + diacritic representation has actually wormed its way into the
mainstream, I've become more sanguine about this!

> I would think that if suitable pressure is applied by Native American
> communities, which I believe are receiving support from the Gates
> Foundation, Microsoft just might be persuaded to release more widely the
> updated version of Uniscribe.

These remarks will probably prove controversial!

In fact most of the pressure would be coming from the community of
students of Native American languages, which is not quite the same thing.
Native American communities are not very well organized collectively,
especially with regard to linguistic issues like language preservation and
written language.  Native American groups vary in their perception of the
utility or desirability of written as opposed to spoken language.  Some
groups are opposed to writing, though most are not, and some are very
positive about it.  A fair number have created orthographies or spread
them well in advance of any outside interest in writing their languages.

Whatever the theoretical interest or disinterest is in writing, there is
typically a certain amount of passive disinterest in writing systems,
perhaps because written systems are alien to the linguistic traditions
of most groups, perhaps because they are so aggressively associated with
the colonial adstratum, forced education, etc.

Note as one practical difficulty that the politically active elements in
most communities are English-using, and sometimes English-monolingual.  I
have heard of cases where bilingual tribal governments conducted their
business in their native language, and recorded the decisions in English.

Orthography itself can be a divisive issue.  For various reasons,
including those above, many groups experience great political
difficulties in devising and adopting standard written systems.  Another
serious political difficulty not mentioned above is that many languages
are shared by several traditionally autonomous communities which today
have separate and autonomous local governments.  There is, for example, no
pan-Dakotan political entity.  So there is no pan-Dakotan entity to adopt
a standard Dakota orthography.

A cultural difficulty is that even in the case of languages that more or
less require diacritics and additional exotic characters to make use of
Latin-based orthographies there is a good deal of resistance to using
these, at least among groups accustomed to English as a written language.
The logic is that "real written languages" (English) don't need diacritics
or special characters, so their use with the native language must also be
unnecessary, unnatural and, in any event, certainly unappealing.  When you
add the historical difficulties one experiences in typing and displaying
such symbols with a typewriter or now a computer, the case is more or less
closed.  So, because there is no acceptable solution, there can be no
solution at all.

This is a rather pessimistic assessment that overlooks the great strides
that have been taken in support of Native American native-language
literacy.  Nonetheless, it's a bit strange to one observing the process to
hear a suggestion that Native American communities should collectively
pressure Microsoft to make software more readily available that would
facilitate the use of orthographies full of diacritics!





