[Lexicog] Sort Order for Lakota language

Jan F. Ullrich jfu at CENTRUM.CZ
Fri May 26 07:28:29 UTC 2006


Dear lexicographers,

I would like to ask your opinion/advice on an issue that we have to decide
in our Lakhota Language dictionary project. It concerns the treatment of
digraphs in sort order. I know that this was discussed on the list not very
long ago and the generally accepted approach seemed to be that dictionaries
should be sorted by letters and not by digraphs as units.
But if I may, I would like to describe the specifics of the situation for
the Lakota language. Voiceless plain stops (c, k, p, t) and voiceless aspirated
stops (ch, kh, ph, th) are phonemically very salient in the language.
However, the early missionaries who were the first to put the language into
writing wrote all stops with c, k, p, t. Although there are quite a few
linguistic materials and a college level textbook (by Rood and Taylor) that
mark aspiration consistently, none of the existing dictionaries does. This
is one of the reasons why native speakers are used to writing and seeing the
language written without marking aspiration.
Several years ago we, the Lakota Language Consortium (LLC, a native-based
non-profit organization), started working on a new dictionary that
would standardize an orthography using h for marking aspiration. There is
quite a lot of community support for this, but for many speakers marking
aspiration remains a very "weird thing" when they see it written.
We are now close to publishing the first Lakota dictionary with consistent
phonemic spelling and the sort order is an issue.

There are three sets of digraphs in the orthography chosen by LLC:

ch, kh, ph, th (aspirated stops)
c', k', p', t' (glottalized stops)
aN, iN, uN (nasal vowels, where 'N' stands for 'eta' in non e-mail writing)

The situation with glottalized stops is further complicated by the fact that
glottal stop frequently occurs elsewhere, where it is not a part of a
digraph.

In the discussions that we have had in our team about the sort order we have
come up with three sort order options so far:

A) sorting by all characters as individual units
B) sorting by digraphs as units
C) sorting by letters and making the "h" for aspiration "invisible" for sort
order

Here is an example to illustrate the options:

A)           B)              C)
nata         nata            nata
nathipa      natitaN         nathipa
natho        nathipa         natitaN
natitaN      natho           natho
nat'a        nat'a           nat'a
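
For anyone who wants to try the three options on a word list, they could be
expressed as sort keys roughly as follows (Python). This is only a sketch
under stated assumptions: the alphabet string, the placement of "N" after "n"
and of the apostrophe after all letters, the filing of each digraph directly
after its base letter, and the names key_a, key_b and key_c are all chosen
here just to reproduce the three columns above. They are not a settled
ordering for the dictionary.

```python
import re

# Assumed alphabet, for illustration only: basic Latin letters, with "N"
# (the nasal marker) placed after "n" and the apostrophe placed last.
LETTERS = list("abcdefghijklmnNopqrstuvwxyz'")

# Option B units: each digraph is filed directly after its base letter
# (an assumption that matches t < th < t' in column B).
DIGRAPHS_AFTER = {
    "a": ["aN"], "i": ["iN"], "u": ["uN"],
    "c": ["ch", "c'"], "k": ["kh", "k'"],
    "p": ["ph", "p'"], "t": ["th", "t'"],
}
UNITS = []
for letter in LETTERS:
    UNITS.append(letter)
    UNITS.extend(DIGRAPHS_AFTER.get(letter, []))

RANK_A = {ch: i for i, ch in enumerate(LETTERS)}
RANK_B = {unit: i for i, unit in enumerate(UNITS)}

# Split a word into units: digraphs where possible, single characters otherwise.
TOKEN = re.compile(r"[ckpt][h']|[aiu]N|.")


def key_a(word):
    """Option A: every character is an individual sorting unit."""
    return [RANK_A[ch] for ch in word]


def key_b(word):
    """Option B: digraphs are sorting units of their own."""
    return [RANK_B[unit] for unit in TOKEN.findall(word)]


def key_c(word):
    """Option C: an "h" after c, k, p, t is invisible to the sort order."""
    stripped = re.sub(r"(?<=[ckpt])h", "", word)
    # The full spelling is used only as a tie-breaker (an assumption),
    # so a plain stop precedes its aspirated twin.
    return (key_a(stripped), key_a(word))


WORDS = ["nat'a", "natho", "natitaN", "nata", "nathipa"]
print(sorted(WORDS, key=key_a))  # nata, nathipa, natho, natitaN, nat'a
print(sorted(WORDS, key=key_b))  # nata, natitaN, nathipa, natho, nat'a
print(sorted(WORDS, key=key_c))  # nata, nathipa, natitaN, natho, nat'a
```

A word containing a character outside the assumed alphabet raises a KeyError
in this sketch, so a real word list would need the full letter inventory.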


Option A) treats all characters equally in sorting. Option B) seems the most
consistent with a linguistic approach, because it is based on the relation
between orthography and phonology and provides the basis for introducing a
sound notion of "grapheme".
Option C) treats h-aspiration as a diacritic that the sort order disregards
(certain diacritics seem to be treated this way in quite a few languages,
for example in my native language Czech, the umlaut in German, accented
vowels in French, etc.).
The strong argument for option C) is that it would allow speakers to find
words exactly where they expect them, natitaN before natho, as if aspiration
were not marked. But it seems to me that in the long run it might discourage
actual usage of h-aspiration in writing. And if speakers get used to writing
h-aspiration, it might seem confusing to disregard it in sort order. Not to
mention that regular computer users won't be able to sort word lists
automatically.
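
To illustrate the last point with a small sketch (Python, with the five
example words from above as the only data): a built-in sort compares
characters by their code points, so it always "sees" the h, and it also
places the apostrophe before all letters. The variable names are just
illustrative.

```python
words = ["nata", "nathipa", "natitaN", "natho", "nat'a"]

# Built-in sorting compares Unicode code points: the apostrophe (U+0027)
# and uppercase "N" (U+004E) both come before every lowercase letter.
default_order = sorted(words)
print(default_order)
# ["nat'a", 'nata', 'nathipa', 'natho', 'natitaN']

# Under option C) natitaN should precede natho, but the default sort
# keeps natho first because "h" sorts before "i".
h_is_visible = default_order.index("natho") < default_order.index("natitaN")
print(h_is_visible)  # True
```

Getting the option C) order therefore requires a custom sort key or collation
rule, which most word processors and spreadsheets do not offer out of the box.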
So it looks like option C) is advantageous for the short-term acceptance of
the dictionary by native speakers, but I have my doubts that it is the best
solution in the long term. B) would probably be the least acceptable option
for native speakers (all digraphs would have their own sections in the
dictionary and many words would not be where the speakers expect them). So I
guess I am slightly in favor of A). It will probably cause native speakers
to have problems in searching for words, but it is consistent and clear as
long as one accepts the orthography.

I would be grateful to hear your opinion or experience with similar issues.

Jan


Jan F. Ullrich
Lakota Language Consortium
www.lakhota.org
e-mail: jfu at lakhota.org, jfu at centrum.cz




