Off-topic Re: Sorting by Case

A. Vine avine at ENG.SUN.COM
Mon Mar 13 22:28:03 UTC 2000


Michael Horlick wrote:
>
>      Even if the American Heritage dictionary catalogs words in a
> lowercase-then-uppercase manner, the heading at the top of each section
> reads 'A a' and 'B b' -- not the other way around.  Uppercase first would
> seem the better choice to me -- when applicable, that is -- many languages
> don't even have capitals.
>
>      The order throughout the rest of the standard ('rest' meaning
> 'non-English' languages) seems to have a capital then lowercase mentality --
> at least according to the copy of UNICODE 3.0 that I have (unless some big
> decision was made at the last conference in Amsterdam at which, alas, I was
> not an attendee.)
>
>      As many characters in the standard will have multiple mappings, it
> would seem logical that no one great order could be achieved.  I'm not
> entirely sure that what English does (in terms of case-sorting) should
> influence the course of the entire standard as a whole.
>
>      Beyond that, unless I'm mistaken, the reference for 'a', for example,
> is named 'Latin lowercase a' -- not 'English lowercase a'...why does an
> English standpoint even matter?

We were discussing English sorting, and also multilingual Western European
sorting.  The question of uppercase/lowercase order was specifically about
English, though.  This has nothing to do with Unicode, the standard.  The list is
convenient for other i18n issues.

The order in the character set has to do with history - Unicode did not choose
the positions of A-Z vs. a-z; they were adopted indirectly from ASCII.  Why
those positions are where they are in ASCII is unknown to me, but I doubt there
was linguistic research in sorting done to determine it.  Incidentally, EBCDIC
has the opposite order.
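
To make the ASCII vs. EBCDIC point concrete, here is a minimal Python sketch.
It assumes the cp037 code page (EBCDIC US/Canada) as a stand-in for "EBCDIC";
other EBCDIC code pages exist, and the helper names and sample words are only
illustrative.  It sorts the same words by raw encoded bytes, with no
collation rules applied:

```python
# Compare where 'A' and 'a' fall in ASCII and in one EBCDIC code page.
# Assumption: cp037 represents "EBCDIC" here; it is one of several variants.

def byte_value(ch, encoding):
    """Return the single-byte value of one character in the given encoding."""
    encoded = ch.encode(encoding)
    if len(encoded) != 1:
        raise ValueError("expected a single-byte character")
    return encoded[0]

def binary_sort(words, encoding):
    """Sort words by their raw encoded bytes (no linguistic collation)."""
    return sorted(words, key=lambda w: w.encode(encoding))

words = ["apple", "Apple", "banana", "Banana"]

ascii_order = binary_sort(words, "ascii")    # uppercase block comes first
ebcdic_order = binary_sort(words, "cp037")   # lowercase block comes first

print(hex(byte_value("A", "ascii")), hex(byte_value("a", "ascii")))
print(hex(byte_value("A", "cp037")), hex(byte_value("a", "cp037")))
print(ascii_order)
print(ebcdic_order)
```

Neither result is a dictionary order; both simply reflect where each
character set happened to place its uppercase and lowercase blocks.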

Andrea
--
Andrea Vine, avine at eng.sun.com, iPlanet i18n architect
Guilty feet have got no rhythm.
-- George Michael


