Tue Feb 26 23:53:42 UTC 2008

Hi Mia,

On 27/02/2008, Mia Kalish <MiaKalish at learningforpeople.us> wrote:

>   So when you go to sort, you get
>  a weird sequences of output sequences, and the average user can't grok
>  what's happening.

This is a software issue: collation routines should be able to work
on sequences of multiple characters, not just single characters.

If data is normalised, and you have software that has properly
implemented Unicode collation and allows you to specify
language-specific collation, it should be possible to sort a letter
that includes a combining diacritic correctly. After all, some
languages need to be able to sort digraphs and trigraphs correctly
as well.

It's a software limitation, not a Unicode issue.
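
A minimal sketch of the idea in Python, assuming a made-up alphabet
in which "ą" (a with a hook) sorts directly after "a" and the digraph
"ch" is a single letter sorting after "c". The alphabet, the names
ALPHABET and collation_key, and the handling of unknown characters
are all hypothetical. It is a single-level simplification, not the
multi-level Unicode Collation Algorithm, and is only meant to show
that normalised input plus multi-character matching gives a sensible
order.

```python
import unicodedata

# Hypothetical alphabet order, for illustration only: "ą" sorts right
# after "a", and the digraph "ch" is one letter that sorts after "c".
ALPHABET = ["a", "ą", "b", "c", "ch", "d", "e", "ę", "h", "i", "o", "z"]

# Normalise the alphabet to NFD so that it matches normalised input,
# whether a letter was typed precomposed or as base + combining mark.
ELEMENTS = [unicodedata.normalize("NFD", s) for s in ALPHABET]
WEIGHT = {s: i for i, s in enumerate(ELEMENTS)}
MAX_LEN = max(len(s) for s in ELEMENTS)


def collation_key(word):
    """Split a word into collation elements, longest match first,
    and return the list of their weights."""
    text = unicodedata.normalize("NFD", word.lower())
    key = []
    i = 0
    while i < len(text):
        for size in range(MAX_LEN, 0, -1):
            chunk = text[i:i + size]
            if chunk in WEIGHT:
                key.append(WEIGHT[chunk])
                i += len(chunk)
                break
        else:
            # Assumption: unknown characters sort after the known
            # alphabet, ordered by code point.
            key.append(len(WEIGHT) + ord(text[i]))
            i += 1
    return key


words = ["cz", "cha", "ca", "ąb", "az", "ab"]
print(sorted(words))                     # plain code point order
print(sorted(words, key=collation_key))  # tailored order
```

A plain code point sort pushes "ąb" to the end and puts "cha" before
"cz". With the key function, "ąb" lands after the other "a" words and
"cha" after "cz", and the result is the same whether the hooked vowel
was entered precomposed or as a base letter plus combining mark.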

>  As for your third paragraph about the Unicode consortium: It is pretty much
>  an exact rendition of the conversation(s) I had with them specifically about
>  this issue. They seem to really, truly believe that if the glyph looks
>  right, the character is also right (character = glyph + code).
>  Now, is a cedilla different from the little hook? Whoever he is, he doesn't
>  go in the same horizontal location for all the vowels. When we develop the
>  glyphs by hand, even if we use the Unicode/Font glyphs that are already
>  existing, we have to eyeball them in so they look nice. It also makes a
>  difference whether the fonts have serifs or not (que serif, serif, whatever
>  will b, we'll c ... )

Likewise, this is a font design issue more than a Unicode issue.

I tend to distinguish between things that the UTC need to do to get
things right and things that developers haven't got right (including
font developers).

Andrew
-- 
Andrew Cunningham
Vicnet Research and Development Coordinator
State Library of Victoria
Australia

andrewc at vicnet.net.au
lang.support at gmail.com