forum

William J Poser wjposer at LDC.UPENN.EDU
Wed Feb 27 00:15:28 UTC 2008


Andrew,

I agree, except that it DOES matter whether a character is available
precomposed. The problem of multiple representations is indeed solved
by normalization, though it is taking a while for normalization
libraries to become available for all languages, and for all the
software that should be using them to actually do so. But even with
normalization, it is an additional pain to process text in which some
characters require two or three codepoints while others require only
one. Not that it can't be done, but it makes life more difficult.
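
To make the point concrete, here is a rough sketch in Python using the
standard unicodedata module. The helper name codepoint_counts and the
sample characters are my own choices for illustration, not anything
from the earlier discussion. It shows that normalization makes the two
spellings of an accented letter compare equal, but a character with no
precomposed form still takes several codepoints even after NFC.

```python
import unicodedata


def codepoint_counts(text):
    """Return (NFC length, NFD length) of text, counted in codepoints.

    Hypothetical helper for illustration only.
    """
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)
    return len(nfc), len(nfd)


# e + combining acute: a precomposed form (U+00E9) exists,
# so NFC collapses the pair into a single codepoint.
e_acute = "e\u0301"

# q + combining acute: no precomposed form exists,
# so it stays two codepoints under both NFC and NFD.
q_acute = "q\u0301"

# U+01D8 (u with diaeresis and acute): one codepoint precomposed,
# three codepoints once fully decomposed.
u_diaeresis_acute = "\u01d8"

for label, sample in [("e+acute", e_acute),
                      ("q+acute", q_acute),
                      ("u+diaeresis+acute", u_diaeresis_acute)]:
    nfc_len, nfd_len = codepoint_counts(sample)
    print(f"{label}: NFC={nfc_len} NFD={nfd_len}")

# Normalization does solve the multiple-representation problem:
# the two spellings differ as raw strings but match after NFC.
same_after_nfc = (unicodedata.normalize("NFC", "\u00e9")
                  == unicodedata.normalize("NFC", "e\u0301"))
print("equal after NFC:", same_after_nfc)
```

So code that wants to step through text one "character" at a time
cannot assume one codepoint per character, even on normalized input,
whenever the orthography uses combinations that have no precomposed
form.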

Bill



More information about the Ilat mailing list