Asian Character Sets and the Insufficiency of Unicode

Grant Barrett gbarrett at WORLDNEWYORK.ORG
Wed Jun 6 02:39:36 UTC 2001

Link stolen from Slashdot:

Unicode, the semi-commercial equivalent of UCS-2 (ISO 10646-1), has been widely
assumed to be a comprehensive solution for electronically mapping all the characters of
the world's languages. It is a 16-bit character definition, which allows a theoretical
total of 65,536 (2^16) characters. However, the complete character sets of the world add up
to over 170,000 characters. This paper summarizes the political turmoil and
technical incompatibilities that are beginning to manifest themselves on the Internet as a
consequence of that oversight. (For the more technically inclined: Unicode 3.1 won't
work either.)

