[sw-l] Unicode

Stephen Slevinski slevinski at SIGNWRITING.ORG
Sat Dec 11 23:03:15 UTC 2004


Based on my assessment, three things need to happen.

First, the IMWA needs a home in Unicode.
----------------------------------------
Unicode contains 17 planes.  Plane 0 is the Basic Multilingual Plane (BMP).
Plane 1 is the Supplementary Multilingual Plane (SMP). Plane 2 is the
Supplementary Ideographic Plane (SIP). Plane 14 is the Supplementary
Special-Purpose Plane (SSP). Planes 15 and 16 are the Private Use Planes.

The other planes (3-13) are currently unassigned, and will probably remain
that way until Planes 1, 2, and 14 start to fill up.

Currently SignWriting has been given rows D8, D9, DA, and DB (U+1D800
through U+1DBFF) in Plane 1, the Supplementary Multilingual Plane. Since
each row contains 256 characters, SignWriting has been allocated 1024
characters.  This is not enough.
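
To make the arithmetic concrete, here is a minimal Python sketch of how
plane and row numbers turn into code point ranges. The function name
row_range is my own invention for illustration; the only facts it relies on
are that a plane holds 65,536 code points and a row holds 256.

```python
ROW_SIZE = 256        # code points per row
PLANE_SIZE = 0x10000  # code points per plane (256 rows of 256)


def row_range(plane, first_row, last_row):
    """Return the (first, last) code points covered by a run of rows.

    Hypothetical helper for illustration only.
    """
    start = plane * PLANE_SIZE + first_row * ROW_SIZE
    end = plane * PLANE_SIZE + last_row * ROW_SIZE + ROW_SIZE - 1
    return start, end


# Rows D8..DB in Plane 1: the current SignWriting allocation.
start, end = row_range(1, 0xD8, 0xDB)
count = end - start + 1
print(f"U+{start:X}..U+{end:X} holds {count} code points")

# A whole plane (all rows of Plane 3), for comparison with the
# roughly 25 thousand characters in the IMWA.
p3_start, p3_end = row_range(3, 0x00, 0xFF)
print(f"U+{p3_start:X}..U+{p3_end:X} holds {p3_end - p3_start + 1} code points")
```

Four rows give 1024 code points, while a full plane gives 65,536, which is
why the IMWA would fit in one plane but not in the current allocation.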

The best case is that the IMWA is assigned plane 3.  The IMWA contains about
25 thousand characters.  This number will increase by several thousand in
the near future.  The upper limit on the IMWA is not known.  When more
characters are needed, they will be added.

So I assume the first step would be working with Michael Everson to get
plane 3 assigned to the IMWA.  I don't know Michael's fees, and I don't know
how to persuade the Unicode Consortium to hand over an entire plane when
Planes 1, 2, and 14 are not yet full.

Second, create an encoding scheme between Unicode and SSS
---------------------------------------------------------
We need to map the IMWA onto Unicode.  Since the IMWA will grow over time,
we need to leave blanks in the encoding.  This encoding needs to be
bi-directional: from Unicode to SSS and from SSS to Unicode.
Example:
Unicode character 1005 on plane 3 = SSS 02-01-001-01-03-01
SSS 06-01-003-01-03-06 = Unicode character 4564 on plane 3

Most applications that use Unicode will need to be able to implement this
encoding scheme, so simplicity (or an available code library) will be
needed.
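
One simple way to get a bi-directional mapping that tolerates blanks is a
plain lookup table in both directions. The Python sketch below is only an
illustration under several assumptions: the two table entries are the
example pairs above (not a real assignment), the numbers 1005 and 4564 are
read as decimal offsets from the start of plane 3, and the names
SSS_TO_OFFSET, sss_to_char, and char_to_sss are hypothetical.

```python
PLANE_3_BASE = 0x30000  # first code point of plane 3

# Hypothetical table: only the two example pairs from this message.
# Offsets are assumed to be decimal positions within plane 3.
SSS_TO_OFFSET = {
    "02-01-001-01-03-01": 1005,
    "06-01-003-01-03-06": 4564,
}

# Reverse table, built once so both directions stay in sync.
OFFSET_TO_SSS = {offset: sss for sss, offset in SSS_TO_OFFSET.items()}


def sss_to_char(sss):
    """Map an SSS id to its character; raises KeyError if unmapped."""
    return chr(PLANE_3_BASE + SSS_TO_OFFSET[sss])


def char_to_sss(ch):
    """Map a character back to its SSS id, or None for a blank slot."""
    return OFFSET_TO_SSS.get(ord(ch) - PLANE_3_BASE)


ch = sss_to_char("02-01-001-01-03-01")
print(f"U+{ord(ch):X}", char_to_sss(ch))
```

Because unmapped offsets simply have no table entry, blanks left for future
growth cost nothing, and new symbols can be added without renumbering the
existing ones.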

Third, create the font
----------------------
The last step is straightforward.  Every character in the IMWA is converted
to a glyph in the font and assigned the corresponding Unicode code point.
It should be possible to automate this process initially.  But future fonts
will need to be recreated by hand.

In theory, we can skip step 1 and complete steps 2 and 3.  Since no one is
officially using plane 3, we can always get approval after we have it
working.  But I don't know the politics involved.

From what I've been able to put together, I think that's it.
-Stephen

-----Original Message-----
From: owner-sw-l at majordomo.valenciacc.edu
[mailto:owner-sw-l at majordomo.valenciacc.edu]On Behalf Of Sandy Fleming
Sent: Saturday, December 11, 2004 4:11 PM
To: sw-l at majordomo.valenciacc.edu
Subject: [sw-l] Unicode


Stuart wrote:

> There is plenty of space in Unicode for all of the IMWA. I was talking
> with some people who are involved with Unicode for non-Roman languages
> and I asked that specific question. And they said it would be no
> problem technically.  We just have to do the work to finish what
> Valerie and Michael Everson began.

So what do we actually have to do?

Sandy
