ISWA 2010 data change for proposed Unicode string

Fri Apr 29 17:29:38 UTC 2011

Hi Jonathan and list,

I am making a small change that will only affect programmers and back 
end data.

We are almost off the bleeding edge.  The Unicode proposal requires a 
change to the SignPuddle data.  After this change, I do not plan any 
additional changes.  A future and final conversion may be needed for a 
Unicode compromise agreement.  No changes are planned for the ISWA 2010 
itself.

I will be updating my documents, code libraries, and test data over the 
next few days.

The primary change moves the fill and rotation codepoints 14 positions 
ahead into different code chart rows.  This leaves 14 spaces for new 
root symbols to be added in future proposals.  Fill codepoints will 
start at U+1DA9A and rotation codepoints will start at U+1DAA0.  If a 
Unicode string for a symbol is 3 codepoints long, the 1st character 
remains the same, but the 2nd and 3rd will each advance 14 codepoints.
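
A minimal Python sketch of the shift described above.  The function 
name, the placeholder base codepoint, and the choice to leave strings of 
other lengths untouched are assumptions for illustration, not part of 
the SignPuddle conversion code.

```python
SHIFT = 14  # distance the fill and rotation codepoints move

def advance_symbol(old: str) -> str:
    """Advance the 2nd and 3rd codepoints of a 3-codepoint symbol string."""
    if len(old) != 3:
        # Assumption: only 3-codepoint symbol strings need converting.
        return old
    base, fill, rotation = old
    return base + chr(ord(fill) + SHIFT) + chr(ord(rotation) + SHIFT)

BASE = "\U0001D800"  # hypothetical placeholder for a root symbol codepoint

# The old first fill and first rotation sit 14 below the new starting points.
old = BASE + chr(0x1DA9A - SHIFT) + chr(0x1DAA0 - SHIFT)
new = advance_symbol(old)
print([f"U+{ord(c):04X}" for c in new])  # ['U+1D800', 'U+1DA9A', 'U+1DAA0']
```

The shift is applied by position rather than by codepoint range, which 
avoids misreading a value that could belong to either the old or the new 
layout.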

Michael Everson made this change in the Unicode proposal.  It's a good 
change, so I'm including it in the SignPuddle online data conversion.

He is writing a new draft that affects the Unicode world but not the 
SignWriting world.  All hand root symbols will appear using the first 
(empty) palm facing for Unicode code charts.  The new draft isn't ready 
yet, but Michael's previous draft is online.
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4015.pdf

A secondary change in the proposal concerns the character count, but it 
will not affect the proposed symbol strings that I use.  Instead of 
proposing 674 new codepoints, we will be proposing 672.  This compromise 
will leave holes in the code charts for fill-1 and rotation-1.  Unicode 
strings for symbols will assume fill-1 if a symbol string does not 
include a fill character, and assume rotation-1 if a symbol string does 
not include a rotation character.  A proposed symbol string will be 1, 
2, or 3 characters long.  If approved by the Unicode committees, we will 
achieve 99.7% of the goal and take a huge step forward in standardization.
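
The defaulting rule can be sketched in Python as follows.  The function 
name `expand` and the placeholder base codepoint used in the usage note 
are hypothetical.  The sketch also assumes that any modifier at or past 
U+1DAA0 is a rotation, since the upper end of the rotation range is not 
stated here.

```python
FILL_1 = 0x1DA9A      # first fill codepoint (a hole in the 672 proposal)
ROTATION_1 = 0x1DAA0  # first rotation codepoint (also a hole)

def expand(symbol: str) -> str:
    """Return base + fill + rotation, supplying fill-1 / rotation-1 if omitted."""
    if not 1 <= len(symbol) <= 3:
        raise ValueError("a symbol string is 1, 2, or 3 characters long")
    base, modifiers = symbol[0], symbol[1:]
    fill, rotation = chr(FILL_1), chr(ROTATION_1)  # the assumed defaults
    for ch in modifiers:
        cp = ord(ch)
        if FILL_1 <= cp < ROTATION_1:
            fill = ch
        elif cp >= ROTATION_1:
            # Assumption: everything from ROTATION_1 upward is a rotation.
            rotation = ch
        else:
            raise ValueError(f"unexpected modifier U+{cp:04X}")
    return base + fill + rotation
```

For example, with a placeholder base character `b`, `expand(b)` yields 
`b` followed by fill-1 and rotation-1, and `expand(b + chr(0x1DAA1))` 
yields `b`, fill-1, and the rotation at U+1DAA1.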

I will not be removing fill-1 and rotation-1 from the test data.  I 
consider removal of fill-1 and rotation-1 to be Unicode normalization.  
An easy process can search for and delete these 2 characters wherever 
they exist.  The undo process is more complicated.
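
The removal direction really is a plain search and delete.  A minimal 
Python sketch, where the function name `normalize` and the placeholder 
base codepoint are hypothetical:

```python
FILL_1 = "\U0001DA9A"      # first fill codepoint
ROTATION_1 = "\U0001DAA0"  # first rotation codepoint

def normalize(text: str) -> str:
    """Delete every fill-1 and rotation-1 character from the text."""
    return text.replace(FILL_1, "").replace(ROTATION_1, "")

BASE = "\U0001D800"  # hypothetical placeholder for a root symbol codepoint
print(normalize(BASE + FILL_1 + ROTATION_1) == BASE)  # True
```

Going the other way, a process presumably has to find where each symbol 
string begins and ends before it can tell which of the two characters is 
missing, which is why it takes more than a search and replace.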

The removal of the fill-1 character breaks sorting and complicates 
searching.  The easy way to fix sorting is to use the fill-1 character 
rather than an empty slot.  This solution works for any environment, 
such as mobile, desktop, web browser, and server.
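
One way the sorting breakage can show up, sketched in Python under two 
assumptions: strings are compared in plain codepoint order, and fill-2 
and rotation-2 immediately follow fill-1 and rotation-1.  The base 
codepoint is a hypothetical placeholder.

```python
BASE = "\U0001D800"  # hypothetical placeholder for a root symbol codepoint
FILL_1, FILL_2 = chr(0x1DA9A), chr(0x1DA9B)  # assumption: fill-2 follows fill-1
ROT_1, ROT_2 = chr(0x1DAA0), chr(0x1DAA1)    # assumption: rotation-2 follows rotation-1

# The same two symbols, written with explicit fill-1 / rotation-1 ...
full = [BASE + FILL_2 + ROT_1, BASE + FILL_1 + ROT_2]
# ... and with those two characters removed.
stripped = [s.replace(FILL_1, "").replace(ROT_1, "") for s in full]

# With explicit characters, the fill-1 symbol sorts first.
print(sorted(full)[0] == BASE + FILL_1 + ROT_2)   # True
# With the characters removed, the fill-2 symbol sorts first instead.
print(sorted(stripped)[0] == BASE + FILL_2)       # True
```

Once fill-1 is omitted, the rotation character lands in the second slot 
and compares higher than any fill character, so the fill-1 symbol falls 
behind symbols with other fills.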

If the first proposal is successful, I plan to champion a second 
proposal to add Fill-1 and Rotation-1 as control characters that 
complete the set.   These characters are useful for programmers.  Fill-1 
and Rotation-1 characters facilitate easier, reusable generic code.  
They eliminate the need to repeatedly test for and handle exceptions.

Regards,
-Steve