ISWA 2010 data change for proposed Unicode string
Valerie Sutton
sutton at SIGNWRITING.ORG
Fri Apr 29 17:52:02 UTC 2011
SignWriting List
April 29, 2011
Thank you, Steve, for this report and for all your hard work on this issue -
And thank you, Jonathan, too, for working on updating the code for your SignWriter Studio program -
We look forward to using the new software programs that will result from this -
Val ;-)
------
On Apr 29, 2011, at 10:29 AM, Steve Slevinski wrote:
> Hi Jonathan and list,
>
> I am making a small change that will only affect programmers and back end data.
>
> We are almost off the bleeding edge. The Unicode proposal requires a change to the SignPuddle data. After this change, I do not plan any additional changes. A future and final conversion may be needed for a Unicode compromise agreement. No changes are planned for the ISWA 2010 itself.
>
> I will be updating my documents, code libraries, and test data over the next few days.
>
> The primary change moves the fill and rotation codepoints 14 ahead into different code chart rows. This leaves 14 spaces for new root symbols to be added in future proposals. Fill codepoints will start at U+1DA9A and rotation codepoints will start at U+1DAA0. If a Unicode string for a symbol is 3 codepoints long, the 1st character remains the same, but the 2nd and 3rd will change. Each will advance 14 codepoints.
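
A minimal sketch of the 14-codepoint shift described above, in Python. The old start positions are inferred by subtracting 14 from the new ones given in the message. The function name and the base codepoint used in testing are hypothetical placeholders, not taken from the SignPuddle code.

```python
# Sketch only: shift the fill and rotation characters of a 3-codepoint
# symbol string ahead by 14, leaving the first (base) character alone.
SHIFT = 14

NEW_FILL_START = 0x1DA9A                       # from the message
NEW_ROTATION_START = 0x1DAA0                   # from the message
OLD_FILL_START = NEW_FILL_START - SHIFT        # inferred: 0x1DA8C
OLD_ROTATION_START = NEW_ROTATION_START - SHIFT  # inferred: 0x1DA92


def shift_symbol(symbol: str) -> str:
    """Return the converted symbol string (hypothetical helper)."""
    if len(symbol) != 3:
        # Only 3-codepoint strings are covered by the described change.
        return symbol
    base, fill, rotation = symbol
    return base + chr(ord(fill) + SHIFT) + chr(ord(rotation) + SHIFT)
```

The conversion is meant to run once over old data; applying it twice would move the characters 28 positions.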
>
> Michael Everson made this change in the Unicode proposal. It's a good change, so I'm including it in the SignPuddle online data conversion.
>
> He is writing a new draft that affects the Unicode world but not the SignWriting world. All hand root symbols will appear using the first (empty) palm facing for Unicode code charts. The new draft isn't ready yet, but Michael's previous draft is online.
> http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4015.pdf
>
> A secondary change in the proposal concerns the character count, but it will not affect the proposed symbol strings that I use. Instead of proposing 674 new codepoints, we will be proposing 672. This compromise will leave holes in the code charts for fill-1 and rotation-1. Unicode strings for symbols will assume fill-1 if a symbol string does not include a fill character, and assume rotation-1 if a symbol string does not include a rotation character. A proposed symbol string will be 1, 2, or 3 characters long. If approved by the Unicode committees, we will achieve 99.7% of the goal (672 of 674 codepoints) and take a huge step forward in standardization.
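
The defaulting rule above could be read as follows, in Python. This sketch assumes the 6 fills and 16 rotations of the ISWA 2010, placed at the start positions given in the message. The function name is hypothetical.

```python
# Sketch only: read a 1-, 2-, or 3-character symbol string and fill in
# fill-1 / rotation-1 when the corresponding character is absent.
FILL_1 = 0x1DA9A       # from the message (a hole in the proposed charts)
ROTATION_1 = 0x1DAA0   # from the message (a hole in the proposed charts)
FILL_COUNT = 6         # assumption: ISWA 2010 fills 1..6
ROTATION_COUNT = 16    # assumption: ISWA 2010 rotations 1..16


def parse_symbol(symbol: str):
    """Return (base character, fill number, rotation number)."""
    if not 1 <= len(symbol) <= 3:
        raise ValueError("a symbol string is 1, 2, or 3 characters long")
    base, rest = symbol[0], symbol[1:]
    fill, rotation = 1, 1  # defaults when no modifier is present
    if rest and FILL_1 <= ord(rest[0]) < FILL_1 + FILL_COUNT:
        fill = ord(rest[0]) - FILL_1 + 1
        rest = rest[1:]
    if rest and ROTATION_1 <= ord(rest[0]) < ROTATION_1 + ROTATION_COUNT:
        rotation = ord(rest[0]) - ROTATION_1 + 1
        rest = rest[1:]
    if rest:
        raise ValueError("unexpected character after the base symbol")
    return base, fill, rotation
```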
>
> I will not be removing fill-1 and rotation-1 from the test data. I consider removal of fill-1 and rotation-1 to be Unicode normalization. An easy process can search for and delete these 2 characters wherever they exist. The undo process is more complicated.
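
To illustrate why removal is easy and the undo is harder, here is a rough Python sketch. It assumes the text consists only of symbol strings, and that fills occupy U+1DA9A..U+1DA9F and rotations U+1DAA0..U+1DAAF (6 fills and 16 rotations). Both function names are hypothetical.

```python
# Sketch only: strip the two default characters, and put them back.
FILL_1, FILL_LAST = "\U0001DA9A", "\U0001DA9F"          # range is an assumption
ROTATION_1, ROTATION_LAST = "\U0001DAA0", "\U0001DAAF"  # range is an assumption


def strip_defaults(text: str) -> str:
    """Removal: a plain search-and-delete of the two characters."""
    return text.replace(FILL_1, "").replace(ROTATION_1, "")


def restore_defaults(text: str) -> str:
    """Undo: walk the text symbol by symbol and re-insert what is missing."""
    out = []
    i = 0
    while i < len(text):
        base = text[i]  # assumed to be a base symbol character
        i += 1
        fill = FILL_1
        if i < len(text) and FILL_1 <= text[i] <= FILL_LAST:
            fill = text[i]
            i += 1
        rotation = ROTATION_1
        if i < len(text) and ROTATION_1 <= text[i] <= ROTATION_LAST:
            rotation = text[i]
            i += 1
        out.append(base + fill + rotation)
    return "".join(out)
```

The removal needs no knowledge of symbol boundaries, while the undo has to parse every symbol string to decide which character is absent.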
>
> The removal of the fill-1 character breaks sorting and complicates searching. The easy way to fix sorting is to use the fill-1 character rather than an empty slot. This solution works for any environment, such as mobile, desktop, web browser, and server.
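
A small Python illustration of the sorting problem, under the same assumed ranges (fills at U+1DA9A..U+1DA9F, rotations at U+1DAA0..U+1DAAF). The base codepoint and the key function are hypothetical. Two strings for the same base symbol swap order once fill-1 is dropped, and a sort key that puts fill-1 and rotation-1 back restores the expected order.

```python
# Sketch only: compare plain codepoint sorting with a key that expands
# missing modifiers to fill-1 / rotation-1 before comparing.
FILL_1, FILL_LAST = "\U0001DA9A", "\U0001DA9F"          # range is an assumption
ROTATION_1, ROTATION_LAST = "\U0001DAA0", "\U0001DAAF"  # range is an assumption
BASE = "\U0001D800"  # placeholder base symbol

# Normalized forms, with the default characters removed:
fill1_rot2 = BASE + "\U0001DAA1"  # fill-1 implied, rotation-2
fill2_rot1 = BASE + "\U0001DA9B"  # fill-2, rotation-1 implied


def sort_key(symbol: str) -> str:
    """Expand one symbol string to its full 3-character form."""
    fill, rotation = FILL_1, ROTATION_1
    for ch in symbol[1:]:
        if FILL_1 <= ch <= FILL_LAST:
            fill = ch
        elif ROTATION_1 <= ch <= ROTATION_LAST:
            rotation = ch
    return symbol[0] + fill + rotation


naive_order = sorted([fill1_rot2, fill2_rot1])
fixed_order = sorted([fill1_rot2, fill2_rot1], key=sort_key)
```

Here `naive_order` places the fill-2 string first, because the rotation-2 character compares higher than the fill-2 character; `fixed_order` places the fill-1 string first.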
>
> If the first proposal is successful, I plan to champion a second proposal to add Fill-1 and Rotation-1 as control characters that complete the set. These characters are useful for programmers. Fill-1 and Rotation-1 characters facilitate easier, reusable generic code. They eliminate the need to repeatedly test for and handle exceptions.
>
> Regards,
> -Steve
>