ISWA 2010 data change for proposed Unicode string
Jonathan
duncanjonathan at YAHOO.CA
Fri Apr 29 18:04:15 UTC 2011
On 29/04/2011 11:29 AM, Steve Slevinski wrote:
> Hi Jonathan and list,
>
> I am making a small change that will only affect programmers and back
> end data.
>
> We are almost off the bleeding edge. The Unicode proposal requires a
> change to the SignPuddle data. After this change, I do not plan any
> additional changes. A future and final conversion may be needed for a
> Unicode compromise agreement. No changes are planned for the ISWA
> 2010 itself.
>
> I will be updating my documents, code libraries, and test data over
> the next few days.
>
> The primary change moves the fill and rotation codepoints 14 ahead
> into different code chart rows. This leaves 14 spaces for new root
> symbols to be added in future proposals. Fill codepoints will start
> at U+1DA9A and Rotation codepoints will start at U+1DAA0. If a
> Unicode string for a symbol is 3 codepoints long, the 1st character
> remains the same, but the 2nd and 3rd will change. Each will advance
> 14 codepoints.
>
> Michael Everson made this change in the Unicode proposal. It's a good
> change, so I'm including it in the SignPuddle online data conversion.
This sounds like a good improvement to me too.
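
The 14-codepoint shift described above can be sketched in a few lines of
Python. This is only an illustration, not Steve's conversion code: the
function name shift_symbol is hypothetical, and it assumes each input is a
single symbol string in the old layout, 1 to 3 codepoints long, with the
root symbol first.

```python
SHIFT = 14  # offset stated in the announcement


def shift_symbol(symbol: str) -> str:
    """Advance the 2nd and 3rd codepoints of one symbol string by 14.

    Hypothetical helper: assumes `symbol` uses the old layout and holds
    1 to 3 codepoints, the first being the root symbol (left unchanged).
    """
    if not 1 <= len(symbol) <= 3:
        raise ValueError("expected a symbol string of 1 to 3 codepoints")
    head, tail = symbol[0], symbol[1:]
    return head + "".join(chr(ord(ch) + SHIFT) for ch in tail)
```

If fills start at U+1DA9A and rotations at U+1DAA0 after the change, they
would have started 14 earlier (U+1DA8C and U+1DA92) before it, so a
3-codepoint string ending in those two characters should map onto the new
starting points. Any root codepoint used to try this out is a placeholder.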
>
> He is writing a new draft that affects the Unicode world but not the
> SignWriting world. All hand root symbols will appear using the first
> (empty) palm facing for Unicode code charts. The new draft isn't
> ready yet, but Michael's previous draft is online.
> http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4015.pdf
>
> A secondary change in the proposal is regarding character count, but
> will not affect the proposed symbol strings that I use. Instead of
> proposing 674 new codepoints, we will be proposing 672. This
> compromise will leave holes in the code charts for fill-1 and
> rotation-1. Unicode strings for symbols will assume fill-1 if a
> symbol string does not include a fill character, and assume
> rotation-1 if a symbol string does not include a rotation character.
> A proposed symbol string will be 1, 2, or 3 characters long. If
> approved by the Unicode committees, we will achieve 99.7% of the goal
> and take a huge step forward in standardization.
Why are they proposing to remove the fill-1 and rotation-1 codepoints? Are
they assuming that these fills and rotations are used most often, so that
documents get shorter when symbols can be 1 or 2 characters long? Or what
exactly is their motivation?
>
> I will not be removing fill-1 and rotation-1 from the test data. I
> consider removal of the fill-1 and rotation-1 as Unicode
> normalization. An easy process can search for and delete these 2
> characters wherever they exist. The undo process is more complicated.
>
> The removal of the fill-1 character breaks sorting and complicates
> searching. The easy way to fix sorting is to use the fill-1 character
> rather than an empty slot. This solution works for any environment,
> such as mobile, desktop, web browser, and server.
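
A rough Python sketch of the two directions and of the sorting problem
follows. It is an illustration under stated assumptions, not the
SignPuddle code: fill-1 is taken to sit at U+1DA9A and rotation-1 at
U+1DAA0 (the announced starting points), with 6 fills and 16 rotations,
and the text is assumed to contain nothing but well-formed symbol strings.
The names normalize and expand are hypothetical.

```python
FILL_1, ROTATION_1 = 0x1DA9A, 0x1DAA0          # assumed starting points
FILLS = range(FILL_1, FILL_1 + 6)              # assumed: fill-1 .. fill-6
ROTATIONS = range(ROTATION_1, ROTATION_1 + 16) # assumed: rotation-1 .. rotation-16


def normalize(text: str) -> str:
    """Easy direction: delete every fill-1 and rotation-1 character."""
    return text.replace(chr(FILL_1), "").replace(chr(ROTATION_1), "")


def expand(text: str) -> str:
    """Harder direction: re-insert the implied fill-1 and rotation-1.

    Assumes `text` holds only symbol strings, each starting with a root
    symbol; anything that is not a fill or rotation is treated as a root.
    """
    out = []
    i = 0
    while i < len(text):
        out.append(text[i])  # root symbol
        i += 1
        if i < len(text) and ord(text[i]) in FILLS:
            out.append(text[i])
            i += 1
        else:
            out.append(chr(FILL_1))  # fill was omitted, so fill-1 is implied
        if i < len(text) and ord(text[i]) in ROTATIONS:
            out.append(text[i])
            i += 1
        else:
            out.append(chr(ROTATION_1))  # rotation omitted, rotation-1 implied
    return "".join(out)
```

Sorting shows why the explicit characters matter. Take one root with
rotation-2 (fill-1 implied) and the same root with fill-2 (rotation-1
implied). Compared as normalized strings, the fill-2 entry sorts first,
because the fill-2 codepoint is lower than the rotation-2 codepoint.
Sorting with sorted(entries, key=expand) compares fill-1 against fill-2
instead and puts the fill-1 entry first.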
>
> If the first proposal is successful, I plan to champion a second
> proposal to add Fill-1 and Rotation-1 as control characters that
> complete the set. These characters are useful for programmers.
> Fill-1 and Rotation-1 characters facilitate easier, reusable generic
> code. They eliminate the need to repeatedly test for and handle
> exceptions.
So if the first proposal goes through we would have
Fill-2 U+1DA9A
Fill-3 U+1DA9C
Fill-4 U+1DA9D
Fill-5 U+1DA9E
Fill-6 U+1DA9F
Rotation-2 U+1DAA0
Rotation-3 U+1DAA1
Rotation-4 U+1DAA2
Rotation-5 U+1DAA3
Rotation-6 U+1DAA4
Rotation-7 U+1DAA5
Rotation-8 U+1DAA6
Rotation-9 U+1DAA7
Rotation-10 U+1DAA8
Rotation-11 U+1DAA9
Rotation-12 U+1DAAA
Rotation-13 U+1DAAB
Rotation-14 U+1DAAC
Rotation-15 U+1DAAD
Rotation-16 U+1DAAE
Then, with the second proposal, would they change to the layout below,
re-adding Fill-1 and Rotation-1? Or where did you intend to re-insert
Fill-1 and Rotation-1?
Fill-1 U+1DA9A
Fill-2 U+1DA9B
Fill-3 U+1DA9C
Fill-4 U+1DA9D
Fill-5 U+1DA9E
Fill-6 U+1DA9F
Rotation-1 U+1DAA0
Rotation-2 U+1DAA1
Rotation-3 U+1DAA2
Rotation-4 U+1DAA3
Rotation-5 U+1DAA4
Rotation-6 U+1DAA5
Rotation-7 U+1DAA6
Rotation-8 U+1DAA7
Rotation-9 U+1DAA8
Rotation-10 U+1DAA9
Rotation-11 U+1DAAA
Rotation-12 U+1DAAB
Rotation-13 U+1DAAC
Rotation-14 U+1DAAD
Rotation-15 U+1DAAE
Rotation-16 U+1DAAF
Not only does the first proposal make it hard to sort the entries, but it
will also be harder to parse, because a symbol will sometimes be 1
character long, sometimes 2, and sometimes 3. So extra checking has to
be done to verify whether the next character is the start of a new symbol
or a fill and/or rotation modifier.
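
To make the extra checking concrete, here is a minimal Python sketch of
such a parser. It assumes the layout from my second list above (fills at
U+1DA9A to U+1DA9F, rotations at U+1DAA0 to U+1DAAF, with fill-1 and
rotation-1 implied when omitted) and treats every other character as a
root symbol. The function name parse_symbols is made up for the example.

```python
from typing import Iterator, Tuple

FILLS = range(0x1DA9A, 0x1DAA0)      # assumed: fill-1 .. fill-6
ROTATIONS = range(0x1DAA0, 0x1DAB0)  # assumed: rotation-1 .. rotation-16


def parse_symbols(text: str) -> Iterator[Tuple[str, int, int]]:
    """Yield (root, fill, rotation) per symbol; fill and rotation are 1-based."""
    i = 0
    while i < len(text):
        root = text[i]
        if ord(root) in FILLS or ord(root) in ROTATIONS:
            raise ValueError(f"modifier without a root symbol at index {i}")
        i += 1
        fill = rotation = 1  # defaults when the modifier is omitted
        # Look ahead: is the next character a fill modifier?
        if i < len(text) and ord(text[i]) in FILLS:
            fill = ord(text[i]) - FILLS.start + 1
            i += 1
        # Look ahead again: is the next character a rotation modifier?
        if i < len(text) and ord(text[i]) in ROTATIONS:
            rotation = ord(text[i]) - ROTATIONS.start + 1
            i += 1
        yield root, fill, rotation
```

Every symbol needs up to two look-ahead tests. If fill-1 and rotation-1
were always present, the parser could simply read the string three
characters at a time.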
Regards,
Jonathan
>
> Regards,
> -Steve
>
--
email: duncanjonathan at yahoo.ca <mailto:duncanjonathan at yahoo.ca>
joyoduncan at gmail.com <mailto:joyoduncan at gmail.com>
Cel: 9983-1204
Tel: 2213-5285
Skype: yojoduncan
SignWriter Studio <http://www.signwriterstudio.com/>