[sw-l] challenge for programmers - SSS-ID mapping onto Unicode

Thu Jun 23 03:08:03 UTC 2005

Hi Tomas,

Rotation is needed in the IMWA because "rotation" doesn't always mean
geometric rotation.  Load up SignMaker and drop a face, then press rotate.

Or try the contact symbol and press rotate.

Rotation means mathematical rotation only for handshapes and some arrows.

The more we discuss this, the less value I see in Unicode.  If sorting
and special commands (Variations, Mirror, Fills, and Rotations) can only
be done with the SSS ID numbers, what value does Unicode offer?

There may be a psychological factor, but if someone doesn't accept sign
language or SignWriting to start with, using the term Unicode isn't
going to convince them of anything.

Even once the IMWA is encoded into Unicode, none of the existing word
processors will be able to properly use the symbols to create signs.
Word processors use Unicode character by character in a linear
sequence.  SignWriting uses symbols in space.  Of course we will not
encode the XY coordinates in Unicode, but the symbols only become signs
once we put the symbols in space using XY coordinates.
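
A minimal sketch of that point, assuming a made-up data model: the class
name, the helper and the symbol codes below are hypothetical and are not
real Unicode assignments.  Two different signs can contain the same symbols
and differ only in placement, so a linear sequence of codes cannot tell
them apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlacedSymbol:
    code: int  # hypothetical 16-bit symbol code, invented for illustration
    x: int     # horizontal position inside the sign box
    y: int     # vertical position inside the sign box


def linear_text(sign):
    """Return only the symbol codes, in order, dropping the XY layout.

    This is roughly what a character-by-character word processor keeps.
    """
    return [symbol.code for symbol in sign]


# Same two symbols, placed differently: two distinct signs.
sign_a = [PlacedSymbol(0x0101, 10, 5), PlacedSymbol(0x0202, 30, 40)]
sign_b = [PlacedSymbol(0x0101, 30, 40), PlacedSymbol(0x0202, 10, 5)]

print(sign_a != sign_b)                            # the signs differ
print(linear_text(sign_a) == linear_text(sign_b))  # the code sequences do not
```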

-Steve

Tomas Klapka wrote:

>Hi all ;)
>
>I like this discussion, it is very interesting.
>
>If I say "IMWA finished" I don't mean the IMWA finished finished :)
>I think it could be mapped onto Unicode before it is "finished" because the
>term "finished" is very relative.
>If I understand it well, the IMWA is meant to be an alphabet for all possible
>movements. It will take ages to finish it.
>I agree that we can take some "stable" part of the IMWA and map it onto
>Unicode.
>There can be gaps which are mapped later; such positions are called
>"reserved".
>
>Well, I think it is possible to map the IMWA onto Unicode soon, starting with
>the part of the IMWA which is stable and is never supposed to change.
>
>Every new Unicode standard comes with additions and new characters.
>
>Val, I don't think it is necessary to map characters by frequency in
>languages. In my opinion, symbols should be mapped by stability. If there
>is a group of stable symbols, it is possible to map them onto Unicode, and it
>doesn't matter whether they are in SSS order or frequency order, because an
>SSS-ID from/to Unicode conversion table has to be used anyway if only
>16 bits are available for the IMWA now. 16 bits are not enough to map the
>SSS-ID.
>Is 16 bits the maximum given to the IMWA? I think Unicode has a mechanism to
>encode more bits.
>But if there are only 16 bits, there is no way IMHO.
>
>I don't think it is necessary to map the IMWA in SSS order. Plenty of
>characters aren't mapped in any particular order in Unicode. The Czech
>alphabet is an example: its special characters (with diacritical marks) are
>in the EXTENDED LATIN chart, and there is a Czech letter 'ch' which is ordered
>between 'h' and 'i', not somewhere close to the letter 'c' (between 'cg' and
>'ci'). I think 'ch' is not mapped onto Unicode at all, because it is just a
>linear sequence of two existing Unicode symbols, 'c' and 'h'.
>
>Well, I have more ideas and opinions which came to mind.
>
>Now, I don't think rotation should be mapped onto Unicode, because the
>purpose of Unicode is to give a unique number to a symbol. If you mirror the
>symbol, or rotate it by 90 degrees, it is still the same symbol and the same
>font can be used (it is easy if the fonts are vector). I can rotate
>Latin text to any angle in a standard word processor.
>If x, y coordinates aren't supposed to be in Unicode, why should
>rotation be?
>
>Well, if we have 65,536 values for SignWriting in Unicode to map, we can
>count...
>
>There are 6 Fills which need to be mapped:
>
>65,536 / 6 = 10,922 values for base-symbols (Category-Group-Symbol-Variation)
>6 different Fills can be mapped onto 3 bits, which can hold 8 values
>(so 2 values are not used).
>I think those 2 values can be reserved for any additional Fill invented in the
>future, or they could be used for special characters, or another script could
>be adopted there.
>If we use those 3 bits for the Fill, there are 13 bits left for base-symbols.
>13 bits can hold 8,192 values (2^13, or 65,536 / 8).
>
>Now there are 425 base-symbols used in IMWA 2004.
>
>I think there will never be more than 8,192 base-symbols in the IMWA, and if
>there are, it will be in the far future and those symbols could be mapped in
>another Unicode layer. Or the reserved space I mentioned when giving 3 bits
>to Fills could be used: the 2 Fill values which aren't used now (7 and 8)
>could indicate that 3 other bits hold the Fill. That leaves 10 bits, so 1,024
>more base-symbols for each of the two unused values, 2,048 in total. It gives
>10,240 values for base-symbols and 682 values which can be reserved for
>special purposes.
>
>Now if it is done as I say... there is a need for an SSS-ID from/to Unicode
>conversion table.
>
>We can have a table of 10,922 values (or 8,192, or 10,240, depending on the
>mapping).
>
>If the SSS-ID without rotation (which I don't think should be in Unicode)
>is represented by the mask xx-xx-xxx-xx-xx, the highest value is
>99-99-999-99-99, so the fields have 100-100-1000-100-100 possible values
>(counting zero).
>100 values fit in 7 bits and 1000 values fit in 10 bits, so
>it is 7+7+10+7+7 bits = 38 bits, which fits in 5 bytes (40 bits).
>
>Our Unicode has 16 bits = 2 Bytes.
>
>Well, each row of the conversion table can be stored in 5 bytes.
>And 10,922 rows * 5 bytes = 54,610 bytes (about 54 kB) for the whole table.
>
>It can be stored more economically if the table is packed at the bit level
>(not at the byte level as I've written above).
>There are two spare bits in the SSS-ID representation (without rotation): we
>need 38, but filling whole bytes takes 40 bits.
>If the last 3 bits are for the Fill, those are the same 3 bits as in Unicode,
>so there is no need to convert them. So it is 31 bits for the SSS-ID and
>13 bits for Unicode, which is 44 bits.
>44 bits * 8,192 values = 360,448 bits, which is 45,056 bytes (44 kB).
>That is a minimal space saving, so I think a bit-level table is useless,
>because it could be slower to manipulate at the bit level.
>
>If we use a textual representation for the SSS-ID (still without rotation), I
>mean "ccggsssvvf" where cc is for category, gg is for group, etc. (for example
>0102025013 is 01-02-025-01-03), it takes 10 bytes plus 2 bytes for Unicode.
>That is 12 bytes * 10,922 rows = 131,064 bytes (128 kB) for the table.
>
>Well, it is up to the renderer how it converts the SSS-ID from/to Unicode. I
>think 54,610 bytes is a fine conversion table, which could be fast to
>search/seek/manage.
>But maybe I am wrong ;-)
>
>Some base-symbols have fewer than 6 Fills... the unused Fills would be blank
>(as is sometimes seen in the current Unicode).
>
>I think it is important to map IMWA onto Unicode because of the
>standardization and implementation.
>Sure, the Unicode is not the only solution for SW expansion, but I just feel
>it as I feel it ;o)
>
>Thanks,
>
>Tomas
>
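
The 13 + 3 bit split proposed in the quoted message can be sketched as
follows.  This is only an illustration under stated assumptions: the function
names are hypothetical, the Fill is numbered from 0 to 5 here, and the Fill
sits in the last (lowest) 3 bits as the message suggests.

```python
FILL_BITS = 3                      # 3 bits hold 8 values, 6 of them used
BASE_BITS = 16 - FILL_BITS         # 13 bits left for the base-symbol
FILL_MASK = (1 << FILL_BITS) - 1   # 0b111
USED_FILLS = 6                     # Fill values 6 and 7 stay reserved


def pack_symbol(base_index, fill):
    """Combine a base-symbol index and a Fill into one 16-bit value."""
    if not 0 <= base_index < (1 << BASE_BITS):
        raise ValueError("base-symbol index does not fit in 13 bits")
    if not 0 <= fill < USED_FILLS:
        raise ValueError("Fill must be 0..5; 6 and 7 are reserved")
    return (base_index << FILL_BITS) | fill


def unpack_symbol(value):
    """Split a 16-bit value back into (base_index, fill)."""
    return value >> FILL_BITS, value & FILL_MASK


print(1 << BASE_BITS)              # 8192 base-symbols fit in 13 bits
print(pack_symbol(424, 5))         # last of 425 base-symbols, last Fill
print(unpack_symbol(pack_symbol(424, 5)))
```

Because the Fill occupies the same 3 bits on both sides, a conversion table
built this way only has to translate the 13-bit part.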

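The field widths and table sizes in the quoted message can be checked with a
short sketch.  The field names follow the xx-xx-xxx-xx-xx mask; the helper
names and the big-endian 5-byte row layout are assumptions made for
illustration, not an agreed format.

```python
# (field name, number of possible values) per the xx-xx-xxx-xx-xx mask
FIELDS = (("category", 100), ("group", 100), ("symbol", 1000),
          ("variation", 100), ("fill", 100))


def bits_needed(n_values):
    """Smallest number of bits that can hold n_values distinct values."""
    return (n_values - 1).bit_length()


WIDTHS = [bits_needed(n) for _, n in FIELDS]
TOTAL_BITS = sum(WIDTHS)
ROW_BYTES = (TOTAL_BITS + 7) // 8   # round up to whole bytes


def encode_sss(values):
    """Pack (category, group, symbol, variation, fill) into one table row."""
    packed = 0
    for (name, n_values), width, value in zip(FIELDS, WIDTHS, values):
        if not 0 <= value < n_values:
            raise ValueError(f"{name} out of range: {value}")
        packed = (packed << width) | value
    return packed.to_bytes(ROW_BYTES, "big")


def decode_sss(row):
    """Unpack one table row back into the five SSS-ID fields."""
    packed = int.from_bytes(row, "big")
    values = []
    for width in reversed(WIDTHS):
        values.append(packed & ((1 << width) - 1))
        packed >>= width
    return tuple(reversed(values))


ROWS = 65536 // 6                   # 10,922 base-symbols with 6 Fills each
print(WIDTHS, TOTAL_BITS, ROW_BYTES)
print(ROWS * ROW_BYTES)             # binary rows of 5 bytes
print(ROWS * (10 + 2))              # 10 text bytes plus 2 bytes for Unicode
```

The binary table comes to 54,610 bytes and the textual one to 131,064 bytes,
matching the figures in the message.
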
-------------- next part --------------
A non-text attachment was scrubbed...
Name: moz-screenshot-11.jpg
Type: image/jpeg
Size: 2966 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/sw-l/attachments/20050622/32ad1109/attachment.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: moz-screenshot-12.jpg
Type: image/jpeg
Size: 2382 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/sw-l/attachments/20050622/32ad1109/attachment-0001.jpg>