[sw-l] challenge for programmers

Steve Slevinski slevin at SIGNPUDDLE.NET
Wed Jun 22 19:21:18 UTC 2005


Hi Stuart,

I'm still trying to understand this complex topic myself.  I'm glad that
people all over the world are working on this.  I do not have the
answers, only opinions.

Here are 4 topics that will need to be addressed if you want to use
Unicode with the IMWA.

1) A rendering engine can handle the rotations, but not the fills for
handshapes.  The fills are irregular.  Look at symbols 01-01-007-01 and
01-01-008-01.  The base symbol (fill 1 and rotation 1) is the same for
each.  The fills are different.  Each handshape has 16 rotations.  8 for
the right hand and 8 for the left hand.  So for each handshape symbol,
you will need to define the 6 base fills.  From these 6 bases, you can
rotate and flip the image to come up with the 96 images needed for each
handshape.
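
The counting above can be sketched in Python.  This is only an
illustration: the function name is made up, and the numbering (fills
1-6, rotations 1-16 with 1-8 for the right hand and 9-16 for the left)
is an assumption for the sketch, not taken from the IMWA key file.

```python
from itertools import product

FILLS = range(1, 7)       # the 6 base fills that have to be defined per handshape
ROTATIONS = range(1, 17)  # 16 rotations (assumed: 1-8 right hand, 9-16 left hand)

def handshape_variants(base_id):
    """Yield (sss_id, fill, rotation) for every image of one handshape.

    base_id is the 4-part symbol ID, e.g. "01-01-007-01"; the fill and
    rotation are appended as the 5th and 6th parts.
    """
    for fill, rotation in product(FILLS, ROTATIONS):
        yield f"{base_id}-{fill:02d}-{rotation:02d}", fill, rotation

variants = list(handshape_variants("01-01-007-01"))
print(len(variants))  # 96
```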

2) Include all 25 thousand plus symbols of the IMWA in Unicode.  The
IMWA is an alphabet used for sorting.  Here are 4 handshapes sorted
alphabetically.


If you only included the symbol's base in Unicode and ignored the
rotations, you would not be able to sort.  These symbols would all have
the same 16-bit Unicode value.  You would need to add the rotation as an
additional 4 bits.  In essence you would be using 20 bits to store unique
handshapes, so why are you using Unicode?
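
A minimal sketch of the 20-bit idea, assuming a hypothetical 16-bit
base value and a rotation index of 0-15 (both are stand-ins, not real
IMWA or Unicode assignments).  Putting the rotation in the low 4 bits
keeps symbols with the same base next to each other when sorted.

```python
def pack(base_code, rotation):
    """Combine a 16-bit base value and a 4-bit rotation into one 20-bit key."""
    if not 0 <= base_code < (1 << 16):
        raise ValueError("base_code must fit in 16 bits")
    if not 0 <= rotation < 16:
        raise ValueError("rotation must fit in 4 bits")
    return (base_code << 4) | rotation

def unpack(key):
    """Split a 20-bit key back into (base_code, rotation)."""
    return key >> 4, key & 0xF

# Three rotations of the same (hypothetical) base now get distinct keys
# that sort by rotation.
keys = sorted(pack(0x1234, r) for r in (3, 0, 15))
print([unpack(k) for k in keys])
```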

3) You need a bi-directional conversion from SSS ID to Unicode number
and back again.  This is not as easy as it sounds and will make working
with the IMWA ridiculously difficult for the lay programmer.

Let's consider symbol 01-06-002-01-02-01.  In Unicode this might be
character 13431.


With SignWriter and SignMaker, we've learned that the special commands
are very important and very powerful.  If we want to change the fill of
the symbol, it is very easy to do with the SSS ID number.  Add one to
the fill position and make sure the symbol exists: 01-06-002-01-03-01
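
As a rough sketch of that fill command in Python: the function name and
the KNOWN_SYMBOLS set are hypothetical, and the set is just a stand-in
for whatever list of existing symbols a program already has.

```python
def next_fill(sss_id, known_symbols):
    """Return the SSS ID with the fill (5th part) increased by one,
    or None if that symbol does not exist."""
    parts = sss_id.split("-")
    if len(parts) != 6:
        raise ValueError("expected a 6-part SSS ID")
    parts[4] = f"{int(parts[4]) + 1:02d}"  # fill is the 5th part
    candidate = "-".join(parts)
    return candidate if candidate in known_symbols else None

# Stand-in for the real symbol list (only two entries for the example).
KNOWN_SYMBOLS = {"01-06-002-01-02-01", "01-06-002-01-03-01"}

print(next_fill("01-06-002-01-02-01", KNOWN_SYMBOLS))
```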


Using the Unicode value of 13431, we have 2 options.  Convert to SSS ID
and then back again, or work directly with the Unicode value.  The new
value would be 13447 (13431 + 16), but that's because I know that all 16
rotations are being used.  The IMWA is irregular with fills and rotations
for non-handshape symbols, so determining the correct number to add to
the Unicode value is not straightforward.  Since 16 bits is insufficient
for a simple (one-line) conversion between SSS ID and unique number, the
conversion would be a nightmare that would need to be recreated in every
programming language or explicitly defined in a database or conversion
file(s).  The database option would be best, but would require over 25
thousand entries and require all SignWriting applications to use a
database.  A flat file conversion would be over 4 MB.  A good option
might be 50,000 small files (2 files for each symbol), but that would
require 8 MB of disk space.
01-06-002-01-02-01.txt = 13431
13431.txt = 01-06-002-01-02-01
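
The flat file option might look something like the sketch below.  The
"SSS ID,number" line format and the names are assumptions made for the
example, and 13431 and 13447 are only the guessed values used above,
not real Unicode assignments.

```python
import io

# Stand-in for a conversion file with one "SSS ID,number" pair per line.
TABLE_TEXT = """\
01-06-002-01-02-01,13431
01-06-002-01-03-01,13447
"""

def load_table(stream):
    """Read the conversion file into two dictionaries, one per direction."""
    to_code, to_sss = {}, {}
    for line in stream:
        line = line.strip()
        if not line:
            continue
        sss_id, code = line.split(",")
        to_code[sss_id] = int(code)
        to_sss[int(code)] = sss_id
    return to_code, to_sss

to_code, to_sss = load_table(io.StringIO(TABLE_TEXT))
print(to_code["01-06-002-01-02-01"])  # 13431
print(to_sss[13447])                  # 01-06-002-01-03-01
```

Every program would still need the whole table loaded before it could
convert a single symbol, which is the cost described above.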

4) We will always need the X,Y coordinates when using the IMWA.  A while
back, I discussed this with Antonio Carlos.  We don't believe there is
any other viable solution.  Some signs require exact symbol
positioning.  However, YMMV.
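
To make the point concrete, a sign could be stored as a list of symbols
with positions.  This is a sketch only: the class name, the field names
and the coordinates are invented for the example and do not describe
SWML or any existing file format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlacedSymbol:
    sss_id: str  # which symbol
    x: int       # horizontal position of the symbol within the sign
    y: int       # vertical position of the symbol within the sign

def position_bounds(sign):
    """Return (min_x, min_y, max_x, max_y) over the symbol positions.

    Symbol sizes are not known here, so this covers positions only.
    """
    xs = [s.x for s in sign]
    ys = [s.y for s in sign]
    return min(xs), min(ys), max(xs), max(ys)

# A made-up sign: the same symbols at other positions would be a
# different drawing, which is why the coordinates must be stored.
sign = [
    PlacedSymbol("01-06-002-01-02-01", 10, 5),
    PlacedSymbol("01-01-007-01-01-01", 40, 32),
]
print(position_bounds(sign))
```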

-----------------------------------------------------------------------------------

And a few last thoughts...

The term Unicode font is confusing and mixes 2 different ideas.

Unicode is nothing more than identifying a unique mental character with
a unique number.  Unicode number 65 is the letter A, but not the A on
the screen, but the idea of the letter A.

A font can identify a unique number with a unique physical
representation.  A font file can take the number 65 and draw a picture
of the letter A.
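
The split between the number and the picture shows up in most
programming languages.  A small Python illustration (nothing here is
specific to SignWriting):

```python
import unicodedata

code_point = 65
character = chr(code_point)  # the abstract character for number 65

# The program only knows the number and the character's name.  What the
# A looks like on screen is decided later by whichever font draws it.
print(code_point, unicodedata.name(character))
print(ord(character) == code_point)
```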

Unicode can work with fonts, but you can use fonts without Unicode.  I
believe that 45-bit SSS ID encoding is the best option if you are using
fonts.  OpenType or SVG Fonts look like the best choices.  But these
files would be huge!

When I tackle SVG, I will replace all of the PNG files in the IMWA with
SVG files.  So instead of 25,000 graphics, I will have 25,000 SVG
files.  I will use the current key file, which is about 100k.  It is an
elegant solution and very accessible and has none of the shortcomings
and complications of a Unicode implementation.

----------------------------

I do not think that "draw" should be a taboo word within SignWriting.
My handwriting improved when I started to draw the letters of the
English alphabet.  I started to look at the marks I was making on the
page and I would aesthetically compare what I was drawing with how the
letters should look.  For me, drawing deals with how things look.
Writing deals with what things mean.  Good handwriting is drawing the
alphabet while writing your thoughts.  You cannot draw a sentence.

Enough rambling.  I need to get back to work.
-Steve

Stuart Thiessen wrote:

> See comments below ...
>
> On Jun 21, 2005, at 21:15, Valerie Sutton wrote:
>
>>
>> OK. What about SVG? I remember years ago, Antonio Carlos came to
>> visit me from Brazil, and was eager to explain both SWML and SVG to
>> me...I remember feeling amazed at the possibilities when he showed me
>> a SignWriting symbol being drawn on the web in front of my eyes in
>> SVG...Now that we see that SWML is really becoming important, I
>> wonder if SVG isn't next?
>>
>
> SVG is a way of encoding images in XML.  So SWML and SVG would be
> another approach.  However, if you want to think about it politically,
> it would be perceived as the difference between writing SW and drawing
> SW.  Unicode would be perceived as writing SW.  SVG could be perceived
> as drawing SW.  That doesn't mean that SVG is a bad technology for
> SW.  Anything and everything that can help us make SW more available
> is important to use.  I'm just talking about the politics and how
> hearing might perceive it.
>
>> That does not mean that I don't think Unicode is a terrific idea..it
>> is just that Unicode takes money and time, and if PNG display is the
>> only alternative right now, then maybe SVG could be another
>> alternative until Unicode is available for SignWriting?
>>
>
> Yes it certainly is one alternative we should consider.
>
>> Did you know that the French have interest in developing a way to
>> apply SignWriting to Unicode? I wonder if Mr. Dalle and Mr. Aznar
>> from France wouldn't be interested in working with SIL on the Unicode
>> project? Do you think SIL could be interested?...
>
>
> While I can't discuss any details at this time because many details
> are still in the air, I know that both SIL and Pass It On Services
> will want to see the development of an international team so that
> various perspectives and SL backgrounds are included.  We have not
> gotten to the details on that point, so I can't make any firm
> statements one way or another at this time.  But we would certainly
> be open to their involvement.
>
>>
>> Thanks for your patience with me and all those symbols in the
>> IMWA!...I actually am not necessarily in favor of placing the whole
>> IMWA into Unicode. I think we should do a Symbol-Frequency test on
>> dictionaries to pin down the symbols that you really are using, and
>> then use the Language-specific symbolset to be the first SignWriting
>> Unicode...in other words...Unicode US, Unicode NO, etc...based on
>> only those SignWriting symbols used in one language...why slow down
>> the Unicode development for SignWriting, just  because DanceWriting
>> has not been entered into the IMWA yet? And is there really a Unicode
>> for music sounds? No. So why should DanceWriting be in
>> Unicode?...Unicode should be for SignWriting specific to one sign
>> language...
>
>
> Actually, I do believe musical notation is already in Unicode 3.1 in
> blocks (U+1D000 to U+1D0FF) and (U+1D100 to U+1D1FF).  All the symbols
> are available, but Unicode itself does not specify how the symbols are
> to be used.  That is left to the program and the renderer itself.
>
> So, yes, I actually do think that we should have the entire IMWA
> available in Unicode. Unicode does not divide characters by language,
> but by writing system.  So you have the Latin or Roman characters in
> one space, and the Greek writing in another space, and the Japanese in
> another space.  But the separation is based on writing system, not on
> languages themselves.
>
> The same would be true of SignWriting.  All of the symbols should be
> available in Unicode at some point. Then different sign languages can
> use the code points that relate to their sign language. So that would
> make SignWriting very handy because it wouldn't matter what kind of
> movement writing you are doing as long as the software knows how to
> assemble the symbols into SignWriting or DanceWriting or whatever.  I
> could write in DGS or LSM or ASL and it wouldn't matter.  That is the
> purpose of Unicode anyway.
>
> Thanks,
>
> Stuart
>
>
>


