[sw-l] challenge for programmers
sw at PASSITONSERVICES.ORG
Wed Jun 22 20:38:09 UTC 2005
See comments below ...
On Jun 22, 2005, at 14:21, Steve Slevinski wrote:
> Hi Stuart,
> I'm still trying to understand this complex topic myself. I'm glad
> that people all over the world are working on this. I do not have the
> answers, only opinions.
> Here are 4 topics that will need to be addressed if you want to use
> Unicode with the IMWA.
> 1) A rendering engine can handle the rotations, but not the fills for
> handshapes. The fills are irregular. Look at symbols 01-01-007-01
> and 01-01-008-01. The base symbol (fill 1 and rotation 1) is the same
> for each. The fills are different. Each handshape has 16 rotations.
> 8 for the right hand and 8 for the left hand. So for each handshape
> symbol, you will need to define the 6 base fills. From these 6 bases,
> you can rotate and flip the image to come up with the 96 images needed
> for each handshape.
Actually, we could have a "helper" symbol that the input process
inserts to indicate fill information, but it can work just as well to
leave it in the font itself.
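
To make the arithmetic in Steve's point concrete, here is a minimal
Python sketch that enumerates the 96 images of one handshape from its
6 base fills. The numbering of rotations (1-8 right hand, 9-16 left
hand), the 45 degree step, and the names variant and all_variants are
assumptions for illustration only, not part of the IMWA.

```python
from itertools import product

FILLS = range(1, 7)        # 6 base fills per handshape (from the discussion)
ROTATIONS = range(1, 17)   # 16 rotations; 1-8 right hand, 9-16 left hand (assumed)


def variant(fill, rotation):
    """Describe how one image could be derived from a base fill (hypothetical scheme)."""
    mirrored = rotation > 8              # assumed: left-hand images are flipped copies
    angle = ((rotation - 1) % 8) * 45    # assumed: 8 steps of 45 degrees each
    return {"fill": fill, "rotation": rotation, "angle": angle, "mirrored": mirrored}


def all_variants():
    """Every fill/rotation combination for a single handshape."""
    return [variant(f, r) for f, r in product(FILLS, ROTATIONS)]
```

Whether the renderer or the font does this work, the count is the
same: 6 fills times 16 rotations gives 96 images per handshape.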
> 2) Include all 25 thousand plus symbols of the IMWA in Unicode. The
> IMWA is an alphabet used for sorting. Here are 4 handshapes sorted
> If you only included the symbol's base in Unicode and ignored the
> rotations, you would not be able to sort. These symbols would all
> have the same 16-bit Unicode value. You would need to add the
> rotation as an additional 4-bits. In essence you would be using
> 20-bit to store unique handshapes, so why are you using Unicode
Like I said before, we can use "helper" characters that can provide
this additional information for sorting and other purposes. As long as
the relationship is expressed in a straightforward manner and the input
process and the renderer know how to process this, it is not a problem
for Unicode and we retain the same information. Just a matter of
approaching things differently.
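
As a rough sketch of the "helper" character idea in Python: each
symbol becomes a short sequence of a base character followed by one
helper for the fill and one for the rotation. The code point ranges
below are hypothetical values in the Private Use Area, chosen only for
illustration; no real assignment is implied.

```python
FILL_HELPER_BASE = 0xF000      # hypothetical Private Use Area value
ROTATION_HELPER_BASE = 0xF100  # hypothetical Private Use Area value


def encode(base_cp, fill, rotation):
    """Base character followed by a fill helper and a rotation helper."""
    return (chr(base_cp)
            + chr(FILL_HELPER_BASE + fill)
            + chr(ROTATION_HELPER_BASE + rotation))


def decode(text):
    """Recover (base code point, fill, rotation) from a three-character sequence."""
    base, fill, rotation = text
    return (ord(base),
            ord(fill) - FILL_HELPER_BASE,
            ord(rotation) - ROTATION_HELPER_BASE)
```

Because Python compares strings code point by code point, sorting a
list of these sequences orders symbols by base first, then by fill,
then by rotation, so the sorting information is not lost.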
> 3) You need a bi-directional conversion from SSS ID to Unicode number
> and back again. This is not as easy as it sounds and will make
> working with the IMWA ridiculously difficult for the lay programmer.
Well, I think Tomas has shown that it is possible. Using Tomas'
approach plus the idea of blank rows and possibly using a different
range for supplemental symbols if necessary, we can make things work in
Unicode without much problem. Once a mapping has been derived, then it
is simply a matter of developing libraries or other such things for
programmers so that most of those calculations are done for them and
they can just include the libraries in their code. Not a problem for
the lay programmer if we give them the tools to process it.
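
For example, a library could hide the mapping behind two lookups. The
Python sketch below is only an illustration: the class name
SymbolTable, its methods, and the starting code point are all
hypothetical, and it simply numbers the SSS IDs in the order given
rather than following Tomas' actual scheme.

```python
class SymbolTable:
    """Two-way lookup between SSS IDs and numeric codes (illustrative only)."""

    def __init__(self, sss_ids, first_code=0xE000):
        # Number the IDs in the order given; first_code is an assumed start value.
        self._to_code = {sid: first_code + i for i, sid in enumerate(sss_ids)}
        self._to_sss = {code: sid for sid, code in self._to_code.items()}

    def to_code(self, sss_id):
        """SSS ID string to numeric code; raises KeyError if unknown."""
        return self._to_code[sss_id]

    def to_sss(self, code):
        """Numeric code back to SSS ID string; raises KeyError if unknown."""
        return self._to_sss[code]
```

A programmer would then call table.to_code("01-06-002-01-02-01") or
table.to_sss(code) without ever seeing how the table was built.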
> Let's consider symbol 01-06-002-01-02-01. In Unicode this might be
> character 13431.
> With SignWriter and SignMaker, we've learned that the special
> commands are very important and very powerful. If we want to change
> the fill of the symbol, it is very easy to do with the SSS ID number.
> Add one to the fill position and make sure the symbol exists.
> Using the Unicode value of 13431, we have 2 options. Convert to SSS
> ID and then back again, or work directly with the Unicode value. The
> new value would be 13447, but that's because I know that all 16
> rotations are being used. The IMWA is irregular with fills and
> rotation for non-handshape symbols so determining the correct number
> to add to the Unicode value is not straightforward. Since 16 bits is
> insufficient for a simple (one-line) conversion between SSS ID and
> unique number, the conversion would be a nightmare that would need to
> be recreated in every programming language or explicitly defined in a
> database or conversion file(s). The database option would be best,
> but would require over 25 thousand entries and require all SignWriting
> applications to use a database. A flat file conversion would be over
> 4 MB. A good option might be 50,000 small files (2 files for each
> symbol), but that would require 8 MB of disk space.
> 01-06-002-01-02-01.txt = 13431
> 13431.txt = 01-06-002-01-02-01
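
The "add one to the fill position and make sure the symbol exists"
command described above can be sketched in a few lines of Python. The
function name bump_fill is hypothetical, and the sketch assumes the
fill is the fifth of the six dash-separated fields, as the example ID
suggests.

```python
def bump_fill(sss_id, existing):
    """Return the SSS ID with the fill raised by one, or None if no such symbol exists.

    Assumes the fill is the fifth dash-separated field, zero-padded to 2 digits.
    """
    parts = sss_id.split("-")
    parts[4] = f"{int(parts[4]) + 1:02d}"
    candidate = "-".join(parts)
    return candidate if candidate in existing else None
```

Working on the number directly, the same step is 13431 + 16 = 13447,
but only when all 16 rotations exist for that symbol; for irregular
symbols the offset differs, which is the difficulty being described.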
With summer school, I haven't the time at the moment to work through
your argument in detail. But I generally take the approach that there
are few problems that are too difficult to solve. I believe if we want
a solution, we will find one. If we don't, of course it becomes much
more difficult. ;)
> 4) We will always need the X,Y coordinates when using the IMWA. A
> while back, I discussed this with Antonio Carlos. We don't believe
> there is any other viable solution. Some signs require exact symbol
> positioning. However, YMMV
I agree that X and Y coordinates are needed. I disagree that it has to
be encoded inside the Unicode character itself. The X,Y coordinates
can be separate elements in the linear encoding of a sign written with
Unicode characters. That should not be a problem. But it is the
responsibility of the renderer to resolve that issue and we simply need
to put enough information in the linear stream for the renderer to do
so.
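
One way to picture coordinates as separate elements in a linear stream
is the Python sketch below. The "code@x,y" token format and the
function names serialize and parse are invented for illustration; they
are not an existing SignWriting or Unicode convention.

```python
def serialize(sign):
    """Turn [(code, x, y), ...] into one line of text, e.g. '13431@10,-4 13447@0,12'."""
    return " ".join(f"{code}@{x},{y}" for code, x, y in sign)


def parse(stream):
    """Inverse of serialize: recover the list of (code, x, y) tuples."""
    sign = []
    for token in stream.split():
        code, _, position = token.partition("@")
        x, _, y = position.partition(",")
        sign.append((int(code), int(x), int(y)))
    return sign
```

The character numbers stay free of positioning data, and the renderer
reads each symbol's position from the stream when it lays out the sign.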
> And a few last thoughts...
> The term Unicode font is confusing and mixes 2 different ideas.
Didn't we have the discussion before about technical terms versus lay
terms? ;) I was using this as a lay term, not a technical term. :)
However, there are fonts that are based on Unicode and there are fonts
that are based on the traditional 256-character sets. So it was with
that perspective that I was using the term "unicode font".
> Unicode is nothing more than identifying a unique mental character
> with a unique number. Unicode number 65 is the letter A, but not the
> A on the screen, but the idea of the letter A.
> A font can identify a unique number with a unique physical
> representation. A font file can take the number 65 and draw a
> picture of the letter A.
> Unicode can work with fonts, but you can use fonts without Unicode.
> I believe that 45 bit SSS ID encoding is the best option if you are
> using fonts. OpenType or SVG Fonts look like the best choices. But
> these files would be huge!
We could certainly do our custom encoding of SW in a font that did not
use Unicode. However, we lose the benefit of Unicode which is that a
person can use 1 font in a document and be able to express whatever
language they want. With this vision, a given number means a specific
symbol in a specific writing system and that symbol only. To create a
custom font encoding only for SignWriting defeats the purpose for
which Unicode was created. Before, each language had its own
fonts and its own encoding. That made it difficult to know which font
to use for which program or document. If you didn't have the right font
or encoding, then it was gibberish. To use a custom approach certainly
makes life easier for us, but it doesn't make life easier for those
outside who might receive a SW document and who may not have the font
or encoding for it. If we do our job right with Unicode, it could be
possible that every computer would then be fitted with a font that also
includes SW just like today I have on my Mac fonts for just about any
major language in the world. During my DOS and Windows 3.1 or 95 days,
I couldn't do that. But that is now a benefit of Unicode.
> When I tackle SVG, I will replace all of the PNG files in the IMWA
> with SVG files. So instead of 25,000 graphics, I will have 25,000 SVG
> files. I will use the current key file which is about 100k. It is an
> elegant solution and very accessible and has none of the short-comings
> and complications of a Unicode implementation.
I think you are selling Unicode short. I have nothing against an SVG
approach. After all, TMTOWTDI. However, I think you are side-stepping
the political benefits of Unicode which would help you in the very
discussions for funding that you discussed in a previous email. If
people knew that SignWriting could be handled like any other language
from a Unicode font, that says something about its establishment as a
mainstream writing system. SVG and other approaches (while certainly
worthwhile) keep SignWriting as a niche writing system rather than a
mainstream writing system in the minds of the average person who uses a
writing system on a computer.
> I do not think that "draw" should be a taboo word within
> SignWriting. My handwriting improved when I started to draw the
> letters of the English alphabet. I started to look at the marks I was
> making on the page and I would aesthetically compare what I was
> drawing with how the letters should look. For me, drawing deals with
> how things look. Writing deals with what things mean. Good
> handwriting is drawing the alphabet while writing your thoughts. You
> can not draw a sentence.
The characterization of a writing system is that you write it, not draw
it. I know Valerie has made that point over and over to us on the
list. And believe it or not, "write" as a technical term is important to
help people know that we are not just drawing pictographs to
communicate via writing. The image in the mind of the average person is
that we do not draw our letters. We write our letters. We are actually
writing our languages just like any other person on the planet writes
their language with the symbols that they use for their writing
systems. I prefer not to use the term "draw" because again it makes a
distinction between SignWriting and other writing systems and makes
SignWriting appear not to be a valid writing system. You can argue that
we draw the characters as we write them, but that distinction is lost on
the average person who thinks about writing as a linguistic activity
and drawing as a non-linguistic activity.
Perhaps it is a fine distinction philosophically, but it is a
significant one to me.
Our discussion helps me think through these issues, so thank you!