Data exchange with SignPuddle Markup Language

Charles Butler chazzer3332000 at YAHOO.COM
Thu May 27 15:47:34 UTC 2010


How is Hangul (Korean) encoded?  I thought it was spatial characters merged to look like graphics, not a corpus of words.  There are only 20 letters in Korean, yet it does print looking like ideographs.


 



________________________________
From: Steve Slevinski <slevin at SIGNPUDDLE.NET>
To: SW-L at LISTSERV.VALENCIACC.EDU
Sent: Thu, May 27, 2010 11:10:44 AM
Subject: Re: Data exchange with SignPuddle Markup Language

MARIA AZZOPARDI wrote:
> Dear Steve, Val and all the list,
> I attended LREC 2010 and I must say I was slightly disappointed at the
> very low use of SignWriting in Computer Sign Language linguistics. There
> were some researchers that told me they considered SignWriting, but opted
> for HamNoSys. It would be ideal if SignWriting were used, I thought, but I
> probably can't understand the technicalities, as computers are not my
> area.
> Could you explain why the situation is so?
>  

Hi Maria,

The 2 main reasons for the low use of SignWriting in Computer Sign Language linguistics are conceptual and technological. Conceptually, SignWriting requires accepting a new paradigm, while HamNoSys is much more comfortable. Technologically, SignWriting presents unique challenges.

I'll try to explain how SignWriting is different and why there is a technology gap. Some of the details are simplified.


Currently, in the computer world, there are 2 main types of script: one based on letters and the other based on pictographs. Both use a sequential list of characters, either "ABC" or "儷黑". "Character" is a very technical term that has many definitions, but simply put, a character is a number that can represent a letter or a pictograph. The letters "ABC" are sent by computer as the numbers 65, 66, and 67. The pictographs "儷黑" are likewise sent as numbers, one per character: in Unicode, 20791 and 40657.
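
As a rough illustration of the point that a character is just a number, here is a short Python sketch. Python and Unicode code points are assumptions made for this example; the message itself does not name a programming language or an encoding.

```python
def code_points(text):
    """Return the Unicode code point (an integer) for each character in text."""
    return [ord(ch) for ch in text]


def from_code_points(numbers):
    """Rebuild the text from a sequential list of code points."""
    return "".join(chr(n) for n in numbers)


latin = code_points("ABC")   # letters
han = code_points("儷黑")     # pictographs, one number per character

print(latin)
print(han)
print(from_code_points(latin + han))
```

Either way, the text travels as a flat, ordered list of numbers, and nothing in that list says where on a page a character sits.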


Now the question becomes how to encode SignWriting. For the current technologies, the easiest way forward would be to label SignWriting as pictographic and analyze the corpus of each individual sign language. We could define a list of 20,000 signs for ASL, stamp it as final, and then create a font file (as is done for Chinese) that could display those 20,000 signs. However, this list would be fixed, so adding new signs would be laborious. And if this were done for all of the world's sign languages, we would quickly run out of numbers for characters.
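
To put a rough number on "running out", here is a back-of-the-envelope Python sketch. It assumes the "numbers for characters" are Unicode code points (1,114,112 in total) and reuses the 20,000-signs-per-language figure from above; both are illustrative assumptions, not measurements.

```python
# Total size of the Unicode code space: U+0000 through U+10FFFF.
UNICODE_CODE_SPACE = 0x10FFFF + 1

# Illustrative figure taken from the paragraph above.
SIGNS_PER_LANGUAGE = 20_000


def languages_that_fit(code_space=UNICODE_CODE_SPACE, signs=SIGNS_PER_LANGUAGE):
    """How many per-language sign lists fit if every sign gets its own number."""
    return code_space // signs


print(languages_that_fit())
```

Under these assumptions only 55 sign languages would fit, and that is with the entire code space emptied of every other script.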


When I look at SignWriting, I don't see pictographs; I see symbols on a two-dimensional canvas. Current technology cannot place characters in two-dimensional space, only in a sequential list. This spatial nature of SignWriting is where the breakdown happens.


Our current technique for SignWriting is to encode the script and not the individual languages. Once we encode the SignWriting script, we can write any sign language. Since the idea of a spatial script is outside of the current model, we are making our own model. The current SignWriting model is a collaboration between Valerie and myself.

The ISWA 2010 defines the alphabet (graphemes) of the script. An X,Y coordinate based writing style is used to combine the symbols to form signs. Binary SignWriting is the character encoding model that transforms the abstract symbols, structural markers, and numbers into a sequential list of characters.
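
The general idea of turning symbols placed at X,Y coordinates into a sequential list can be sketched in Python as follows. This is only a toy model: the symbol numbers, the marker values, and the layout of the list are all hypothetical and are not the actual Binary SignWriting format or real ISWA 2010 symbol codes.

```python
from typing import List, NamedTuple


class PlacedSymbol(NamedTuple):
    symbol_id: int  # hypothetical number standing in for an ISWA 2010 symbol
    x: int          # horizontal position on the sign's canvas
    y: int          # vertical position on the sign's canvas


# Hypothetical structural markers; they are recognised by position only
# (first and last item), so they cannot be confused with coordinates.
SIGN_START = -1
SIGN_END = -2


def serialize(sign: List[PlacedSymbol]) -> List[int]:
    """Flatten a 2D sign into one sequential list of numbers."""
    out = [SIGN_START]
    for placed in sign:
        out.extend([placed.symbol_id, placed.x, placed.y])
    out.append(SIGN_END)
    return out


def deserialize(seq: List[int]) -> List[PlacedSymbol]:
    """Rebuild the 2D sign from the sequential list."""
    if len(seq) < 2 or seq[0] != SIGN_START or seq[-1] != SIGN_END:
        raise ValueError("missing structural markers")
    body = seq[1:-1]
    if len(body) % 3 != 0:
        raise ValueError("each symbol needs an id, an x, and a y")
    return [PlacedSymbol(*body[i:i + 3]) for i in range(0, len(body), 3)]


# A made-up sign: two symbols placed on the canvas.
sign = [PlacedSymbol(101, 10, 20), PlacedSymbol(202, 35, 5)]
flat = serialize(sign)
print(flat)
print(deserialize(flat) == sign)
```

The point of the sketch is only that the spatial layout survives the round trip: the positions are carried inside the sequential list as numbers, between structural markers.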


These developments represent the open standards of SignWriting. These standards were recently finalized and stabilized. With a 10-year freeze on these standards, I believe we are ready for widespread adoption.

As we overcome the technological barriers, the conceptual barriers will drop as well. I'm predicting an explosion of acceptance for SignWriting. With all that we've done so far, I know we're ready.

Regards,
-Steve