Data exchange with SignPuddle Markup Language

Steve Slevinski slevin at SIGNPUDDLE.NET
Fri May 28 13:45:40 UTC 2010


On May 27, 2010, at 6:24 PM, Jonathan y Yolaine wrote:

> > 
> > Hi Steve,
> >     I like how you made the item generic both for BSW and for other languages.  It makes for a flexible structure that would even support sign languages back to back!  Also this way it would support unlimited terms or glosses for each entry.
>   

Hi Jonathan,

I'm glad you like SPML.  I still need to make a few tweaks, but I like it 
too.  One thing I really like is the ability to go from markup to 
database.  I created an SPML to SQL script.  Next I'll need to reverse 
it so I can go from SQL to SPML.  You can see both the SPML and the SQL 
for the English/ASL User Interface:
http://signbank.org/signpuddle2/data/ui/1.spml
http://signbank.org/signpuddle2/data/ui/1.sql
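
To give a rough idea of the markup-to-database direction, here is a minimal Python sketch.  It is not my actual script: the table layout, the column names, and the load_spml function are all assumptions made up for illustration, and the real 1.sql may be laid out differently.  Only the element and attribute names come from SPML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Trimmed SPML sample; the <spml> wrapper holds the entries.
SPML = """<spml>
  <entry e_id="2" cdt="1172438830" mdt="1173731285">
    <item i_id="4" lang="en" cdt="1172438830" mdt="1173731285">
      <term t_id="4"><![CDATA[User Interface]]></term>
    </item>
  </entry>
</spml>"""

# Hypothetical schema: one table per SPML element.
SCHEMA = """
CREATE TABLE entry (e_id INTEGER PRIMARY KEY, cdt INTEGER, mdt INTEGER);
CREATE TABLE item (i_id INTEGER PRIMARY KEY, e_id INTEGER, lang TEXT,
                   cdt INTEGER, mdt INTEGER);
CREATE TABLE term (t_id INTEGER PRIMARY KEY, i_id INTEGER, value TEXT);
"""


def load_spml(conn, spml_text):
    """Walk entry > item > term and insert one row per element."""
    root = ET.fromstring(spml_text)
    for entry in root.iter("entry"):
        e_id = int(entry.get("e_id"))
        conn.execute(
            "INSERT INTO entry VALUES (?, ?, ?)",
            (e_id, int(entry.get("cdt")), int(entry.get("mdt"))),
        )
        for item in entry.findall("item"):
            i_id = int(item.get("i_id"))
            conn.execute(
                "INSERT INTO item VALUES (?, ?, ?, ?, ?)",
                (i_id, e_id, item.get("lang"),
                 int(item.get("cdt")), int(item.get("mdt"))),
            )
            for term in item.findall("term"):
                conn.execute(
                    "INSERT INTO term VALUES (?, ?, ?)",
                    (int(term.get("t_id")), i_id, term.text),
                )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
load_spml(conn, SPML)
rows = conn.execute(
    "SELECT lang, value FROM item JOIN term USING (i_id)"
).fetchall()
```

Going the other way, from SQL to SPML, would mean reading the same three tables and nesting the rows back into elements.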

> > 
> >     Is SPML an exchange format just for the dictionaries or also for the literature?
>   
The SPML exchange format will be used for both.  The <entry> element can 
have 3 literature oriented attributes: top, previous, and next.  These 
provide the linking between entries and will be used for literature with 
multiple pages.  Eventually, the markup will probably need to be 
customized to differentiate dictionaries from literature.

> > 
> > There are 5 things that I would like to suggest.
> > 1. Add a globally unique ID (GUID) for each entry so that it will be easy to check whether we are exchanging an entry that we already have.  The "entry id" is good for the SignPuddle database, but other programs may not have the same id in their database.  The GUID lets you detect whether we are talking about the same entry or not.
>   
A global unique ID may be useful and can be constructed right now.  
There are 2 main types of puddles: ui and sgn.  Each puddle has a unique 
ID.  So entry 7 in the user interface puddle number 1 would have the 
unique ID of "ui.1.7".  I need to add the puddle type and id as 
attributes to the <spml> tag.
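
As a minimal sketch of that construction in Python, assuming the three parts are simply joined with dots as in "ui.1.7" (the function names here are hypothetical, not part of SignPuddle):

```python
def global_entry_id(puddle_type: str, puddle_id: int, entry_id: int) -> str:
    """Join puddle type, puddle ID, and entry ID into one dotted identifier."""
    # Only the two main puddle types named above are accepted in this sketch.
    if puddle_type not in ("ui", "sgn"):
        raise ValueError(f"unknown puddle type: {puddle_type!r}")
    return f"{puddle_type}.{puddle_id}.{entry_id}"


def parse_global_entry_id(gid: str) -> tuple[str, int, int]:
    """Split a dotted identifier back into its three parts."""
    puddle_type, puddle_id, entry_id = gid.split(".")
    return puddle_type, int(puddle_id), int(entry_id)
```

With this, entry 7 of user interface puddle 1 gives global_entry_id("ui", 1, 7), and the ID can be split again when an entry arrives from another program.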

> > 2. Add a time stamp to each entry, so that we can compare two copies of the same sign and decide whether to update the one we have, or change nothing if the incoming sign is older than or the same as the one we already have.
>   
This is already there.  Entries and items each have 2 dates: cdt (created) 
and mdt (modified).  Both are stored as Unix time stamps.

<entry e_id="2" cdt="1172438830" mdt="1173731285">
  <item i_id="3" lang="ase" cdt="1172438830" mdt="1173731285">
    <term t_id="3" index="29e062e062e070c0ed40">0080ed46f9caf9bb29f0f9e1f9e062e0f9c4f9ea62e0f9d4f9fb70c1f9d0f9d9</term>
  </item>
  <item i_id="4" lang="en" cdt="1172438830" mdt="1173731285">
    <term t_id="4"><![CDATA[User Interface]]></term>
    <text><![CDATA[Contains the text used for the display.]]></text>
  </item>
</entry>
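
To illustrate how the mdt values could drive the update decision, here is a minimal Python sketch.  The function names modified_at and should_update are hypothetical, and the "newer mdt wins" rule is an assumption about how a receiving program might behave, not something SPML prescribes.  The sample is trimmed from the entry above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SAMPLE = """<entry e_id="2" cdt="1172438830" mdt="1173731285">
  <item i_id="4" lang="en" cdt="1172438830" mdt="1173731285">
    <term t_id="4"><![CDATA[User Interface]]></term>
  </item>
</entry>"""


def modified_at(element: ET.Element) -> datetime:
    """Convert the Unix time stamp in mdt to a timezone-aware datetime."""
    return datetime.fromtimestamp(int(element.get("mdt")), tz=timezone.utc)


def should_update(local: ET.Element, incoming: ET.Element) -> bool:
    """Assumed rule: replace the local copy only if the incoming one is newer."""
    return int(incoming.get("mdt")) > int(local.get("mdt"))


local_entry = ET.fromstring(SAMPLE)
# Simulate a copy of the same entry that was modified later elsewhere.
incoming_entry = ET.fromstring(
    SAMPLE.replace('mdt="1173731285"', 'mdt="1173731999"')
)
```

Because items carry their own cdt and mdt, the same comparison could be applied per item rather than per entry.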

> > 3. Add an identifier to the item tag to specify whether it is a BSW string or other foreign string.  Of course you could always use an algorithm to determine this, but when you have a really long list it is much faster to read the value than to decide which it is for each one.
>   
Interesting idea.  I'm still mulling this over.

> > 4. It seems to me that the "src" tag should belong to the entry tag and not to the item tag.  It is repeated many times.  You may have other reasons for this structure which may not be apparent to me at this time.
>   
There are 2 reasons to repeat the source.  First, one person can add an 
ASL sign and another person could add the BSL.  Second, one person could 
add an ASL sign and another person could add a different ASL sign.


> > 5. If you are only going to have one name tag inside the item tag and the item tag doesn't have any other children, then the name tag is redundant.  Though it might add to legibility. 
>   
It is easier to be redundant and keep a simple definition without special 
exceptions.  Also there can be multiple terms, but only one text.  
Without the <term> or <text> identifier, we don't know which one we have.
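
A short Python sketch shows why the explicit tags help a reader of the markup.  The read_item function is hypothetical and only assumes the item layout from the sample earlier in this message: any number of <term> children and at most one <text>.

```python
import xml.etree.ElementTree as ET

ITEM = """<item i_id="4" lang="en" cdt="1172438830" mdt="1173731285">
  <term t_id="4"><![CDATA[User Interface]]></term>
  <text><![CDATA[Contains the text used for the display.]]></text>
</item>"""


def read_item(item: ET.Element) -> dict:
    """Collect every <term> and the single optional <text> of an item."""
    return {
        "lang": item.get("lang"),
        # findall returns all matching children, so multiple terms are kept.
        "terms": [term.text for term in item.findall("term")],
        # findtext returns None when the item has no <text> child.
        "text": item.findtext("text"),
    }


parsed = read_item(ET.fromstring(ITEM))
```

The parser never has to guess from the content whether a string is a term or the longer text.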

> > What do you have in mind for annotating the colors and sizes of the symbols. Would SPML be able to share information about color and size of the symbols from one program to another?
>   
No color or size information is contained in SPML.  Symbols within a 
sign are always the same size.  Color and size should be controlled by 
CSS, but that's a work in progress.

> > 
> > Here is a rough sample of my suggestions:
> > 
> > <entry id="29" guid="cc2e7faa-1d56-4b0c-a7ee-2d2b96241471" timestamp="2009-11-17T06:57:38.4375-06:00"> 
> >     <item lang="ase" type="bsw">00800112f9c4f9d6011af9abf9de6640f9c6f9e570c4f9c6f9f4</item> 
> >     <item lang="en" type="other">cannot</item> 
> >     <item lang="en" type="other">can't</item> 
> >     <src>Valerie Sutton</src> 
> > </entry> 
> > 
>   
Thanks for including an example. 

> > Thank you for all of your hard work on these standards.   They help bring the whole together.
> > 
> > Jonathan
> > 
Thanks for taking the time to consider and comment.

Regards,
-Steve


