Signbox size and coordinate strings

Jonathan duncanjonathan at YAHOO.CA
Fri Oct 7 17:03:26 UTC 2011


Hi Steve,
     I don't remember why you want to use a string in the XML file for 
the signs.  Wouldn't building everything out of XML be easier to work 
with?  Many libraries can parse XML back into objects, or save it to a 
database for calculations and searches.  My feeling is that XML and 
what's in it should be primarily for transporting data.  If you want to 
do special searches, you should save the information to a database or 
use an in-memory representation of the data.
     In my personal opinion, information that is one piece in itself 
shouldn't be concatenated with other data so that special parsing is 
needed to get a specific part of it.  It's like when people put several 
bits of information into one cell in Excel.  Cell B5="15 gal."  Now try 
to multiply Cell B5 by 6.  You can't.  You need to keep things 
separate.  B5=15 , B6 = "gal." , B7 = B5 * 6
Then if you want to have a nice user output, it's fine to merge it 
all together like B9= "The total is " & B7 & " " & B6
So I don't really like the 6 digits you are proposing below.  But if we 
are going to have to parse it, then at least make it easy to distinguish 
the parts.  I think that if you are going to keep the string notation, 
then maybe the information should be enclosed within identifying 
symbols.  Something like

for the coordinates (41,60), (-18,-18) and  (11,-23)
Currently
<term>A????????????M41x60???n18xn18???11xn23</term>

Could become
<term>A????????????M(41,60)???(-18,-18)???(11,-23)</term>

You would have to search for the opening and closing parenthesis, then 
split on the comma.
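
A rough sketch of that parsing in Python, just to show how little work it 
is.  The function name is my own invention, and the letters between the 
coordinates in the sample are placeholders for the real symbol characters:

```python
def parse_pairs(term):
    """Collect every "(x,y)" pair from a term string.

    Finds each opening parenthesis, then its closing parenthesis,
    and splits the text in between on the comma.
    """
    pairs = []
    pos = 0
    while True:
        start = term.find("(", pos)
        if start == -1:
            break  # no more coordinates
        end = term.find(")", start)
        if end == -1:
            break  # unbalanced parenthesis: stop rather than guess
        x_text, y_text = term[start + 1:end].split(",")
        pairs.append((int(x_text), int(y_text)))
        pos = end + 1
    return pairs


# "a", "b", "c" stand in for symbol characters (placeholders only).
sample = "AabcMa(41,60)b(-18,-18)c(11,-23)"
print(parse_pairs(sample))
```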

What about C for coordinate, then the X or Y value + 500 to get the 
Unicode code point value?  One Unicode character for X and one for Y.
Could become

<term>A????????????MC?0??C?????C ?</term>

Or use 1000 Unicode code points from another part of Unicode to represent 
numbers from -500 to +499, just like you've done for the palm facings 
and rotations.  That way you could get rid of the C altogether.  And 
each piece of information is stored individually in its own Unicode 
character.
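
To make the idea concrete, here is a minimal sketch in Python.  The base 
code point is purely an assumption on my part: I picked U+E000 (the start 
of the Private Use Area) only so the example runs, not because it is the 
right block to use.  The function names are hypothetical too.

```python
# Assumption: BASE is a hypothetical start of a 1000-code-point block.
# U+E000 (Private Use Area) is used here only for illustration.
BASE = 0xE000
OFFSET = 500  # shifts -500..+499 into 0..999


def encode_value(value):
    """Map one axis value (-500..+499) to a single character."""
    if not -500 <= value <= 499:
        raise ValueError(f"value out of range: {value}")
    return chr(BASE + value + OFFSET)


def decode_value(ch):
    """Map a single character back to its axis value."""
    return ord(ch) - BASE - OFFSET


def encode_point(x, y):
    """One character for X followed by one character for Y."""
    return encode_value(x) + encode_value(y)


def decode_point(two_chars):
    """Inverse of encode_point."""
    return decode_value(two_chars[0]), decode_value(two_chars[1])


encoded = encode_point(41, 60)
print(len(encoded), decode_point(encoded))
```

Each coordinate then costs exactly two characters, and no splitting or 
digit parsing is needed to read it back.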

If you do go with what is below, I can make it work for my program.  I 
don't have any issues with the new limit of -500 to +499 on both axes.

These are my 2 cents' worth.
I am interested in your thoughts or comments on the above.

Jonathan

On 06/10/2011 9:58 AM, Steve Slevinski wrote:
> Hi list,
>
> Here is my current design and a technical discussion.  Any feedback is 
> appreciated.  Please ignore if you don't want to peek under the hood.
>
> Background material:
> =============
> 1) Regular Expressions
> http://en.wikipedia.org/wiki/Regular_expression
>
> 2) Cartesian Coordinates.
> http://en.wikipedia.org/wiki/Cartesian_coordinates
>
> =============
>
> I use Cartesian Coordinates for the SignPuddle data.  We start with a 
> 2-dimensional canvas.  Both the width and the height are divided into 
> specific points to create a grid.  The center of the grid is point 
> (0,0).  The horizontal position is called the X value.  The vertical 
> position is called the Y value.
>
>          -y|
>            |
>            |
>            |
> -x         |          +x
> -----------+------------
>            |
>            |
>            |
>            |
>          +y|
>
>
>
> In my current design, the x and y values are unlimited.  Negative to 
> the top-left.  Positive to the bottom-right.
>
> In general, the challenge I face is to create a string that represents 
> a specific coordinate.  My current string has the form "n100x100" for 
> the coordinate (-100,100).  Simply replace the "-" minus sign with an 
> "n" and replace the "," comma with an "x".  The purpose of these 
> replacements is to enable double click selection.  The "n" and the "x" 
> continue the string without a character that creates a gap.
>
> Regular Expressions allow for efficient searching and pattern 
> matching.  Regular expressions are simple and powerful when used 
> correctly.  They can easily become overly complex and difficult to 
> understand.
>
> The current coordinate characters can be described with the regular 
> expression pattern:
> "n?[0-9]+xn?[0-9]+"
>
> This can be understood in parts.
>
> n? , may or may not have an "n"
>
> [0-9] , select one value between 0 and 9.
>
> [0-9]+ , select one or more digits
>
> x , match the character "x"
>
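
[Inline sketch: one way the pattern above could be used in Python to turn 
a coordinate string back into numbers.  The capture groups and the 
function name are additions for illustration and are not part of the 
quoted design.]

```python
import re

# Same pattern as quoted above, with groups added around each part.
COORD = re.compile(r"(n?)([0-9]+)x(n?)([0-9]+)")


def parse_coord(text):
    """Convert a string such as "n100x100" into an (x, y) tuple."""
    match = COORD.fullmatch(text)
    if match is None:
        raise ValueError(f"not a coordinate string: {text!r}")
    x_neg, x_digits, y_neg, y_digits = match.groups()
    # A leading "n" marks a negative value.
    x = -int(x_digits) if x_neg else int(x_digits)
    y = -int(y_digits) if y_neg else int(y_digits)
    return x, y


print(parse_coord("n100x100"))
```
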
> I've run into the problem that general searching is inefficient or 
> slow.  This is due to Unicode and the current form of the coordinate 
> value.  More accurate searching is forcing me to use overly complex 
> regular expression features, like negative lookahead.
>
> I think I need to change the form of my coordinates so that searching 
> is efficient and accurate.  I am considering a new form of coordinate 
> string that is a simple value 6 digits long.
>
> The pattern can be described as "[0-9]{6}".   Understood in parts as:
>
> [0-9] , select one value between 0 and 9.
> [0-9]{6} , select six values between 0 and 9.
>
> I will limit both the X and Y axes to the values -500 to +499.  The 
> center is still (0,0).
>
> Here is the coordinate string for (0,0): "500500".  The string is 
> divided in half.  The first 3 digits are for the X value and the last 
> 3 digits are used for the Y value.  To decode, simply subtract 500 
> from each 3-digit value.  To go in the reverse, simply add 500 to each 
> value and combine the X and Y values.  For example, the coordinate 
> (111,111) would have a string of "611611" and the coordinate (-15,-20) 
> would have the string "485480".
>
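
[Inline sketch: the six-digit scheme described above, written out in 
Python as a check on the arithmetic.  The function names are hypothetical 
and the range check is an assumption based on the stated -500 to +499 
limit.]

```python
import re

SIX_DIGITS = re.compile(r"[0-9]{6}")


def encode(x, y):
    """Add 500 to each value and join them as two 3-digit fields."""
    for value in (x, y):
        if not -500 <= value <= 499:
            raise ValueError(f"value out of range: {value}")
    return f"{x + 500:03d}{y + 500:03d}"


def decode(text):
    """Split the string in half and subtract 500 from each half."""
    if SIX_DIGITS.fullmatch(text) is None:
        raise ValueError(f"not a six-digit coordinate: {text!r}")
    return int(text[:3]) - 500, int(text[3:]) - 500


print(encode(0, 0), encode(111, 111), encode(-15, -20))
print(decode("485480"))
```
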
> Depending on speed experiments, I may duplicate the SignPuddle XML 
> files with ASCII rather than the Preliminary Unicode.  Large files 
> have a lot of wasted overhead processing UTF-8 and Unicode values.
>
> Thoughts? Opinions?
> -Steve
>
>

-- 

email: duncanjonathan at yahoo.ca
joyoduncan at gmail.com
Cel: 9983-1204
Tel: 2213-5285
Skype: yojoduncan

SignWriter Studio <http://www.signwriterstudio.com/>
