[sw-l] Are We Going in the Wrong Direction?

Sandy Fleming sandy at FLEIMIN.DEMON.CO.UK
Fri Dec 10 08:31:56 UTC 2004


Dan wrote:

> If we want to tackle linearity, it's worth looking at where other
> linear systems fail. So SignFont, HamNoSys, Stokoe, and related systems

These fail because they're not visual and don't express all necessary
aspects of sign language. This doesn't apply to SignWriting.

> have a limited number of characters for location, facial expression (in
> the case of SignFont), and movement. However, coercing a 3-D language
> into a 1-D string (a transform of a transform) has its consequences. One is
> that the strings are longer, sometimes torturously so, and another is that
> simultaneity of articulation tends to get lost in the shuffle.

OK, let's make this difficult problem simple!

Firstly, the writing system we're trying to store is 2-D not 3-D, so it's
not as difficult as you might think!

The strings won't be longer. Remember that I once posted to the list a count
of the number of symbols used in signs and this turned out to be comparable
with the characters used in words in oral languages. So if we have the
symbols as fonts it'll come to about the same thing.

But we have to add to that the fact that we have to deal with exact
positioning information somehow. Since I've said we should have punctuation
and spaces, and since punctuation symbols fill the width of a whole column
(or height of a whole row), we can calculate the position of each symbol
from an origin, most naturally the bottom left corner of the previous
punctuation mark or space (or top right, for rows). The Unicode Chinese
character methods are no use for SW, so probably the positioning information
will have to be stored after each character in the file.

So that's it, we've got everything stored as linear strings of characters in
the file, though some of these characters express coordinate positions.
There's no need to worry about the order of symbols within a sign, since the
coordinate positions do this job.
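
To make the storage idea concrete, here is a minimal sketch in Python. It is only an illustration: the single-letter symbol codes, the `<x,y>` notation and the names `Placed`, `encode_sign` and `decode_sign` are all made up for the example and are not part of any SignWriting format.

```python
import re
from typing import NamedTuple


class Placed(NamedTuple):
    """One symbol of a sign plus its offset from the origin (hypothetical layout)."""
    symbol: str  # placeholder one-character code, not a real SignWriting symbol
    x: int
    y: int


# Assumed notation: the symbol, then its coordinates in angle brackets.
_TOKEN = re.compile(r"(\w)<(-?\d+),(-?\d+)>")


def encode_sign(placed):
    """Flatten a sign into a linear string, coordinates after each symbol."""
    return "".join(f"{p.symbol}<{p.x},{p.y}>" for p in placed)


def decode_sign(text):
    """Recover the symbols and their positions from the linear string."""
    return [Placed(m[1], int(m[2]), int(m[3])) for m in _TOKEN.finditer(text)]


sign = [Placed("H", 5, 2), Placed("A", 1, 9)]
stored = encode_sign(sign)
print(stored)
print(decode_sign(stored) == sign)
```

The order of the symbols in the string carries no meaning here; the layout can be rebuilt from the coordinates alone.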

You might think the lack of ordering of symbols within the signs would make
it difficult to do string matching. There are two approaches to solving
this.

One is that we define a standard ordering for characters in a sign - it can
be done by a simple three-column sort on character, x-coord and y-coord.
This would make it possible to do exact searches in the file using search
algorithms in oral-language software.
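
A rough sketch of that standard ordering, in Python. The `(symbol, x, y)` tuples, the `<x,y>` notation and the function names are assumptions made for the example, not an agreed format.

```python
def canonical_order(sign):
    """Three-column sort: by symbol code, then x-coord, then y-coord."""
    return sorted(sign, key=lambda s: (s[0], s[1], s[2]))


def encode(sign):
    """Write a sign as a linear string in the standard order (assumed notation)."""
    return "".join(f"{sym}<{x},{y}>" for sym, x, y in canonical_order(sign))


# The same sign entered with its symbols in two different orders.
a = [("H", 5, 2), ("A", 1, 9), ("A", 1, 3)]
b = [("A", 1, 3), ("H", 5, 2), ("A", 1, 9)]

# Both give the same string, so an ordinary exact substring search finds it.
document = "X<0,0> " + encode(a) + " Y<4,4>"
print(encode(b) in document)
```

Once every sign is written out this way, an exact search needs nothing more than the plain substring search any text editor already has.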

The other is to leave the symbols unordered in the file and write our own
search algorithms. I think this is by far the better way, as SW software
would particularly need fuzzy matching of its own. Such search routines
could, however, also be implemented as macros in existing software that
supports them (in fact, writing a macro application within an existing word
processor might be one way of getting ourselves a good sign processor
quickly).
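
As one possible shape for such a routine, here is a small Python sketch of fuzzy matching on unordered signs. It assumes signs are lists of `(symbol, x, y)` tuples and treats two signs as matching when every symbol in one can be paired with the same symbol in the other within a position tolerance. The name `fuzzy_match` and the `tol` parameter are invented for the example, and the greedy pairing is a simplification that could miss a valid pairing in a crowded sign.

```python
def fuzzy_match(a, b, tol=2):
    """Return True if signs a and b hold the same symbols in nearly the same places.

    Symbol order is ignored. Each symbol in `a` is paired greedily with an
    unused symbol in `b` that has the same code and lies within `tol` units
    on both axes.
    """
    if len(a) != len(b):
        return False
    remaining = list(b)
    for sym, x, y in a:
        for cand in remaining:
            if cand[0] == sym and abs(cand[1] - x) <= tol and abs(cand[2] - y) <= tol:
                remaining.remove(cand)  # each symbol in b may be used only once
                break
        else:
            return False  # nothing in b was close enough to this symbol
    return True


written_one_way = [("A", 1, 3), ("H", 5, 2)]
written_another = [("H", 6, 1), ("A", 0, 3)]
print(fuzzy_match(written_one_way, written_another))
print(fuzzy_match(written_one_way, written_another, tol=0))
```

A real implementation would probably also want to tolerate a missing or substituted symbol, but that is beyond this sketch.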

Even better would be to use both these solutions - the software could store
documents with signs with a standard symbol ordering, and this would make it
possible to compare SW documents for similarity using ordinary differencing
programs, as well as making our matching algorithms simpler.
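
To illustrate the differencing point, here is a short Python sketch using the standard `difflib` module. It assumes a document is a list of signs, each a list of `(symbol, x, y)` tuples, written one sign per line in the standard order. The notation and the function names are hypothetical.

```python
import difflib


def encode(sign):
    """One sign as a string, symbols sorted by symbol, then x, then y."""
    return "".join(f"{sym}<{x},{y}>" for sym, x, y in sorted(sign))


def similarity(doc_a, doc_b):
    """Ratio in [0, 1] of how alike two documents are, compared sign by sign."""
    lines_a = [encode(s) for s in doc_a]
    lines_b = [encode(s) for s in doc_b]
    return difflib.SequenceMatcher(None, lines_a, lines_b).ratio()


doc1 = [[("H", 5, 2), ("A", 1, 3)], [("B", 0, 0)]]
doc2 = [[("A", 1, 3), ("H", 5, 2)], [("B", 0, 0)]]  # same signs, symbols reordered
doc3 = [[("A", 1, 3), ("H", 5, 2)], [("C", 7, 7)]]  # second sign differs

print(similarity(doc1, doc2))
print(similarity(doc1, doc3) < 1.0)
```

Without the standard ordering, `doc1` and `doc2` would look different to a line-based differencing tool even though they contain the same signs.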

So, there it is - it's easy enough, unless you _make_ it difficult!

Sandy


