[sw-l] Sign Variants via SWML

Sandy Fleming sandy at FLEIMIN.DEMON.CO.UK
Mon Oct 18 19:06:34 UTC 2004


Hi Bill, Dan, Antônio Carlos et al!

Bill, I think we need to be careful not to worry too much about the glosses,
as they're not actually part of the language being defined by the dictionary
(I agree that "vocabulary" is a more appropriate term). They're more like
useful pegs to relate the dictionary language with some oral language the
users may be familiar with.

Bill wrote:

>> I believe the problem with defining the collection of signs as a
"dictionary" is that we tend to attribute the values of a dictionary to it.
In truth, however, what is being collected is not a dictionary as much as it
is a vocabulary.  If we would to expand the concept of the dictionary...<<

While this is worthwhile work I don't think it has much to do with
dictionaries as we need them (or "vocabularies" as you say). We want
dictionaries for data input purposes, more like the word processor
spellchecking dictionaries, which are indeed just vocabularies. The
"variants" idea is rather like the "stemming" features of some word
processors (which enables them to tell that "walk, walks, walked, walker,
walking" are all, in a sense, the same word).

Dan wrote:

>> This of course assumes that sacrificing storage for more optimal searches
is acceptable. <<

I think it is acceptable. XML files for serious applications always get
extremely large anyway, but they compress well and this is the answer to the
storage problem.
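
As a rough illustration of the compression point, here is a small Python
sketch. The markup in it is invented for the example (it is not real SWML),
and it is far more repetitive than a real dictionary would be, so the ratio
it shows is only indicative.

```python
import gzip

# Invented, highly repetitive markup standing in for a large XML dictionary.
# Real SWML files are less uniform, so expect a less dramatic ratio.
entry = b'<sign gloss="GIVE"><param name="direction">away</param></sign>\n'
xml_bytes = b"<dictionary>\n" + entry * 1000 + b"</dictionary>\n"

compressed = gzip.compress(xml_bytes)
restored = gzip.decompress(compressed)

print(len(xml_bytes), "bytes raw,", len(compressed), "bytes compressed")
```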

I think the stuff you're suggesting is more like taking SWML to the next
level, as Antônio Carlos was speaking about before, while what I was
suggesting was keeping SWML at a simple level, but taking advantage of a
natural feature of sign languages - the way they vary the meanings of a sign
in a transparent way by holding most things constant and varying a single
(is it always single?) parameter. When entering a sign into a dictionary we're
often conscious that there are other straightforward variants of the sign
but there's the problem of filling the dictionary up with variants that the
user has to choose from with little guidance other than the glosses, which
don't really map well to the variants. The variants feature I'm suggesting
is intended as a simple, direct and natural answer to this problem, for
users and dictionary editors alike. As I see it, it's not a case of "Let's
add more features", it's more like "We might have missed something
essential".  :)

Antônio Carlos, in response to Dan, wrote:

>> I feel that his observation is accurate. Searching for signs with
variants will probably have to be dealt with mainly in the application
program, outside the DOM. That is, first doing a generic search
disregarding variants in the DOM, then filtering variants outside the
DOM. <<

Firstly, to Dan, one of the main things about variants is it's easy to
generate all the signs for each variant, from the variant description. Don't
you think it might be possible to write a set of XSLT templates to generate
the "full" form of the dictionary from the "variants" form?
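
A minimal sketch of that expansion, written in Python rather than XSLT just
to keep it short. The element names (sign, param, variant) and the sample
entry are assumptions made up for illustration; they are not real SWML, nor
necessarily the variants markup proposed earlier in this thread.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical "variants" form: each <param> lists the values that one
# parameter of the sign may take. Not real SWML.
VARIANTS_FORM = """
<dictionary>
  <sign gloss="GIVE">
    <param name="handshape"><variant>flat</variant></param>
    <param name="direction">
      <variant>away</variant>
      <variant>toward</variant>
    </param>
  </sign>
</dictionary>
"""

def expand(dictionary):
    """Build the "full" form: one <sign> per combination of variants."""
    full = ET.Element("dictionary")
    for sign in dictionary.findall("sign"):
        params = sign.findall("param")
        names = [p.get("name") for p in params]
        choices = [[v.text for v in p.findall("variant")] for p in params]
        # The Cartesian product covers one varying parameter or several.
        for combo in itertools.product(*choices):
            out = ET.SubElement(full, "sign", {"gloss": sign.get("gloss")})
            for name, value in zip(names, combo):
                ET.SubElement(out, "param", {"name": name}).text = value
    return full

full_form = expand(ET.fromstring(VARIANTS_FORM))
print(ET.tostring(full_form, encoding="unicode"))
```

An XSLT version would need the same ingredients: a template per sign that
iterates over the listed variants and emits one complete sign for each.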

I'm not sure that there need ever be such a thing as being "outside the DOM"
when it comes to applications. The latest versions of Word, for example, use
XML, but the DOM provides a large number of collections like sentences,
words, characters, tables and so forth that I'm pretty sure can't really be
expressed in the XML files that the Word documents are stored in.
Applications can add anything they need to the structure of their DOM,
whether actually stored in the data structure or just provided as
subroutines emulating the structure. So we could have variants in the XML as
I've described; that doesn't mean the programmer can't have them expanded out
into multiple-sign structures for searching.

At the same time, I'm not sure there is any problem with searching, even in
variant forms. Presumably the user has supplied some search criteria, and if
they match any variant of a sign, they actually will match the sign in its
variant form, because each variant of each feature is actually present. The
matched variant can then be extracted and presented to the user in the same
way as the application would present any variant to the user.
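
A sketch of how searching the variant form directly might look, again in
Python and again with made-up element names (sign, param, variant) and sample
entries that are not real SWML. A sign matches when every search criterion is
among the variants listed for that parameter; the matched variant is then
extracted by pinning those parameters to the requested values.

```python
import xml.etree.ElementTree as ET

# Hypothetical "variants" markup, not real SWML.
VARIANTS_FORM = """
<dictionary>
  <sign gloss="GIVE">
    <param name="handshape"><variant>flat</variant></param>
    <param name="direction">
      <variant>away</variant>
      <variant>toward</variant>
    </param>
  </sign>
  <sign gloss="LOOK">
    <param name="handshape"><variant>v</variant></param>
    <param name="direction">
      <variant>away</variant>
      <variant>up</variant>
    </param>
  </sign>
</dictionary>
"""

def search(dictionary, criteria):
    """Yield (gloss, matched) for each sign with a variant meeting criteria."""
    for sign in dictionary.findall("sign"):
        available = {
            p.get("name"): [v.text for v in p.findall("variant")]
            for p in sign.findall("param")
        }
        if all(value in available.get(name, [])
               for name, value in criteria.items()):
            # Pin the searched parameters; leave the others as listed.
            matched = {
                name: [criteria[name]] if name in criteria else values
                for name, values in available.items()
            }
            yield sign.get("gloss"), matched

root = ET.fromstring(VARIANTS_FORM)
hits = list(search(root, {"direction": "toward"}))
misses = list(search(root, {"direction": "sideways"}))
print(hits)
```

No expansion into the "full" form is needed here, because every value a
parameter can take is already present in the variant entry.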

But in summary, remember that the idea here is to represent an actual
phenomenon that arises when building vocabularies in sign languages, because
of a difference in the nature of sign languages when compared to oral
languages, in that oral languages are largely just codification, whereas
sign languages are broadly iconic. The variants in the suggested SWML just
express the way iconicity is used to generate variants in the actual
language.

Sandy


