[Lexicog] Collaborative lexicography software?
Kenneth Keyes
ken_keyes at SIL.ORG
Sun May 4 16:04:29 UTC 2008
Dear Mike et al,
Hi! Please permit a slight diversion from collaboration to touch on mobile
computing. This discussion about collaborative software had me take another
look at "WeSay". The screenshots of WeSay remind me of MobiCreator
(commercial freeware from Amazon), available for PocketPC and Symbian, maybe
even Palm. (I'm hoping that I could compile WeSay for my PocketPC.)
MobiCreator has a simple template for making dictionaries which outputs an
XML file. It then uses that XML file to compile a dictionary file which can
have varying levels of encryption. I'm toying with the idea of modifying the
XML output from FLEx or MDF to plug into their template, instead of
laboriously re-entering by hand all the information that has taken years
to collect.
The point is, we can make the fruits of our labors available on people's
cellphones. Why not? According to the BBC, the fastest growth in cell phone
usage is among the "poorest of the poor". Let's harness this emerging
technology to support language preservation!!! Maybe we could compile WeSay
to work on someone's Symbian OS cellphone. Again, why not?
Which leads me to a collaboration idea. Why not load the backup XML files
exported from different edits of the same project into UltraCompare, use
UltraCompare to merge them, then put the merged XML file into a backup zip
and restore from it? Do you think it might work?
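To make the merge idea concrete, here is a rough sketch of merging two XML
exports entry by entry instead of line by line. Everything about the format
is an assumption made up for illustration (a <lexicon> root, <entry>
elements with an "id" attribute, <form> and <gloss> children); it is not
the actual FLEx backup schema, and the function name merge_lexicons is
hypothetical.

```python
import xml.etree.ElementTree as ET


def entry_fields(entry):
    """Return an entry's child elements as (tag, text) pairs for comparison."""
    return [(child.tag, (child.text or "").strip()) for child in entry]


def merge_lexicons(xml_a, xml_b):
    """Merge two lexicon exports that use the hypothetical format above.

    Entries are matched on their "id" attribute. Entries found in only one
    file are copied over. When both files have the same id with different
    content, the version from xml_a is kept and the id is reported so that
    a person can resolve it.
    """
    root_a = ET.fromstring(xml_a)
    root_b = ET.fromstring(xml_b)
    merged = ET.Element(root_a.tag)
    conflicts = []

    entries_b = {e.get("id"): e for e in root_b.findall("entry")}
    for entry in root_a.findall("entry"):
        entry_id = entry.get("id")
        other = entries_b.pop(entry_id, None)
        if other is not None and entry_fields(other) != entry_fields(entry):
            conflicts.append(entry_id)
        merged.append(entry)

    # Whatever is left in entries_b exists only in the second file.
    for entry in entries_b.values():
        merged.append(entry)
    return merged, conflicts


# Two made-up exports of the same project, edited on different machines.
EDIT_A = (
    '<lexicon>'
    '<entry id="1"><form>form-one</form><gloss>first gloss</gloss></entry>'
    '<entry id="2"><form>form-two</form><gloss>second gloss</gloss></entry>'
    '</lexicon>'
)
EDIT_B = (
    '<lexicon>'
    '<entry id="2"><form>form-two</form><gloss>revised gloss</gloss></entry>'
    '<entry id="3"><form>form-three</form><gloss>third gloss</gloss></entry>'
    '</lexicon>'
)

merged, conflicts = merge_lexicons(EDIT_A, EDIT_B)
merged_ids = [e.get("id") for e in merged.findall("entry")]
```

The point of the sketch is that conflicting edits to the same entry still
need a human decision; a merge tool can find them but cannot settle them.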
An additional collaboration idea comes from someone on the FLEx list. This
person has written about using VNC, a very simple client that allows one to
access an application on a host computer across a network. I think this is
very promising, since it does away with the necessity of having to install
FLEx software on older computers, and works across platforms. If I
understand correctly, I could even access the FLEx project from a VNC
client on my trusty PocketPC!
I hope this gives people some good ideas.
Ken
_____
From: lexicographylist at yahoogroups.com
[mailto:lexicographylist at yahoogroups.com] On Behalf Of maxwell at ldc.upenn.edu
Sent: Saturday, May 03, 2008 12:55 AM
To: lexicographylist at yahoogroups.com
Subject: Re: Re : [Lexicog] Collaborative lexicography software?
Quoting Heather Souter <hsouter at gmail.com>:
> I, too, am very interested in learning about dictionary development
> for languages with complex morphologies. ...
> Any insight into how to create dictionaries that are useful to
> speakers and learners and not only language specialists would be
> especially welcomed!
One "solution" (quote marks explained at the end of this msg) is to
give people a computer program that allows them to look up words
regardless of the inflected form that they type in. For the simple
cases, this can often be done by just looking for a substring of the
typed-in word. For a purely suffixing language, the substring would
begin at the first letter of the typed-in word.
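As a minimal sketch of the simple case for a purely suffixing language: try
successively shorter prefixes of the typed-in word until one matches a
headword. The lexicon, the function name lookup_by_prefix, and the min_stem
cutoff are all invented for illustration.

```python
def lookup_by_prefix(word, lexicon, min_stem=2):
    """Return (headword, gloss) for the longest headword that begins `word`.

    Assumes a purely suffixing language, so the stem always starts at the
    first letter of the typed-in word. Returns None if nothing matches.
    """
    for end in range(len(word), min_stem - 1, -1):
        candidate = word[:end]
        if candidate in lexicon:
            return candidate, lexicon[candidate]
    return None


# Toy lexicon, made up for this example.
LEXICON = {"walk": "to move on foot", "talk": "to speak"}

found = lookup_by_prefix("walking", LEXICON)
missing = lookup_by_prefix("xyz", LEXICON)
```

Here "walking" is trimmed from the right until "walk" is found in the
lexicon; a word with no matching prefix simply returns None.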
Of course, the simple cases are not the ones where people need the most
help. The complex cases--where there is prefixing (or worse, both
prefixing and suffixing), or infixing, or reduplication, or lots of
stem allomorphy--are the ones where people need help, and where the
simple solutions don't work. For these morphologically complex
languages, there needs to be a morphological parser between the user
and the electronic dictionary per se. The parser's job is to remove
all the affixes, undo any stem allomorphy, convert the stem into a
dictionary citation form, and finally look up the citation form in the
actual dictionary.
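The steps just described can be sketched in a few lines. This is only a toy
under stated assumptions, nowhere near a real morphological parser: the
affix lists, the allomorph table, the lexicon, and the function name
parse_and_lookup are all invented for illustration, and infixing and
reduplication are not handled at all.

```python
# Toy data, made up for this example. The empty string means "no affix".
PREFIXES = ["", "re", "un"]
SUFFIXES = ["", "s", "ed", "ing"]
ALLOMORPHS = {"ran": "run", "wrot": "write"}  # stem variant -> citation form
LEXICON = {"run": "to move fast", "write": "to set down text", "do": "to act"}


def parse_and_lookup(word):
    """Return every analysis (prefix, citation form, suffix, gloss) of `word`.

    For each prefix/suffix pair that fits, strip both affixes, map the
    remaining stem to its citation form, and look that form up.
    """
    analyses = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            if not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            citation = ALLOMORPHS.get(stem, stem)  # undo stem allomorphy
            if citation in LEXICON:
                analyses.append((prefix, citation, suffix, LEXICON[citation]))
    return analyses


reran = parse_and_lookup("reran")
undoing = parse_and_lookup("undoing")
```

The function returns a list because a real word can have more than one
analysis; deciding which one to show the user is a separate problem.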
One project that is building such tools in a generic fashion (i.e. in a
way that should be portable to more languages, as opposed to a
proprietary way that just works for French, say), is a Department of
Education funded project at the Linguistic Data Consortium (LDC).
There's an example of how this works (for Arabic) at
http://projects.ldc.upenn.edu/art/reader/source/Al-Kitaab.01. In this
case, the lookup is limited to the text shown there, but a simple
modification would allow the user to type in words to be looked up.
The project is also demonstrating lookup with the same tool on (a
dialect of) Nahuatl, a morphologically complex language of Mexico.
(Disclaimer: I'm a consultant on this project, hence biased :-).)
There are of course other reasons (besides morphology) that make it
hard for people to look up words in dictionaries, such as spelling.
One can imagine inserting a spell corrector between the user and the
electronic dictionary. For morphologically complex languages, such a
spell corrector will almost certainly have to be based on a
morphological parser.
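For the morphologically simple case, the idea of a spell corrector sitting
in front of the dictionary can be sketched with Python's standard difflib
module, which ranks headwords by string similarity to what was typed. The
headword list and the function name suggest_headwords are invented for
illustration. Note that this compares whole strings only, so it knows
nothing about morphology and would not be adequate for the complex
languages discussed above.

```python
import difflib

# Toy headword list, made up for this example.
HEADWORDS = ["walk", "talk", "write", "run"]


def suggest_headwords(typed, headwords=HEADWORDS, limit=3):
    """Return up to `limit` headwords most similar to `typed`, best first.

    An exact match is returned on its own without consulting difflib.
    """
    if typed in headwords:
        return [typed]
    return difflib.get_close_matches(typed, headwords, n=limit, cutoff=0.6)


suggestions = suggest_headwords("wlak")
```

A lookup tool could show these suggestions when the typed-in word is not
found, and let the user pick one.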
And of course my whole long-winded answer presupposes that electronic
dictionaries (and the computers that they run on) are a reasonable
solution for the language speakers. For speakers of languages in
California, that's probably true; for speakers in the Amazon, that may
not be a solution at all.
Mike Maxwell
CASL/ U MD