[Lexicog] preparing Shoebox lexicons for publication/ export
List Facilitator
lexicography2004 at YAHOO.COM
Fri Jan 16 01:33:46 UTC 2004
----- Original Message -----
From: "Kenneth Keyes" <ken_keyes at sil.org>
To: <lexicographylist at yahoogroups.com>
Cc: "Eric and Allison Albright" <eric-allison_albright at sil.org>
Sent: Wednesday, January 14, 2004 9:08 AM
Subject: RE: [Lexicog] preparing Shoebox lexicons for publication/ export
> Dear Mike,
>
> Eric Albright (eric-allison_albright at sil.org) has developed a
> spellchecking engine for Microsoft Word 2000 and above that can be
> compiled from a Unicode word list. Presumably, this engine can be
> applied to a word list from any language, provided that (1) you are
> running Windows 2000 or XP, (2) you are using a valid input locale ID
> for which no spellchecker has yet been developed, e.g. Icelandic, and
> (3) you really have used Unicode to compile the list, not some legacy
> ANSI font.
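
As a rough illustration of the "Unicode word list" step only, here is a minimal Python sketch that collects headwords from standard-format lines. It assumes headwords sit in \lx fields and that the text has already been decoded to Unicode; the function name build_wordlist is hypothetical, and the input format the spellchecking engine actually expects is not described in this thread, so nothing here should be read as that engine's interface.

```python
import unicodedata

def build_wordlist(lines, marker="lx"):
    """Collect unique headwords from standard-format lines, sorted.

    Assumption: each headword is on one line of the form "\\lx word".
    NFC normalization keeps composed and decomposed spellings of the
    same word from appearing as two separate entries.
    """
    prefix = "\\" + marker + " "
    words = set()
    for line in lines:
        if line.startswith(prefix):
            word = unicodedata.normalize("NFC", line[len(prefix):].strip())
            if word:
                words.add(word)
    return sorted(words)
```

The resulting list could then be written out in whatever Unicode encoding the target tool requires.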
>
> We are currently using it in our language, but it has some bugs.
>
> Ken
> -----Original Message-----
> From: mcswell2001 [mailto:maxwell at ldc.upenn.edu]
> Sent: Wednesday, January 14, 2004 1:46 AM
> To: lexicographylist at yahoogroups.com
> Subject: [Lexicog] preparing Shoebox lexicons for publication/ export
>
> I'm a former SIL member, now working at the Linguistic Data
> Consortium at the University of Pennsylvania. Over the years, I've
> preferred to work on grammars--but dictionaries keep winding up on my
> desk instead.
>
> There are two projects here at LDC using Shoebox to compile lexicons,
> and as a consultant on these projects I was again reminded of how
> inconsistent users can be. Some of the inconsistencies could have
> been cleared up by the use of features in Shoebox (v5) such as range
> sets. But there are other sorts of problems that Shoebox (and so far
> as I know, Toolbox) doesn't help with. One of these is spell
> correction; so I wrote an import-to-Word macro which, given the
> correspondence between fields and language, automatically assigns a
> language to each region of text, so that Word's spelling correctors
> can be applied. (I hasten to add that as soon as one has done the
> spelling correction, the file is brought back into Shoebox. No
> dictionary compilation inside Word!)
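
The field-to-language assignment described above can be sketched in a few lines of Python. This is not the Word macro itself, only an illustration of the lookup step; the marker-to-language table, the language labels, and the name tag_fields are all hypothetical, and continuation lines are ignored for brevity.

```python
import re

# Hypothetical marker-to-language table; each project would supply its own.
FIELD_LANGUAGE = {
    "lx": "vernacular",
    "xv": "vernacular",
    "ge": "English",
    "de": "English",
    "xe": "English",
}

# A standard-format field line: backslash, marker, optional text.
FIELD_RE = re.compile(r"^\\(\w+)\s*(.*)$")

def tag_fields(lines, field_language=FIELD_LANGUAGE, default=None):
    """Return (marker, language, text) for each field line.

    Lines that do not start with a marker (blank or continuation
    lines) are skipped in this sketch.
    """
    tagged = []
    for line in lines:
        m = FIELD_RE.match(line.rstrip("\n"))
        if m is None:
            continue
        marker, text = m.groups()
        tagged.append((marker, field_language.get(marker, default), text))
    return tagged
```

Fields whose marker is missing from the table come back with the default language (None here), which makes it easy to list the fields that still need a language assigned before spell checking.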
>
> Another issue I'm aware of is the ordering of fields. While Shoebox
> allows you to define a correct ordering, it does not enforce it, and
> one can introduce errors in the order. These are perhaps best found
> by exporting to XML, and running an XML validating parser (in
> conjunction with a DTD) over the exported file. There remain issues
> of how to fix the errors (in Shoebox vs. in the XML file).
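
For a quick look at ordering problems without leaving the standard-format file, a much simpler check than DTD validation can be sketched as follows. It assumes a single flat expected order, which is a simplification: real entries nest (for example, repeated senses), and that nesting is exactly what a DTD can express and this sketch cannot. The order list and the name order_errors are hypothetical.

```python
# Hypothetical flat field order for one record; real databases define their own.
EXPECTED_ORDER = ["lx", "ps", "ge", "de", "xv", "xe", "dt"]

def order_errors(markers, expected=EXPECTED_ORDER):
    """Return (position, marker, previous_marker) for each field that
    appears after a field it should have preceded.

    Markers not listed in `expected` are ignored in this sketch.
    """
    rank = {m: i for i, m in enumerate(expected)}
    errors = []
    last_rank, last_marker = -1, None
    for pos, marker in enumerate(markers):
        if marker not in rank:
            continue
        if rank[marker] < last_rank:
            errors.append((pos, marker, last_marker))
        else:
            last_rank, last_marker = rank[marker], marker
    return errors
```

For a record whose markers are lx, ge, ps, de, the check reports that ps (at position 2) follows ge when it should come before it.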
>
> Another question is how to do cross-language comparison of lexicons,
> given that the fields different lexicographers use may not
> correspond. If everyone used MDF, that wouldn't be a problem, but
> that's sort of like saying that if everyone used English, there
> wouldn't be any problems. (There was some discussion of this in the
> EMELD '02 workshop.)
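
One way to picture the correspondence problem is as a per-lexicon table that renames local markers to a shared set before comparison. The sketch below assumes such a table has been written by hand for each lexicon; the local marker names, the shared names, and normalize_record are hypothetical, and a one-to-one renaming will not cover cases where one lexicographer's field splits into several of another's.

```python
def normalize_record(record, marker_map):
    """Rename markers in one record using a per-lexicon mapping.

    `record` is a list of (marker, value) pairs. Returns the renamed
    fields plus the list of markers that had no mapping, so that gaps
    in the correspondence are visible rather than silently dropped.
    """
    normalized, unmapped = [], []
    for marker, value in record:
        if marker in marker_map:
            normalized.append((marker_map[marker], value))
        else:
            unmapped.append(marker)
    return normalized, unmapped
```

Running this over each lexicon with its own table gives records with comparable field names and a list of the fields that still need a decision.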
>
> I'm thinking of doing a paper for the upcoming LREC workshop on
> minority language documentation, on the topic of how to prepare
> Shoebox lexicons for publication or export, with emphasis on fixing
> problems with (in)consistency. (Before publishing, one might also
> want to look at coverage, agreement between the grammatical
> categories in the lexicon and those in published grammars for the
> language, etc., but this is outside of what I want to cover, as are
> basics of using MDF.) I am aware of work by Ken Zook (of SIL) on
> importing earlier Shoebox lexicons into LinguaLinks, the web pages
> for the KirrKirr project on importing into that tool, and the
> addition of the 'verify interlinear' feature to Toolbox. Beyond
> this, I haven't seen much.
>
> Has anyone seen other sorts of problems for which additional tools or
> checks are needed in order to ensure consistency (or more generally,
> quality) in Shoebox lexicons? I'll be happy to cite you, should this
> idea turn into a real (and accepted) paper.
>
> Mike Maxwell
>
>
>
>
> Yahoo! Groups Links
>
> To visit your group on the web, go to:
> http://groups.yahoo.com/group/lexicographylist/
>
> To unsubscribe from this group, send an email to:
> lexicographylist-unsubscribe at yahoogroups.com
>
> Your use of Yahoo! Groups is subject to:
> http://docs.yahoo.com/info/terms/