[Lexicog] preparing Shoebox lexicons for publication/ export
List Facilitator
lexicography2004 at YAHOO.COM
Fri Jan 16 01:31:15 UTC 2004
----- Original Message -----
From: "Ron Moe" <ron_moe at sil.org>
To: <lexicographylist at yahoogroups.com>
Sent: Tuesday, January 13, 2004 5:26 PM
Subject: RE: [Lexicog] preparing Shoebox lexicons for publication/ export
> Concerning fixing ordering problems in Shoebox-- I use Consistent Changes
> (CC) tables to automate the reordering of fields. However there are limits
> to what I can do with them. Essentially I move one field next to (before or
> after) another field. This only works if the second field is present. It
> also wouldn't work with multiple fields within the sense cycle, but I might
> be able to write a table to do it. The CC tables are included in the DDP
> version 2 release.
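
[Editorial illustration] The move-one-field-next-to-another operation described above can be sketched in Python. This is not a CC table and is not taken from the DDP release; the function name, the list-of-pairs record layout, and the sample entry are all assumptions made for illustration. Like the approach described, it does nothing when the anchor field is absent, and it only moves the first occurrence of a field, so it does not handle repeated fields within a sense cycle.

```python
def move_field(record, marker, anchor, before=False):
    """Move the first `marker` field next to the first `anchor` field.

    `record` is a list of (marker, value) pairs for one entry (a
    hypothetical in-memory layout, not Shoebox's own format). If either
    field is missing, a copy of the record is returned unchanged.
    """
    markers = [m for m, _ in record]
    if marker == anchor or marker not in markers or anchor not in markers:
        return list(record)
    src = markers.index(marker)
    field = record[src]
    # Remove the field, then find the anchor in what is left.
    rest = [f for i, f in enumerate(record) if i != src]
    pos = [m for m, _ in rest].index(anchor)
    insert_at = pos if before else pos + 1
    return rest[:insert_at] + [field] + rest[insert_at:]


# Invented sample entry: part of speech wrongly placed after the gloss.
entry = [("\\lx", "kasa"), ("\\ge", "house"), ("\\ps", "n")]
fixed = move_field(entry, "\\ps", "\\ge", before=True)
```

Here `fixed` has the \ps field directly before \ge, and an entry with no \ge field would come back unchanged.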
>
> I can think of three possible methods to ensure consistency. (1) Use a
> program like LinguaLinks which enforces field order. This has the drawback
> of preventing the flexibility needed to handle special problems and needs.
> Does anyone know if LinguaLinks allows any flexibility in field order? (2)
> Use the template feature in Shoebox to automatically enter blank fields into
> the database when you add a new entry. This has the drawback of cluttering
> the entry with empty and unnecessary fields. (3) Use an automated routine to
> add a field to each entry in the database. This is the solution I am using
> in the DDP. It adds the field consistently to every record in the correct
> place. You can then use automated routines to populate the field. If it is
> not possible to automate the process, you can at least populate the field in
> a single pass through the database. This increases consistency of the field
> contents and is more efficient than filling out all the fields one entry at
> a time. Of course this third solution presupposes that you have collected
> the words already. But with the DDP we collect all the words first anyway,
> so no problem.
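
[Editorial illustration] Method (3) above, adding an empty field to every record in a fixed place, might look roughly like the following Python sketch. It is not the DDP routine. It assumes a plain standard-format text file in which each record starts with \lx and each field sits on one line; the function names, the choice of \sd as the new field, and the fallback of appending the field at the end of a record that lacks the anchor are all assumptions.

```python
def marker_of(line):
    """Return the backslash marker at the start of a field line."""
    return line.split(" ", 1)[0]


def split_records(text, record_marker="\\lx"):
    """Split text into records (lists of non-blank lines)."""
    records, current = [], []
    for line in text.splitlines():
        if marker_of(line) == record_marker and current:
            records.append(current)
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append(current)
    return records


def add_field(text, new_marker, after_marker, record_marker="\\lx"):
    """Insert an empty `new_marker` field after `after_marker` in each record.

    Records that already contain the field are left alone. If the anchor
    is missing, the field goes at the end of the record (an assumption).
    """
    out = []
    for rec in split_records(text, record_marker):
        is_entry = marker_of(rec[0]) == record_marker
        has_new = any(marker_of(l) == new_marker for l in rec)
        if is_entry and not has_new:
            idx = next((i for i, l in enumerate(rec)
                        if marker_of(l) == after_marker), len(rec) - 1)
            rec = rec[:idx + 1] + [new_marker] + rec[idx + 1:]
        out.append("\n".join(rec))
    return "\n\n".join(out) + "\n"


# Invented two-entry sample; the second entry has no \ps field.
sample = "\\lx kasa\n\\ps n\n\\ge house\n\n\\lx tano\n\\ge river\n"
result = add_field(sample, "\\sd", "\\ps")
```

Once every record has the field in the same position, filling it in can be done in one pass, as described above.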
>
> I've also exported databases into WORD in order to use its spell checker.
>
> Ron Moe
>
> -----Original Message-----
> From: mcswell2001 [mailto:maxwell at ldc.upenn.edu]
> Sent: Tuesday, January 13, 2004 12:46 PM
> To: lexicographylist at yahoogroups.com
> Subject: [Lexicog] preparing Shoebox lexicons for publication/ export
>
>
> I'm a former SIL member, now working at the Linguistic Data
> Consortium at the University of Pennsylvania. Over the years, I've
> preferred to work on grammars--but dictionaries keep winding up on my
> desk instead.
>
> There are two projects here at LDC using Shoebox to compile lexicons,
> and as a consultant on these projects I was again reminded of how
> inconsistent users can be. Some of the inconsistencies could have
> been cleared up by the use of features in Shoebox (v5) such as range
> sets. But there are other sorts of problems that Shoebox (and so far
> as I know, Toolbox) doesn't help with. One of these is spell
> correction; so I wrote an import-to-Word macro which, given the
> correspondence between fields and language, automatically assigns a
> language to each region of text, so that Word's spelling correctors
> can be applied. (I hasten to add that as soon as one has done the
> spelling correction, the file is brought back into Shoebox. No
> dictionary compilation inside Word!)
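
[Editorial illustration] The core idea of the macro described above, using a field-to-language correspondence to decide which spelling checker applies to which text, can be sketched in Python. This is not the Word macro itself. The function name, the marker-to-language table, and the use of "und" (the ISO 639 code for an undetermined language) as a default are assumptions.

```python
def tag_languages(record, marker_language, default="und"):
    """Attach a language code to each (marker, value) field.

    `marker_language` maps field markers to language codes. Fields whose
    marker is not in the table get `default`.
    """
    return [(m, v, marker_language.get(m, default)) for m, v in record]


# Hypothetical correspondence: English glosses and definitions only.
languages = {"\\ge": "en", "\\de": "en"}
entry = [("\\lx", "kasa"), ("\\ge", "house")]
tagged = tag_languages(entry, languages)
```

Each tagged field could then be routed to the spelling checker for its language, with untagged fields skipped.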
>
> Another issue I'm aware of is the ordering of fields. While Shoebox
> allows you to define a correct ordering, it does not enforce it, and
> one can introduce errors in the order. These are perhaps best found
> by exporting to XML, and running an XML validating parser (in
> conjunction with a DTD) over the exported file. There remain issues
> of how to fix the errors (in Shoebox vs. in the XML file).
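
[Editorial illustration] The message proposes finding order errors by validating exported XML against a DTD. The Python sketch below is a simpler stand-in, not DTD validation: it compares the order of fields in one record against a declared order and reports adjacent pairs that are out of sequence. The function name, the record layout, and the sample order are assumptions, and a flat ranking like this cannot describe repeating groups such as multiple senses, which a DTD content model can.

```python
def order_violations(record, expected_order):
    """Return (earlier, later) marker pairs that appear out of order.

    `record` is a list of (marker, value) pairs; `expected_order` lists
    markers in their correct sequence. Markers not in `expected_order`
    are ignored.
    """
    rank = {m: i for i, m in enumerate(expected_order)}
    violations = []
    last = None
    for marker, _ in record:
        if marker not in rank:
            continue
        if last is not None and rank[marker] < rank[last]:
            violations.append((last, marker))
        last = marker
    return violations


# Invented sample: \ps should precede \ge but follows it here.
order = ["\\lx", "\\ps", "\\ge"]
entry = [("\\lx", "kasa"), ("\\ge", "house"), ("\\ps", "n")]
problems = order_violations(entry, order)
```

A check of this kind only locates the errors; where to fix them (in Shoebox or in the exported file) remains the open question raised above.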
>
> Another question is how to do cross-language comparison of lexicons,
> given that the fields different lexicographers use may not
> correspond. If everyone used MDF, that wouldn't be a problem, but
> that's sort of like saying that if everyone used English, there
> wouldn't be any problems. (There was some discussion of this in the
> EMELD '02 workshop.)
>
> I'm thinking of doing a paper for the upcoming LREC workshop on
> minority language documentation, on the topic of how to prepare
> Shoebox lexicons for publication or export, with emphasis on fixing
> problems with (in)consistency. (Before publishing, one might also
> want to look at coverage, agreement between the grammatical
> categories in the lexicon and those in published grammars for the
> language, etc., but this is outside of what I want to cover, as are
> basics of using MDF.) I am aware of work by Ken Zook (of SIL) on
> importing earlier Shoebox lexicons into LinguaLinks, the web pages
> for the KirrKirr project on importing into that tool, and the
> addition of the 'verify interlinear' feature to Toolbox. Beyond
> this, I haven't seen much.
>
> Has anyone seen other sorts of problems for which additional tools or
> checks are needed in order to ensure consistency (or more generally,
> quality) in Shoebox lexicons? I'll be happy to cite you, should this
> idea turn into a real (and accepted) paper.
>
> Mike Maxwell
>
>
>
>
> Yahoo! Groups Links
>
> To visit your group on the web, go to:
> http://groups.yahoo.com/group/lexicographylist/
>
> To unsubscribe from this group, send an email to:
> lexicographylist-unsubscribe at yahoogroups.com
>
> Your use of Yahoo! Groups is subject to:
> http://docs.yahoo.com/info/terms/