[Lexicog] Preparing one's Shoebox dictionary for the publisher?

Koontz John E john.koontz at COLORADO.EDU
Fri Mar 5 16:08:36 UTC 2004


On Thu, 4 Mar 2004, Chaz and Helga Mortensen wrote:
> The recent editions of Shoebox (I have 4.0) allow one to export the Shoebox
> files, which use standard format markers (\lx, \dn, etc.) into a Word
> document. Each standard format marker is assigned a style in Word. The
> Export option is under File and then it will give you some dialog boxes
> where you can choose which fields are exported into the document. You also
> choose which national language you will use with the indigenous language. I
> am assuming here you are making a bilingual dictionary. The instructions are
> pretty straightforward.

I believe there is also a separate tool, SF Converter (SFC; see www.sil.org
under downloadable software), that can be used.  I'm not really up to date
on the relative strengths of the two approaches.  SFC also allows
associating the format markers with Word styles; you define a style sheet
in Word to manage the formatting.  SFC runs separately from Shoebox,
reading the Shoebox data file and producing an RTF file that Word can
read.  Once you have a Word document you can produce other formats that
your publisher may prefer, and modify the styling as necessary by changing
the style sheet.  Given the flexibility of Word formatting, you can insert
numbering, change fonts and faces, etc., in this way.

> Once the document opens in Word you can edit it. When I export mine there
> are usually extra carriage returns, spaces and so forth that I delete. You
> will probably run into things like that before the document looks like
> something you would want to turn in to the printers.

It's very important to resist the urge to edit the publication-format
version of the database directly.  Above all, never change the data or its
ordering at this stage, because this leaves the database (Shoebox) file
and the publication file in inconsistent states, and in producing a
publication you are not abandoning the database.  In the modern world the
database format is at least as important as the publication format,
because it underlies computerized access to the data and, probably,
Web-based access.  It also underlies further editing because, of course,
no dictionary is ever truly finished.

However, as Chaz points out, the Shoebox and SFC conversion processes may
not suffice to produce a publication format.  For one thing, there's a
certain inflexibility (as I recall - I may be out of date) in the
conversion process.  It concerns which fields can be converted to
paragraph styles and which to styles applying to stretches of text within
a paragraph.  In essence the problem is that, given a field = paragraph
equation, the field (or paragraph) structures of the database format and
the publication format are not commensurate.  Generally you want to
consolidate the fields of an entry into a single paragraph, reducing the
fields of the database to stretches of text, perhaps in bold or italic, or
numbered, or whatever.  Hence the deletion of "extra carriage returns,
spaces and so forth."
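
To make the mismatch concrete, here is a minimal sketch in Python (chosen
only as an example; any language would do).  It takes one record in which
each field sits on its own line and joins the field values into the single
run of text that a publication paragraph wants.  The sample entry is
invented for illustration, and the markers \lx, \ps, \ge and \dn are
assumptions about the database; yours may differ.

```python
import re

# Invented sample record; the markers are assumptions, not a fixed standard.
SAMPLE = r"""\lx kaba
\ps n
\ge house
\dn casa
"""

def parse_fields(record):
    """Split one standard-format record into (marker, value) pairs."""
    fields = []
    for line in record.splitlines():
        match = re.match(r"\\(\S+)\s*(.*)", line)
        if match:
            fields.append((match.group(1), match.group(2).strip()))
        elif fields and line.strip():
            # A line without a marker continues the previous field.
            marker, value = fields[-1]
            fields[-1] = (marker, (value + " " + line.strip()).strip())
    return fields

def as_paragraph(fields):
    """Join the non-empty field values of one entry into one line of text."""
    return " ".join(value for _, value in fields if value)

print(as_paragraph(parse_fields(SAMPLE)))  # prints: kaba n house casa
```

Four fields, which a field = paragraph conversion would turn into four
paragraphs, become one line.  What is lost in this naive join is exactly
the per-field formatting (bold, italic, numbering), which is why some kind
of field-internal markup is needed.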

Depending on your resources and abilities, I recommend doing as much as
possible by means of a preprocessing program that maps the field structure
of the database to a field structure that Shoebox or SFC can convert
appropriately.  This program may divide fields, consolidate them (reducing
them to field-internal markup), rearrange them, or even generate new
fields.  There are a great many tools that can be used for this, including
scripts in the scripting languages of text editors, the SIL CC tool, AWK,
Tcl/Tk, and so on.  Any programming language or scriptable text-processing
tool will do.
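
As a rough illustration of such a preprocessing step, the following Python
sketch reads standard-format text, and for each record keeps the headword
field and consolidates the part of speech and glosses into one generated
field with field-internal markup.  Everything specific here is an
assumption made for the example: the input markers (\lx, \ps, \ge, \dn),
the generated \body marker, the <i>...</i> markup, and the sample data.
It is not the format Shoebox or SFC expects; you would adapt the output to
whatever your converter can handle.

```python
import re

MARKER = re.compile(r"\\(\S+)\s*(.*)")

# Invented sample data; markers are assumptions about the database.
SAMPLE = r"""\lx kaba
\ps n
\ge house
\dn casa

\lx tolo
\ps v
\ge run
"""

def read_records(text, key="lx"):
    """Group marker lines into records; a new record starts at each key marker."""
    records = []
    for line in text.splitlines():
        m = MARKER.match(line)
        if not m:
            continue  # this sketch ignores blank and continuation lines
        marker, value = m.group(1), m.group(2).strip()
        if marker == key:
            records.append([])
        if records:  # lines before the first key marker are dropped
            records[-1].append((marker, value))
    return records

def consolidate(record):
    """Map one record onto two fields: the headword and a merged body."""
    values = {}
    for marker, value in record:
        values.setdefault(marker, []).append(value)
    parts = []
    if "ps" in values:
        # Hypothetical field-internal markup for the part of speech.
        parts.append("<i>" + values["ps"][0] + "</i>")
    glosses = values.get("ge", []) + values.get("dn", [])
    if glosses:
        parts.append("; ".join(glosses))
    return [("lx", values["lx"][0]), ("body", " ".join(parts))]

def write_records(records):
    """Serialize records back to standard-format text, blank line between."""
    return "\n\n".join(
        "\n".join("\\" + m + " " + v for m, v in rec) for rec in records
    ) + "\n"

result = write_records([consolidate(r) for r in read_records(SAMPLE)])
print(result)
```

Because the script only reads the database file and writes a new file, the
database itself is never touched, and the publication file can be
regenerated whenever the data changes.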

The main difficulty here is that this conversion process inevitably does
involve some degree of programming, and this may or may not be feasible
for a given linguist or linguistic project.  It would be highly desirable
if some of the more typical processes could be incorporated into the
standard facilities of Shoebox or SFC.  Perhaps such facilities now exist
and I am simply too out of date to know of them.


