[Lexicog] Sorting

David Frank david_frank at SIL.ORG
Tue Mar 23 22:52:37 UTC 2004


from David Frank:

I had said, "I use mailmerge format" and John Koontz replied with a
question, "I'm not sure what this is.  A format for creating merged letters
for Word?"

Since you asked, yes, mailmerge format was designed for doing merged letters
in a word processor such as Word. You may have heard of comma-delimited or
tab-delimited database format, and that is basically the same thing. I first
started fooling with mailmerge format back when my word processor was
WordStar, in the days before Microsoft Word. The latter word processor then
adopted a mailmerge format as well. It is possible even to import a
dictionary database that has been stored in mailmerge format into Word and
have that turn it into a formatted dictionary. That is not the approach I
used, but I did experiment with it some. (It would even be possible to
compile and edit a dictionary in Word using its mailmerge features, though I
wouldn't necessarily recommend it.)

An advantage of mailmerge format is that it is a sort of standard in word
processing, and an older standard in commercial database programs. It is not
too hard to do something with it in a word processor. SIL's standard format
is another standard, and you can get a word processor to deal with it but it
takes a bit more work.

With standard format, you know what each field of a record is by its label,
its particular backslash code. You can have a variable number of fields for
each entry, using only those that apply to that particular entry. With
mailmerge format, there are no backslash codes, and every entry or record
has the same number of fields, even if some of them are blank. You know the
first is a lexeme, for example, the second a part of speech, the third a
gloss, and so forth, or however you set them up. The whole record is on a
single line and the fields are separated by the specified delimiter, such as
a tab.
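
To make the contrast concrete, here is a rough sketch in Python of turning backslash-coded records into tab-delimited lines. It is only an illustration under some assumptions: the markers \lx, \ps and \ge, the fixed column order, the function names, and the two sample entries are all chosen for the example and are not taken from any particular project's setup.

```python
import csv
import io

# Assumed column order for the fixed-field (mailmerge-style) layout.
FIELD_ORDER = ["lx", "ps", "ge"]


def parse_standard_format(text):
    """Split backslash-coded text into one dict per record.

    Assumes each record begins with an \\lx line and that every field
    fits on a single line.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything without a marker
        marker, _, value = line[1:].partition(" ")
        if marker == "lx" or not records:
            records.append({})
        records[-1][marker] = value.strip()
    return records


def to_delimited(records, delimiter="\t"):
    """Write one line per record, with the same number of fields in each."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for rec in records:
        # A field missing from the record becomes an empty column.
        writer.writerow([rec.get(code, "") for code in FIELD_ORDER])
    return out.getvalue()


# Invented sample data: the second entry has no part-of-speech field.
SAMPLE = r"""
\lx kitabu
\ps n
\ge book

\lx soma
\ge read
"""

if __name__ == "__main__":
    print(to_delimited(parse_standard_format(SAMPLE)), end="")
```

The second entry has no \ps field, so its line carries an empty second column. That is the trade-off described above: the delimited form needs no labels, but every record has to fill every position, even with a blank.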

You wrote in a separate message, "The dictionary editors hated those
\xx{...} codes, because they were ugly and unnatural and took a lot of
keystrokes to enter." I too prefer an approach where those entering data
don't have to see backslash codes. With Shoebox (which I don't use), at
least you don't have to type in the backslash codes once you have it set up
the way you like, and I have learned that Shoebox does now do a lot more in
the way of searches than it used to.

After polling the lexicography list, it sounds like Shoebox (or its
successor Toolbox) is still pretty much a standard both inside and outside
SIL for compiling dictionaries. And if I were compiling a dictionary in
standard format, I would probably use Multi Dictionary Formatter to format
it for publication. I am going to tell my consultee that, but also suggest
that he might want to look into LinguaLinks and TshwaneLex.

It sounds like the success of the Hopi Dictionary is based more on the skill
and intelligence of the compilers and editors than on the sophistication of
their software tools.
