[Lexicog] dictionary software

Kenneth C. Hill kennethchill at YAHOO.COM
Tue Mar 23 21:44:44 UTC 2004


The process of producing elegant printed copy from the Hopi database is
much simpler than what John Koontz describes for the Comparative Siouan
Dictionary below.

One defines a "report" format within Notebook then "prints" selected
records to a text file. Then the text file is modified in one's word
processor to produce the printed result. The only serious departure from
this has to do with the addition of graphics and tables. Graphics and
tables do not reside in the database, which is strictly a textual database
(or, to use John's term, "textbase"). (I guess one could also add
embellishments like sound files and video clips, but they also go far
beyond the capacities of a DOS textbase program. Besides, in the Hopi
Dictionary Project, we were trying to produce a product that looked like a
serious desktop reference book; sound files and video clips don't have any
place in such a product.)

In defining the report format, one specifies which fields should be
printed and in which order and what markers or words should always occur
in the printed result for that field. One also marks a field as not to be
printed if it is empty. Hard returns do not figure in the definition
unless one wants to make sure the contents of some field always begin on a
new line. (I used marked hard returns only in preparing printouts for
editing.) Each field can have a default font specified. I use [[ for
"begin boldface", ]] for "end boldface", << for "begin italics" and >> for
"end italics". I also use [[, <<, etc. in the database itself for font
changes.
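
As an illustration only (this is not Notebook's own report mechanism), the
report definition described above could be sketched in Python. The field
names, the "Ex.: " label and the sample record are hypothetical; only the
[[ ]] and << >> markers and the skip-if-empty rule come from the
description.

```python
# Hypothetical report format: (field name, fixed label, default font),
# listed in the order the fields should be printed.
REPORT_FORMAT = [
    ("headword", "", "bold"),
    ("pos", "", "italic"),
    ("gloss", "", None),
    ("example", "Ex.: ", None),
]

# Font-change markers as described in the message.
FONT_MARKERS = {"bold": ("[[", "]]"), "italic": ("<<", ">>")}


def render_record(record, report_format=REPORT_FORMAT):
    """Render one record as a single run of text (no hard returns)."""
    parts = []
    for field, label, font in report_format:
        value = record.get(field, "").strip()
        if not value:
            continue  # the field is marked "do not print if empty"
        open_mark, close_mark = FONT_MARKERS.get(font, ("", ""))
        parts.append(f"{label}{open_mark}{value}{close_mark}")
    return " ".join(parts)


# Placeholder record; the empty "example" field is left out of the output.
sample = {"headword": "abc", "pos": "n.", "gloss": "thing", "example": ""}
print(render_record(sample))
```

Writing the rendered records to a text file, one after another, gives the
kind of file that is then finished in the word processor.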

I had excellent macros in WordPerfect 6 for DOS to convert the font change
markers [[, <<, etc. into word processor font changes, but I failed when I
tried to recreate these macros in Windows versions of WordPerfect. Rather
than trying to reeducate myself on the newer versions of WordPerfect, I
have simply continued to use WordPerfect 6 for DOS for the font changes
and then, as a second step, I process the data further with the current
version of WordPerfect, making the conversions of things like `A to À, ^A
to Á, inserting graphics, etc. (I reject MS Word [_pace_ Bill Gates] as
too user-unfriendly. I have yet to hear from any MS Word enthusiast
anything that would convince me to want to use it even though I must from
time to time do so because of all those institutions out there that won't
accept files in any other form.)
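
For readers without WordPerfect, the two conversion steps can be sketched
in Python. This is an assumption-laden stand-in, not the macros described
above: HTML-style tags stand in for word processor font changes, and the
accent rule is extended from the two examples given (`A and ^A) to all
vowels, which the message does not actually state.

```python
import re
import unicodedata

# Stand-in output: HTML-style tags instead of word processor font changes.
FONT_TAGS = {"[[": "<b>", "]]": "</b>", "<<": "<i>", ">>": "</i>"}

# Prefix character -> combining accent (grave, acute), following the
# examples `A -> À and ^A -> Á. Applying it to other vowels is an assumption.
ACCENTS = {"`": "\u0300", "^": "\u0301"}


def convert_fonts(text):
    """Replace the [[ ]] << >> markers with the stand-in tags."""
    return re.sub(r"\[\[|\]\]|<<|>>", lambda m: FONT_TAGS[m.group(0)], text)


def convert_accents(text):
    """Turn prefix-plus-vowel pairs into single accented characters."""
    def repl(match):
        combined = match.group(2) + ACCENTS[match.group(1)]
        return unicodedata.normalize("NFC", combined)
    return re.sub(r"([`^])([AEIOUaeiou])", repl, text)


print(convert_accents(convert_fonts("[[x]] <<y>> `A ^A")))
```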

-- Ken Hill

--- Koontz John E <john.koontz at colorado.edu> wrote:

> On Mon, 22 Mar 2004, David Frank wrote:
> > I guess my next question is what you used to turn your Hopi dictionary
> > database into a formatted document for printing.
>
> I can't answer for Hopi, but the CSD - somewhat in abeyance at present -
> uses or used various AWK scripts to do the lion's share of the
> reformatting of the text files.  (AWK is a bit awk-ward for this task,
> but only because it is a somewhat peculiar programming language.)
> Today I'd personally use Tcl or Python, and Perl would also do nicely.
> These are all available for Windows and so are various other scripting
> tools.  (In the really old days I once used the Lisp-ish language built
> into EMACS and most programmers' editors have something like that built
> into them.  I really don't recommend these!)
>
> However, none of this would have been much use if I hadn't had access to
> several DOS-based tools - I don't remember the names - from
> SIL/Philippines that were able to convert a somewhat extended form of
> SFM into a Word file.  Each field \xx was converted to a paragraph
> formatted with the xx paragraph style and - this was the extension to
> SFM - each segment of text coded with \xx{..\} or later \xx{..} was
> converted to a segment of text formatted with character format xx.
> That and a nice style sheet of your choice (now called a formatting
> template, I think) and you were home free.
>
> Well, almost.  Two problems.  One, the dictionary editors hated those
> \xx{...} codes, because they were ugly and unnatural and took a lot of
> keystrokes to enter, and because they prevented them from searching
> the database.  One of those around or within a word made it a lot harder
> to match.  I think they also just didn't like to think about even
> structural markup while writing.  That and the ugly and unnatural I
> never solved.  I did reduce the key strokes and the searching problems
> through the expedient of letting them enter |xx to start a code (or end
> a preceding one) and | to revert to no special coding.  Delineate these
> with spaces and you're OK.  However, that won't work to make marks
> within words invisible, only marks around them.  Fortunately, we had
> few cases of the former, though only by luck.  Anyway, with a little
> judicious scripting you can convert the |xx and | notation to the
> \xx{...} notation, deleting extra spaces as appropriate.
>
> The other problem was that the SIL/Phil tools never liked files as long
> as we had.  They ran into internal size limits and blew up in mid-
> file.  So I wrote tools to break the files into shorter pieces and then
> merged the mini Word files by hand.
>
> Without these SIL tools I'd have had to write my own tools to convert
> the marked up text into RTF, which is a sort of generic formatting
> language that Microsoft devised and accepts as input to Word.  Nowadays
> some form of XML or XHTML might work, too.
>
> Today I'd use SF Converter, but might still have to do some massaging of
> the SFM before applying it.  I think I discussed those issues a while
> back, so I'll omit them here.  They have to do with the relationship
> between SFM (or comparable) fields and text paragraphs in various forms
> of text derived from the database.  In general it's a many to one
> mapping relation that depends on the formatting desired and the use the
> text is put to.
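
The "judicious scripting" mentioned in the quoted message, turning the
|xx and | shorthand into \xx{...}, might look roughly like the Python
sketch below. It is not the CSD's AWK script. It assumes the codes are
set off by spaces, as the message says, treats the text of one field at a
time, and collapses runs of whitespace; the code names "it" and "bf" in
the example are hypothetical.

```python
def expand_pipe_codes(text):
    r"""Convert '|xx words |' shorthand to '\xx{words}' within one field."""
    out = []     # finished output tokens
    code = None  # name of the currently open code, if any
    span = []    # words collected inside the open code

    def close_span():
        nonlocal code, span
        if code is not None:
            out.append("\\%s{%s}" % (code, " ".join(span)))
        code, span = None, []

    for token in text.split():
        if token == "|":
            close_span()      # bare | reverts to no special coding
        elif token.startswith("|"):
            close_span()      # |xx also ends a preceding code
            code = token[1:]
        elif code is not None:
            span.append(token)
        else:
            out.append(token)
    close_span()              # a code still open at the end is closed
    return " ".join(out)


print(expand_pipe_codes("see |it word | here"))
print(expand_pipe_codes("a |bf one |it two | b"))
```

Because the sketch splits on whitespace, a code placed inside a word is
not recognized, which is the same limitation the message describes.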

