[Lexicog] dictionary software

Kenneth C. Hill kennethchill at YAHOO.COM
Mon Mar 22 19:20:21 UTC 2004


The Comparative Siouan Dictionary project would have been a daunting task
for Notebook if only because myriad fields were needed. In the Hopi
Dictionary project we used 19 fields. A blank record allows 20 fields to
be seen at one time. I have a comparative Uto-Aztecan database with 50
fields. I believe the CSD needed even more. And a feature of Notebook is
that every field appears on the screen for every record. This takes up a
lot of visual space for simple entries. The printed result, of course,
will include only fields with content and only the fields you want to
print. The Hopi Dictionary database had several housekeeping fields which
did not get printed.

Regarding sorting: Notebook allows ascending or descending sorting on any
field. Also, you can sort on one field, then sort on another field, etc.
And the sorting can affect the database itself or only a view of the
database. Notebook also allows the sort order itself to be defined.
Thus we were able to place ö in the alphabet following o. If we had been
working on Swedish, ö would be ordered at the end of the alphabet. Sorting
can be case sensitive or not.
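
As a rough illustration of a user-defined sort order (this is not how
Notebook works internally, which I cannot vouch for), here is a minimal
Python sketch. The function name make_sort_key, the two alphabets, and the
sample strings are all made up for the example.

```python
def make_sort_key(alphabet, case_sensitive=False):
    """Return a key function that ranks characters by their position in `alphabet`."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        if not case_sensitive:
            word = word.lower()
        # Characters missing from the alphabet sort after all listed ones,
        # and among themselves by code point.
        return [(rank.get(ch, len(rank)), ch) for ch in word]

    return key


BASE = "abcdefghijklmnopqrstuvwxyz"
O_THEN_OE = BASE.replace("o", "oö")  # ö placed right after o
OE_LAST = BASE + "ö"                 # ö placed at the end of the alphabet

# Made-up strings, not real dictionary entries.
words = ["za", "öa", "pa", "ob"]

hopi_style = sorted(words, key=make_sort_key(O_THEN_OE))
swedish_style = sorted(words, key=make_sort_key(OE_LAST))

print(hopi_style)     # ['ob', 'öa', 'pa', 'za']
print(swedish_style)  # ['ob', 'pa', 'za', 'öa']
```

With case_sensitive=True the word is not lowercased, so capital letters
fall outside the lowercase-only alphabets above and sort after everything
listed; a real setup would add them to the alphabet in the wanted position.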

Headwords can contain various markers that complicate sorting. Consider an
example like kiita 'build a house', whose plural subject form is kiitota.
This appears in the Hopi Dictionary as kii|ta (~tota), the vertical bar
marking the point after which the word is variable under inflection. We
don't want "|" to confound alphabetization and I know of no way to tell
Notebook to ignore a symbol. Further, there are sets like:

naavan|ta 1 (~yungwa) 'wear a shirt, have a shirt on'
naavan|ta 2 (~yungwa) 'have a characteristic or trait like one's father'
naavàn|ta 1 (~tota) 'put on a shirt; make a shirt'
naavàn|ta 2 (~tota) 'take after one's father's traits'
Naavantaqa 'any kachina whose costume includes a velvet shirt'

Here we want to guarantee that the 'shirt' words precede the otherwise
identical 'father' words and that the words with the accented vowel follow
the words that are otherwise identical but lack the accent, and that
the accented vowel itself not figure in the sorting so that the two words
naavànta precede Naavantaqa.

This problem is solved by having a field dedicated to sorting. I call this
the alphabetizer. This field is, in general terms, identical with the
headword field except that it leaves out markers that get in the way of
correct sorting (for kii|ta the alphabetizer is kiita) and includes
markers to guarantee desired sorting (for naavan|ta, etc., the
alphabetizers are naavanta 1, naavanta 2, naavanta 3, naavanta 4,
naavantaqa).
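
The alphabetizer values above could in principle be derived from the
headwords rather than typed by hand. The following Python sketch is only
one way to do that, under assumptions of my own: the homonym number is the
last token of the headword, the only accent to ignore is the grave, and
unaccented forms rank before accented ones. The function names are
hypothetical.

```python
import re
import unicodedata
from collections import defaultdict

GRAVE = "\u0300"  # combining grave accent, the only mark stripped here


def split_headword(headword):
    """Split 'naavàn|ta 2' into ('naavàn|ta', 2); a missing number gives 0."""
    match = re.fullmatch(r"(.*\S)\s+(\d+)", headword.strip())
    if match:
        return match.group(1), int(match.group(2))
    return headword.strip(), 0


def base_form(form):
    """Drop the grave accent and the '|' marker, and lowercase the rest."""
    decomposed = unicodedata.normalize("NFD", form)
    without_grave = decomposed.replace(GRAVE, "")
    # Recomposing keeps other letters with marks, such as ö, intact.
    return unicodedata.normalize("NFC", without_grave).replace("|", "").lower()


def has_grave(form):
    return GRAVE in unicodedata.normalize("NFD", form)


def build_alphabetizers(headwords):
    """Map each headword to an alphabetizer string."""
    groups = defaultdict(list)
    for hw in headwords:
        form, number = split_headword(hw)
        groups[base_form(form)].append((has_grave(form), number, hw))

    result = {}
    for base, members in groups.items():
        members.sort()  # unaccented before accented, then by homonym number
        if len(members) == 1:
            result[members[0][2]] = base
        else:
            for rank, (_, _, hw) in enumerate(members, start=1):
                result[hw] = f"{base} {rank}"
    return result


headwords = ["Naavantaqa", "naavàn|ta 2", "kii|ta",
             "naavan|ta 2", "naavàn|ta 1", "naavan|ta 1"]
alphabetizers = build_alphabetizers(headwords)
ordered = sorted(headwords, key=alphabetizers.get)

for hw in ordered:
    print(f"{alphabetizers[hw]:<12} {hw}")
```

Sorting on the derived field with plain string comparison puts kii|ta
first, then the four naavanta entries in the order listed in the set above,
then Naavantaqa, because the space before the number sorts ahead of any
letter. The sketch assumes fewer than ten homonyms per form; with more,
the ranks would need zero-padding to sort correctly as strings.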

We did not concern ourselves with the luxury of screen fonts. Linguists
should be able to handle just about any kind of representation. It's only
the final printout that matters. Thus we used `A for what would get
printed as À, ô for ö with acute accent (a character not yet supported by
Unicode I am told), etc. In my Uto-Aztecan database I use things like ï
for barred i, kW for k followed by superscript w, dR for retroflex d
(printed as d subdot), etc.
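
For anyone converting such working codes to print characters today, a
lookup table is probably enough. The Python sketch below is an assumption
on my part and not the conversion the projects actually used: the table
names and the function to_print_form are hypothetical, the Unicode targets
are my own choices, and rendering ö with acute as ö plus a combining acute
accent is just one possible approach. The sample string is made up.

```python
import re

# Working code -> print form, using the Hopi examples given above.
HOPI_PRINT = {
    "`A": "\u00c0",            # À
    "\u00f4": "\u00f6\u0301",  # ô -> ö plus combining acute (assumed rendering)
}

# Working code -> print form, using the Uto-Aztecan examples given above.
UTO_AZTECAN_PRINT = {
    "\u00ef": "\u0268",  # ï -> barred i
    "kW": "k\u02b7",     # k followed by superscript w
    "dR": "\u1e0d",      # retroflex d, printed as d with dot below
}


def to_print_form(text, table):
    """Replace every working code in `text` with its print form."""
    # Longest codes first, so a longer code is never pre-empted by a shorter
    # one that happens to be its prefix.
    codes = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(code) for code in codes))
    return pattern.sub(lambda m: table[m.group(0)], text)


# Made-up string, not a real database entry.
sample = "kW\u00ef dRa"
converted = to_print_form(sample, UTO_AZTECAN_PRINT)
print(converted)
```

Keeping one table per database avoids clashes when the same working code
means different things in different projects.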

--Ken Hill

--- Koontz John E <john.koontz at colorado.edu> wrote:
> On Sun, 21 Mar 2004, Kenneth C. Hill wrote:
> > The Hopi Dictionary Project used (and I continue to use) a DOS program
> > called Notebook II, by a now-defunct company called Pro/Tem Software.
> > Notebook II is the best software I have yet discovered for dictionary
> > making, but it works only in a DOS window and uses only ASCII
> > characters. It might work on a Mac if one has Soft-PC.
>
> The Comparative Siouan Dictionary project looked at Notebook and another
> program called AskSAM that was somewhat similar.  We ended up using
> AskSAM, which still seems to be around, amazingly enough.  I think we
> picked one over the other because of factors like amenability to
> handling extended character sets, absolutely essential to us, but I'm
> no longer sure.  I remember Notebook looked very nice.  Anyway,
> ultimately we were mostly handling the text files in something like
> SIL's SFM with various programmer's text editors.
>
> What you say about field structuring and searching in Notebook applies in
> AskSAM, though the interface in AskSAM is less tabular and perhaps in
> that way less appealing for a number of purposes.  The output and
> formatting expedients were the same.  We used AWK programs in the CSD.
>
> Both of the packages were examples of what were called textbase
> programs.  I think Notebook was theoretically aimed at the "personal
> information manager" market, while AskSAM seemed to be intended to
> handle things like legal briefs.  In the end you could use both
> somewhat like Shoebox in a period before Shoebox existed.  My suspicion
> is that someone who wanted to use something like one of these programs
> at present would be better advised to look at Shoebox, which has much
> the same facilities and more of a purpose-built dictionary orientation.
>
> Incidentally, the SAM in the name AskSAM seems to stand for IBM's SAM
> structure for file management - Sequential Access Method - more standard
> databases often use or used to use ISAM or "Indexed SAM."  I think
> relational tables were stored as SAM or ISAM files.
>

