[Lexicog] Bob Hsu's format and software

William J Poser billposer at ALUM.MIT.EDU
Mon Mar 22 23:34:19 UTC 2004


Bob Hsu calls his dictionary database format "band format",
where "band" is equivalent to "field". It is like Shoebox format
in that fields are of arbitrary length and identified by a tag,
but it differs from Shoebox format in two respects that make it
hard to parse using standard tools.

First, a field ends at a linefeed unless the following line begins with
two spaces, in which case that line continues the field. You can see the
origins of this convention in punch cards. Second, a record begins with
a line-initial period not immediately followed by another period, because
he used sequences of two or more periods as prefixes to tags to mark
subentries.
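
As a rough illustration of those two rules, here is a minimal Python
sketch of a band-format reader. It is not Hsu's software, and several
details are assumptions that the description above does not settle: that
a tag is the first run of non-space characters on a line, that fields
inside a record may carry no leading period, that a continuation line is
joined to its field with a single space, and that blank lines can be
ignored. The function name parse_bands and the tags in the sample (hw,
pos, sub, gl) are made up for the example.

```python
import re

# Leading periods, then a tag (assumed: a run of non-space characters),
# then the rest of the line as the field value.
BAND = re.compile(r"^(\.*)(\S+)\s*(.*)$")

def parse_bands(text):
    """Return a list of records; each record is a list of
    (depth, tag, value) tuples, where depth is the number of
    leading periods on the tag."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines carry no meaning
        if line.startswith("  "):
            # Continuation: the previous field did not end at the linefeed.
            if records and records[-1]:
                depth, tag, value = records[-1][-1]
                joined = (value + " " + line.strip()).strip()
                records[-1][-1] = (depth, tag, joined)
            continue
        m = BAND.match(line)
        if m is None:
            continue  # e.g. a line starting with one space or a tab; skipped here
        depth, tag, value = len(m.group(1)), m.group(2), m.group(3).strip()
        # Exactly one leading period starts a new record; two or more
        # periods mark a subentry inside the current record.
        if depth == 1 or not records:
            records.append([])
        records[-1].append((depth, tag, value))
    return records

# Invented sample data; the tags are hypothetical.
sample = (
    ".hw water\n"
    "  (noun)\n"
    "pos n\n"
    "..sub waterfall\n"
    "gl cascade\n"
    ".hw fire\n"
)
records = parse_bands(sample)
```

The sketch reads the input line by line and has to look at the start of
each following line before it knows whether the current field is finished.
That is the property that makes the format awkward for tools that assume
one field per line.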

There is a description of band format, with examples, in some
of my lecture notes at:
http://www.ling.upenn.edu/courses/ling538/Lecnotes/ParsingLexica.html#bandformat

He had a book-length manuscript on linguistic computing that contained
a lot of useful information, the kind of thing that is hard to find
written down anywhere, which as of 1995 seemed pretty polished to me.
Unfortunately, to my knowledge he has never published it.

Bill


--
Bill Poser, Linguistics, University of Pennsylvania
http://www.ling.upenn.edu/~wjposer/ billposer at alum.mit.edu


