Language documentation (was Re: Portability)

Mike Maxwell maxwell at umiacs.umd.edu
Tue Mar 30 01:35:52 UTC 2010


Just when you thought this thread was through...

Stefan Müller wrote:
> Well documented code and literary programming sounds very exciting 
> and suited for the more technically oriented linguists, but I think 
> in order to sell the stuff to `normal' linguists one should have 
> a book in addition to a properly documented grammar. In the book 
> one should separate things and present the ideas underlying the 
> code in a more standardized way.

Yes, we're hoping to get that done before the next millennium.  Thus far, 
we have two and a half book-length grammars, and the skeleton of a book 
describing the formal language for morphological description.

> Having something like this
> 
> lrule :=
>   *lrule*      &
>   [ morph #m,
>     LR   lr & [
>          LR_INFLECTED +
>          ],
>     CLE  bool,
>     ARGS < [
>            morph #m
>            ] >
>   ].
> ...
> in the book may frighten people. Instead I would use the standard 
> HPSG notation for the theory sections.

Agreed.  The way we're approaching that in morphology is to use a 
linguisticy (I'm a morphologist, I have a license to invent new words!) 
declarative representation language; it would be analogous to having an 
XML notation for HPSG.  We then have a translator program that converts 
a formal grammar written in this language into the programming language 
of a parsing engine (currently SFST).  The converter is analogous to a 
compiler.

What is missing thus far is a way to display the XML-based formal 
grammar as paradigm tables, phonological rules, etc.  So it's still as 
frightening as your rule above.  We do have plans to make pretty 
displays; that work just keeps getting pushed to the bottom of our 
priority list.
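
For illustration only, here is a minimal Python sketch of what such a 
display step might look like: it reads a paradigm from XML and prints it 
as a plain-text table.  The element and attribute names (paradigm, cell, 
case, number, suffix), the suffixes and the stem "kat" are all invented 
for this example; they are assumptions, not the project's actual schema 
or data.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar fragment: every element and attribute name here
# is invented for illustration and is not the real grammar schema.
GRAMMAR_XML = """
<paradigm name="noun-class-1">
  <cell case="nom" number="sg" suffix=""/>
  <cell case="nom" number="pl" suffix="s"/>
  <cell case="gen" number="sg" suffix="i"/>
  <cell case="gen" number="pl" suffix="is"/>
</paradigm>
"""

def paradigm_table(xml_text, stem):
    """Render a paradigm as a plain-text table (cases x numbers)."""
    root = ET.fromstring(xml_text)
    rows, cols, forms = [], [], {}
    for cell in root.iter("cell"):
        case, number = cell.get("case"), cell.get("number")
        # Keep row and column labels in order of first appearance.
        if case not in rows:
            rows.append(case)
        if number not in cols:
            cols.append(number)
        forms[(case, number)] = stem + cell.get("suffix", "")
    # Column width: longest label or form, plus two spaces of padding.
    width = max(len(s) for s in list(forms.values()) + rows + cols) + 2
    lines = ["".ljust(width) + "".join(c.ljust(width) for c in cols)]
    for r in rows:
        # A cell missing from the XML is shown as "-".
        cells = (forms.get((r, c), "-").ljust(width) for c in cols)
        lines.append(r.ljust(width) + "".join(cells))
    return "\n".join(line.rstrip() for line in lines)

if __name__ == "__main__":
    print(paradigm_table(GRAMMAR_XML, "kat"))
```

The same XML could feed both the parser-side converter and a display 
step like this one, so the table a reader sees would come from the very 
description that is being compiled.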

> The technical part could be given in an appendix or even better: 
> The grammar could be distributed with the book and respective 
> test sentences on a live CD.

We believe that it is best to include the technical parts (i.e. the 
actual rules) in the locations of the grammar where those constructions 
are discussed, rather than in an appendix.  At the same time, no matter 
how pretty we make the technical parts, they're still frightening :-). 
Our current work-around is to tag the rules (and any related discussion) 
as being for a technical audience, and we decide at print time whether 
to display the technical sections. (By "print time", I mean when we 
convert the XML into PDF or HTML.)  And of course we would include 
everything on a CD (or just put it on a website).
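
The print-time switch can be sketched roughly as follows, in Python.  
This is only an assumption about how such filtering might be done: the 
audience="technical" attribute, the element names and the sample text 
are hypothetical, not the actual markup or tool chain.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar chapter: the audience="technical" attribute and
# all element names are assumptions made for this illustration.
DOC_XML = """
<chapter>
  <para>Plural nouns take the suffix -s.</para>
  <section audience="technical">
    <rule id="pl-suffix">N + s</rule>
  </section>
  <para>Exceptions are listed below.</para>
</chapter>
"""

def render_for(xml_text, include_technical):
    """Return the document as XML text, optionally dropping every
    element tagged for a technical audience."""
    root = ET.fromstring(xml_text)
    if not include_technical:
        # Snapshot the elements first so that removing children does
        # not disturb the traversal.
        for parent in list(root.iter()):
            for child in list(parent):
                if child.get("audience") == "technical":
                    parent.remove(child)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    # Version for general readers: the tagged section is left out.
    print(render_for(DOC_XML, include_technical=False))
```

The filtered tree would then go on to whatever produces the PDF or HTML, 
so a single source yields both the technical and the non-technical 
edition.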
-- 
    Mike Maxwell
    What good is a universe without somebody around to look at it?
    --Robert Dicke, Princeton physicist


