Language documentation (was Re: Portability)

Stefan Müller Stefan.Mueller at fu-berlin.de
Thu Mar 25 15:33:30 UTC 2010


Hi,

> I would be very interested to see what this looks like with a well-commented
> HPSG grammar.  (And to learn what kind of best practices it would require
> me to adopt in writing my---already prolific---comments.)  We are also starting
> to look into ontological annotation of grammars (with GOLD), which might
> connect here.

Well-documented code and literate programming sound very exciting and
are well suited to the more technically oriented linguists, but I think
that in order to sell the stuff to `normal' linguists one should have a
book in addition to a properly documented grammar. In the book one
should separate things and present the ideas underlying the code in a
more standardized way.

I think the best way is this: a book describing a grammar is organized
in chapters, each consisting of three parts: phenomenon, theoretical
analysis, and alternatives (if there are any). The phenomenon discussion
may (and should) contain naturally occurring examples, and in the case
of controversial issues maybe even several instances of a phenomenon.
An example of such a claim: you cannot extrapose NPs in German, since
case assignment is to the left, so extraposed NPs would not get case and
this would violate the case filter. The claim is wrong, and data is
needed to show this. You do not necessarily want to have all these
examples in a test suite, since they may involve phenomena that you
cannot analyze yet or do not want to treat in the book/implementation.
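
To make the split between the examples in a chapter and the examples in
a test suite concrete, here is a minimal sketch in Python. It is not
taken from any existing grammar development system: the names Example,
Chapter and in_test_suite are made up for illustration, and the
sentences are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Example:
    """One example sentence cited in a phenomenon section (hypothetical type)."""
    text: str
    grammatical: bool = True
    # Only examples the implemented grammar is meant to cover go into the suite.
    in_test_suite: bool = False


@dataclass
class Chapter:
    """One chapter: phenomenon, theoretical analysis, alternatives (hypothetical type)."""
    phenomenon: str
    analysis: str
    alternatives: List[str] = field(default_factory=list)
    examples: List[Example] = field(default_factory=list)

    def test_suite(self) -> List[Example]:
        """Return the subset of examples flagged for the test suite."""
        return [ex for ex in self.examples if ex.in_test_suite]


# Placeholder content only; no real corpus data is implied.
chapter = Chapter(
    phenomenon="NP extraposition",
    analysis="(theory section in standard HPSG notation)",
    alternatives=["(alternative analysis)"],
    examples=[
        Example("placeholder sentence A", in_test_suite=True),
        Example("placeholder sentence B"),  # discussed in the book, not tested
    ],
)

for ex in chapter.test_suite():
    print(ex.text)
```

The point of the flag is only that the book can cite more data than the
implementation is tested against; how one stores this in practice is a
separate question.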

Having a clear separation between data and analysis makes half of the
book usable to researchers who are interested in the description of the
respective language. They do not have to look at the brackets ...

Having something like this

lrule :=
  *lrule*      &
  [ morph #m,
    LR   lr & [
         LR_INFLECTED +
         ],
    CLE  bool,
    ARGS < [
           morph #m
           ] >
  ].

or this

cont_hcons_identical_lr :=
(synsem:loc:cont:Cont,
 h_cons:HCons,
 dtrs:[(synsem:loc:cont:Cont,
        h_cons:HCons)]).

in the book may frighten people. Instead I would use the standard HPSG
notation for the theory sections.

This is what we are doing in the books on Persian and Danish that are
currently in preparation.

The technical part could be given in an appendix or, even better, the
grammar could be distributed with the book, together with the respective
test sentences, on a live CD.

This is what I did with the HPSG grammar of German, an understudied
language spoken in countries with several beaches:

http://hpsg.fu-berlin.de/~stefan/Pub/hpsg-lehrbuch.html

This book is a textbook. In a real descriptive grammar the phenomenon
sections would be much longer ...

Best

	Stefan


-- 
Stefan Müller       Tel: (+49) (+30) 838 52973
                    Fax: (+49) (030) 838 4 52973
Institut für Deutsche und Niederländische Philologie
Deutsche Grammatik
Habelschwerdter Allee 45
14 195 Berlin

http://hpsg.fu-berlin.de/~stefan/

http://hpsg.fu-berlin.de/~stefan/Babel/Interaktiv/






More information about the HPSG-L mailing list