Semantics and processing

Fri May 24 12:34:07 UTC 1996

> One thing that always appealed to me in the classical LFG architecture
> (Bresnan 1982 and Halvorsen 1983) was that it seemed to offer a series of
> well-formedness domains that could be independently computed: you can
> compute lots of possible c-structures without worrying about anything else,
> throw away most of those, use this small number of best c-structures
> to compute (possibly) lots of f-structures, throw away most, compute lots of
> s(emantic)-structures, throw away most and get to a small number of
> most-plausible results.

Given my recent posting, I guess I'd better say something about
Bruce's comments!

Disambiguation at each layer does not seem to be the best approach in
the domain of syntax-semantics understanding.  Humans seem to be able
to use semantic and pragmatic information for disambiguation long
before the utterance has ended (and hence before complete
c-structures could be computed).

In terms of natural language parsing, some of the best statistical
disambiguation techniques work by collecting statistics about how
likely various lexical head to lexical head relationships are, e.g.,
how likely this kind of verb is to have an object headed by that kind
of noun.  (The classes used are usually semantically oriented.)  One
of the problems in phrase structure based statistical approaches is
that these relationships can be non-local.  LFG should have a great
advantage here, as the f-structure makes all these relationships
local.  So it would seem to make sense to delay this disambiguation 
until at least grammatical functions (i.e., f-structures) have been
identified.
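
To make the locality point concrete, here is a minimal sketch of
reading head-to-head relations directly off an f-structure.  The
encoding of an f-structure as nested Python dicts, the function name
head_relations, and the example sentence are all my own assumptions
for illustration, and the sketch assumes the structure has no cycles.

```python
def head_relations(fs):
    """Yield (governor PRED, grammatical function, dependent PRED) triples.

    Assumes an f-structure is a nested dict whose sub-f-structures are
    themselves dicts, and that the structure contains no cycles.
    """
    head = fs.get("PRED")
    for gf, value in fs.items():
        if isinstance(value, dict):
            dep = value.get("PRED")
            if head is not None and dep is not None:
                yield (head, gf, dep)
            # Recurse so that relations inside embedded clauses are found too.
            yield from head_relations(value)


# "What did John see?": in the string the object is far from its verb,
# but in the f-structure it is simply the OBJ of "see".
fs = {
    "PRED": "see",
    "TENSE": "past",
    "SUBJ": {"PRED": "John"},
    "OBJ": {"PRED": "what"},
}
relations = list(head_relations(fs))
```

Each triple is found one attribute away from its governor, whatever
the surface distance between the two words.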

Treebanks marked up with something like LFG's f-structures are now
becoming available.  It should be possible and would be interesting to
use statistics collected from these treebanks to assign a preference
ranking to the set of f-structures produced by an LFG parser.
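
A rough sketch of what such a ranking might look like follows.  The
counts are invented, the names COUNTS, score and rank are hypothetical,
and the add-one smoothed score is only a crude stand-in for a properly
estimated model; it assumes each candidate analysis has already been
reduced to a list of (governor, function, dependent) triples.

```python
import math
from collections import Counter

# Invented counts of (governor, function, dependent) triples; real counts
# would come from a treebank annotated with f-structure-like relations.
COUNTS = Counter({
    ("eat", "OBJ", "pizza"): 8,
    ("eat", "ADJUNCT", "fork"): 5,
    ("pizza", "ADJUNCT", "anchovy"): 6,
})
TOTAL = sum(COUNTS.values())


def score(triples):
    """Sum of add-one-smoothed log relative frequencies of the triples."""
    return sum(math.log((COUNTS[t] + 1) / (TOTAL + len(COUNTS) + 1))
               for t in triples)


def rank(candidates):
    """Sort (name, triples) pairs from most to least preferred."""
    return sorted(candidates, key=lambda c: score(c[1]), reverse=True)


# Two analyses of "eat pizza with a fork".
candidates = [
    ("noun attachment", [("eat", "OBJ", "pizza"), ("pizza", "ADJUNCT", "fork")]),
    ("verb attachment", [("eat", "OBJ", "pizza"), ("eat", "ADJUNCT", "fork")]),
]
ranked = rank(candidates)
```

With these toy counts the verb-attachment reading comes out on top.
Note that a score of this kind penalises analyses with more relations,
and it is not a normalised probability distribution over f-structures.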

There are some theoretical problems to be faced if one tries to do
this correctly.  Because of reentrancy in f-structures, a probability
distribution over f-structures is technically what is called a
``Markov field'' (also known as a Markov random field).  The methods
available for efficiently estimating and, perhaps more importantly,
efficiently _reestimating_ probability distributions in syntactic
parsing only work for ``branching Markov processes'', i.e., for
tree-structured objects.

It is not yet clear if this is important in practice.  We might get
just as good results if we ignored all but one of the incoming arcs
to each reentrant node in an f-structure, which would leave us with a
tree again.
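
As a sketch of that simplification, suppose an f-structure is given as
a list of (parent, attribute, child) arcs over node names.  This
representation, the function name drop_reentrancies, and the policy of
keeping whichever incoming arc is listed first are all assumptions of
mine; which arc is best to keep is exactly the open question.

```python
def drop_reentrancies(arcs):
    """Keep only the first incoming arc for every node.

    `arcs` is a list of (parent, attribute, child) triples.  Any later
    arc pointing at an already-reached child is discarded, so each node
    ends up with at most one parent.
    """
    reached = set()
    kept = []
    for parent, attribute, child in arcs:
        if child in reached:
            continue  # a second incoming arc: this is the reentrancy
        reached.add(child)
        kept.append((parent, attribute, child))
    return kept


# "John tries to leave": John is SUBJ of both "try" and "leave".
arcs = [
    ("try", "SUBJ", "John"),
    ("try", "XCOMP", "leave"),
    ("leave", "SUBJ", "John"),
]
tree_arcs = drop_reentrancies(arcs)
```

Here the control relation (leave SUBJ John) is the arc that gets
dropped, so the statistics would no longer see John as the subject of
"leave".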

Best

Mark