Filler-gap mismatches

Ron Kaplan kaplan at parc.xerox.com
Sun May 6 23:05:19 UTC 2001


This is in reply to both Carl's and Ivan's comments on my earlier message ...

Carl asked:

>> Did I misunderstand what you meant? What you said seemed to imply that
>> the second conjunct of a complement isn't subject to any categorial
>> restrictions at all from the head that governs the complement.

and Ivan commented:

>Right. So, for example, be allows any predicative XP as complement,
>but become allows only AP or NP:
>
>Kim became unhappy/a churchgoer/*in love/*talking to the family.
>
>And these category selections are preserved in coordination:
>
>Kim became a churchgoer and *in love.
>Kim became unhappy and *talking to the family.
>
>If you free up selection into noninitial conjuncts, how does this get
>captured?

My reply is quite lengthy, because I think the question gets at some
basic and interesting differences in the way that LFG and HPSG are
organized.

I was a little imprecise in my previous message when I said that the
"only" categorial matching is defined by a general c-structure
coordination rule schema.  That is indeed the only categorial matching
defined by c-structure rules, which is what I was focussing on in the
account of these particular movement-paradox examples.  The rule
schema I described, which matches just the category of the left
conjunct with the category of the coordination, is what allows for
not only the "that you might be wrong" coordinations but also for the
familiar "John is a Republican and proud of it".

Carl and Ivan are bringing up a different question, namely, how (or
whether) a lexical head restricts the c-structure category of a
complement that it governs.  That would interact with the
coordination schema in a setup under which a governor restricts the
c-structure category of a complement and the restriction is taken to
apply to the coordination node if the complement happens to be a
coordination construction.  But then, if we relied solely on the
coordination schema to pass the restriction down to the individual
conjuncts, we would get the wrong result.  Given the looseness of the
coordination schema, the restriction would not apply to the rightward
conjuncts of the coordination, predicting incorrectly that "Kim
became unhappy and talking to the family" would be good.

This issue is interesting because it may highlight some fairly basic
architectural differences between LFG and HPSG. I take it (correct me
if I'm wrong) that an important insight claimed for HPSG is that all
the various aspects of a sentence (phonology, phrase structure,
valence, content....) can (and should) all be represented in a
pretty-much uniform way, as a sign, and that unification (as
indicated by coindexed boxes) can be used to enforce dependencies
both within the substructures for different aspects and across them.
So it is relatively easy and straightforward to have a predicate
impose restrictions on a variety of properties that a complement
might have, without particularly distinguishing phrasal category from
other features like case, number, tense, etc.  The different
restrictions are all on an equal footing, with none of the aspects
having priority over any of the others.

If this roughly characterizes the HPSG architecture, then I would say
that LFG is organized in a quite different way.  The LFG architecture
tries to formalize a notion of modularity, dividing up the complex
set of syntactic dependencies into structures (trees and hierarchical
AV matrices) that have mathematically quite different properties and
which are described in quite different ways.  The c-structure trees
are described by context-free rules while the f-structures are
described for the most part by attribute-value equalities.  The other
key idea is that these two structures are related to each other by
what I've called a "structural correspondence", a many-to-one
function that maps the nodes of a tree into units of the f-structure.
It is this function (usually referred to as "phi") that provides an
interpretation for the up-arrows and down-arrows and allows the
constraints on the f-structure associated with a given tree to be
determined from the annotations on the c-structure rules and lexical
entries used in the tree's derivation.

The important point for the present discussion is that it is
impossible to state unification or equality constraints directly over
categories or substructure of the tree--the context-free notation
does not provide for such a capability.  And this correlates with the
fact that subcategorization (which involves a lexical predicate
imposing restrictions on an environment that is not strictly local in
the phrase-structure) is defined in terms of the grammatical
functions that may or may not appear in the f-structure that
phi-corresponds to the governing lexical tree-node.  The Completeness
and Coherence conditions require the corresponding f-structure to
contain all and only the functions selected by the predicate.
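For concreteness, the Completeness and Coherence check can be sketched
as a small predicate.  This is only a toy illustration under my own
simplifying assumptions: f-structures are encoded as flat Python
dictionaries, and the inventory of governable functions is just a
representative sample.

```python
# Toy sketch of Completeness and Coherence (illustrative only; real
# f-structures are richer than flat dictionaries).

GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "OBL", "COMP", "XCOMP"}

def complete_and_coherent(fstruct, selected):
    """Completeness: every function selected by the predicate is present.
    Coherence: no governable function appears that is not selected."""
    present = {attr for attr in fstruct if attr in GOVERNABLE}
    return present == set(selected)

# "become" selects SUBJ and XCOMP:
ok = complete_and_coherent({"PRED": "become", "SUBJ": {}, "XCOMP": {}},
                           ["SUBJ", "XCOMP"])
# An unselected OBJ violates Coherence:
bad = complete_and_coherent({"PRED": "become", "SUBJ": {}, "OBJ": {}},
                            ["SUBJ", "XCOMP"])
print(ok, bad)  # True False
```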

The predicate can impose further restrictions on its complements, but
these all have to be mediated by the grammatical-function assignment.
A predicate could assert that it takes a dative OBJ or a plural SUBJ,
because these are f-structure properties, but it can't say directly
that the NP to its right is dative or the NP to its left is plural.
In the first instance, the fact that different complements tend to be
realized by constituents of different categories follows from the
fact that the phrase-structure rules assign different functions to
different categories in different positions.  For example, an NP
cannot be the realization of a COMP function if there is no
phrase-structure rule that associates COMP with the NP category.  The
fact that SUBJ's and OBJ's can be realized as NP's is because
equations assigning those functions are associated with NP categories
in the c-structure rules.  The (at least on transformational movement
theories) surprising properties of the topicalized "think of" example
come from the fact that the topicalized position can assign OBJ or
OBLique to S-bars.

But what about the second instance, the situation where particular
predicates like "become" seem to impose categorial restrictions on
their complements that are more specific than the general phrase
structure rules provide for?  In the LFG literature there has been
some argument to the effect that many of what look like categorial
restrictions are actually restrictions of another sort, for example,
restrictions by semantic types (event, question, state...) that tend
to be realized by particular categories.  I think these arguments can
deal with a fair number of cases, but not necessarily all of them,
and the "become" vs "be" contrast seems to demonstrate the need for a
category-based account of at least some of the cases.

An obvious move would be to copy the category into the f-structure,
and then a simple f-structure constraint in "become" could be used to
restrict the type of complement.  This would both destroy the
representational modularity that we are striving for, and also cause
analyses of many other phenomena to break down.  The
problem comes from the many-to-one nature of the structural
correspondence:  it would no longer be possible for two nodes of
different categories (say V and VP) to map to the same
f-structure--the standard way in which LFG represents head chains and
propagates feature dependencies.  This would put us on the slippery
slope:  we would have to figure out new ways of classifying features
so that some of them propagate and some of them don't, introduce
conventions that are not presently needed for passing features up
head-chains,... In the end we might arrive at a theory isomorphic to
HPSG.

Many people, particularly those on this list, might regard that as an
appropriate set of moves, but that's not the direction that we have
explored.  Instead, we have examined solutions that maintain the
modularity of representation but exploit the underlying architecture
to allow certain cross-module constraints to be stated.  Thus, we do
not encode in the f-structure any c-structure properties such as
category, dominance or linear order, even though there may be some
lexical predicates that impose special constraints on the phrasal
configurations that can realize their complements.

Recall that an LFG representation consists of a c-structure, an
f-structure and a piecewise correspondence phi that maps nodes of the
tree into units of the f-structure, and that the correspondence
function is what enables us to generate descriptions of the
f-structures that correspond to particular trees.  Now consider a
particular c-structure, f-structure, and phi-correspondence. For a
given node n, phi(n) gives us the f-structure unit that that node
maps to.  But note that for a given f-structure unit f, phi^-1(f), the
inverse of phi, gives us the set of nodes in the tree of which f is
the image.  [phi^-1(f) is in general a set, not a singleton, because
phi is in general many-to-one.]  We now observe that the inverse
correspondence quite naturally induces new properties of an
f-structure that hold not by virtue of the features and values that
make it up but rather by virtue of the c-structure/phi configuration
that it is a part of.
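A minimal sketch of phi and its inverse, under a toy encoding where
both nodes and f-structure units are just string labels (the
particular node and unit names here are invented for illustration):

```python
# phi maps c-structure nodes to f-structure units; it is many-to-one,
# so e.g. a V and its VP mother can share one f-structure unit.
phi = {"VP1": "f1", "V1": "f1", "NP1": "f2"}

def phi_inverse(f, phi):
    """The set of tree nodes whose image under phi is f."""
    return {node for node, unit in phi.items() if unit == f}

print(phi_inverse("f1", phi))  # a set with two members, not a singleton
print(phi_inverse("f2", phi))  # here a singleton set
```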

For example, even though linear order is not a native property of an
f-structure representation, in a particular c-s/phi configuration we
can say that an f-structure f (functionally) precedes an f-structure g
iff all the nodes in phi^-1(f) precede (in the c-structure) all the
nodes in phi^-1(g).  This is the definition of functional-precedence
(f-precedence) that has been used since the 80's in the analysis of a
variety of different phenomena.
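Given linear positions for the tree nodes, the f-precedence definition
can be sketched directly.  (Again a toy encoding of my own: integer
positions stand in for c-structure linear order.)

```python
def f_precedes(f, g, phi, position):
    """f f-precedes g iff every node mapping to f linearly precedes
    every node mapping to g in the c-structure."""
    nodes_f = {n for n, u in phi.items() if u == f}
    nodes_g = {n for n, u in phi.items() if u == g}
    return all(position[a] < position[b] for a in nodes_f for b in nodes_g)

phi = {"NP1": "f1", "VP1": "f2", "V1": "f2"}
position = {"NP1": 0, "VP1": 1, "V1": 2}
print(f_precedes("f1", "f2", phi, position))  # True: NP1 precedes VP1 and V1
print(f_precedes("f2", "f1", phi, position))  # False
```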

The more relevant example for present purposes is the functional
category property (call it f-category).  (Note:  I don't mean
functional categories in the sense of DP, IP, etc.--I mean a way of
inducing an association of c-structure categories with f-structure
units by virtue of the inverse-phi correspondence).  Thus, we define
the f-category of an f-structure f as the set
	{c | c is the category of some node in phi^-1(f)}
We make it easy to state constraints on f-categories by defining a predicate
	CAT(f, cats)
that holds iff the f-category set of f is not disjoint from the
categories in cats.  Thus CAT(f, {NP AP}) would hold of an
f-structure f in a particular c-structure/phi configuration just in
case at least one of the nodes that maps to f is labeled with NP or
AP.
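Under the same toy encoding, with a category labeling for nodes, the
CAT predicate is just a non-disjointness test (node and unit names
invented for illustration):

```python
def CAT(f, cats, phi, category):
    """Holds iff the f-category of f shares at least one member with cats."""
    f_category = {category[n] for n, u in phi.items() if u == f}
    return not f_category.isdisjoint(cats)

phi = {"NP1": "f1", "N1": "f1"}
category = {"NP1": "NP", "N1": "N"}
print(CAT("f1", {"NP", "AP"}, phi, category))  # True: an NP node maps to f1
print(CAT("f1", {"PP"}, phi, category))        # False: no PP node maps to f1
```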

And now, back to the "become" example.  We would say that the lexical
entry for "become" contains
	(^ PRED)=`become<(^ SUBJ)(^ XCOMP)>'
	CAT((^ XCOMP), {NP AP})
The general c-structure rules would allow the open complement
function XCOMP to be associated with any of VP, PP, AP, NP, and this
would be appropriate for "be".  The CAT predicate added to "become"
would eliminate the ungrammatical examples "John became to go" and
"John became in the park".

What you see, then, is the natural way that the LFG correspondence
architecture allows for a lexical predicate to restrict the
categorial expression of its complement.  This preserves the
modularity of representation--it is accomplished without copying
features into the f-structure against the possibility that some
predicate might want to impose such a restriction.  But it also does
not extend the c-structure component beyond the power of context-free
description.  Just as for subcategorization, the categorial
restriction is mediated by the grammatical function assignments.

To close out the example, and to finish this lengthy essay, let's
consider how this interacts with coordination to account for
	"*Kim became a churchgoer and in love".
The c-structure for this is perfectly acceptable according to the
general coordination schema, which imposes categorial matching only on
the first conjunct.  Indeed, the c-structure schema would also allow the
equally ungrammatical
	"*Kim became in love and a churchgoer"
That's because the general VP expansion rule allows the XCOMP
function to be assigned to a PP as well as to an NP, and the
coordination schema then simply repeats on the left conjunct whatever
was chosen (freely, given that it realizes XCOMP) at the VP level.

The filtering is done by the CAT predicate in the coordination case
as well as the simple-complement case.  Coordinations in LFG are
represented in f-structure as sets containing the f-structures for
the individual conjuncts.  Many syntactic requirements are stated not
in terms of sets but in terms of simple attribute-value structures,
and those properties would not hold if they were applied to a
coordination set instead of a simple structure.  The LFG theory of
coordination extends the truth conditions of simple-structure
properties so that they hold of a set iff they hold of each element
of the set.  In this case, the CAT predicate distributes over the
elements of the coordination set, testing to be sure that each of the
conjunct f-structures is the image of either an NP or AP.  The bad
examples fail this test, no matter which order the conjuncts appear
in.
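The distribution of CAT over a coordination set can be sketched by
letting the check recurse into set-valued f-structures.  Again a toy
encoding of my own: a coordination is represented as a frozenset of
unit names, and the node names are invented.

```python
def CAT(f, cats, phi, category):
    f_cat = {category[n] for n, u in phi.items() if u == f}
    return not f_cat.isdisjoint(cats)

def CAT_distrib(f, cats, phi, category):
    """A simple-structure property holds of a set iff it holds of
    each element of the set."""
    if isinstance(f, frozenset):  # a coordination set of conjunct f-structures
        return all(CAT_distrib(e, cats, phi, category) for e in f)
    return CAT(f, cats, phi, category)

# "*Kim became a churchgoer and in love":
# the NP conjunct passes, the PP conjunct fails, so the set fails.
phi = {"NP1": "f_np", "PP1": "f_pp"}
category = {"NP1": "NP", "PP1": "PP"}
coord = frozenset({"f_np", "f_pp"})
print(CAT_distrib(coord, {"NP", "AP"}, phi, category))  # False
```

Since the test is over an unordered set, the result is the same no
matter which order the conjuncts appear in.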

There were other questions in the original discussion about the
"under the bed" examples and verbal gerunds.  I think I'll comment on those
examples in a later message--this one has gone on long enough.

I'll just finish by saying that for me, replacing THINK OF with TALK
ABOUT, as Carl suggested, doesn't help.  And to the extent that
making the examples longer and more complex seems to weaken the
judgments of ungrammaticality, I agree with Ivan that this is
probably a performance effect and that a clear account should come
from the psychological study of metalinguistic-judging behavior
(presumably judging behavior guided by either an LFG or HPSG
characterization of knowledge that would mark these as ungrammatical).

--Ron

(P.S.  There are discussions of correspondence-inverses, f-precedence
and f-category at various places in the literature, references
available on request.)



More information about the HPSG-L mailing list