ignore previous
Carl Pollard
pollard at ling.ohio-state.edu
Wed May 9 07:26:50 UTC 2001
This corrects an error in my previous posting, which should
be discarded.
(CAUTION: VERY LONG AND SOMETIMES TECHNICAL REPLY)
Hi Ron,
>
Finally, you commented that "at least the analog of `set' in some
interpretation of higher-order logic.
(For them to LITERALLY be sets would mean that the interpretation of the
grammar logic was a Henkin model, but that is another story.)"
Actually, my intention in setting up LFG was to use the simplest and
most naive mathematical structures I could think of,
>>
I think in formal theorization about ANYTHING, one ALWAYS tries to use
the simplest math possible - you only go to harder math if you can't
get the easier math to work. Conversely, if you see how to get easier
math to work, you go for it. The latter, in my opinion (which Drew
Moshier is sure to disagree with) is exemplified by HPSG moving from
an early 1980's `partial information' `unification-based' formalism
based on Scott domains (Pereira and Shieber, Moshier, Rounds and
Kasper) and intuitionistic logic to the mid-to-late 1980's
`total-object' `constraint-based' formalism based essentially on
partial algebras (Smolka, Johnson, King). If LFGs were written in a
model-theoretically interpreted formal language (maybe they have been
and I just didn't know), then maybe it would be appropriate to say that
this is what LFG was doing all along (but, given the way that
minimal models come into LFG, maybe not, since this requires essentially
thinking of f-structures with different degrees of instantiation as
ordered by subsumption, which fits better with the unification-based,
intuitionistic setup).
>
and indeed, I
really did intend that LFG sets would be, literally, the simple sets of
ordinary ZF set theory.
>>
Do you still? When I read Mary's and your recent paper, I had reason
to wonder whether what you referred to as `LFG's sets' really were
sets simpliciter. Here are some examples of the kinds of concerns I
had. On p. 770 (discussion around (45) and (46)) the logic seems to be
inconsistent if the things you call sets are LITERALLY sets. For let P
be the property
P \def= lambda x ((w CASE) = x).
Then for s = {ACC,GEN}, you would have
P(s) iff \forall f \in s. P(f).
i.e.
(w CASE) = {ACC,GEN} iff [((w CASE) = ACC) \land ((w CASE) = GEN)]
But the left-hand side of the latter biconditional is true while the
right-hand side is false. This made me wonder whether there is
something going on with `LFG sets' that makes them unlike ZF sets.
Or maybe the lambda-term I used is not an acceptable LFG property?
(This issue would be clarified if LFGs were actually written
in lambda calculus.)
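Just to fix ideas, here is a toy rendering of that argument in Haskell
(my own encoding, nothing official), with the set value taken literally:

    data Case = ACC | GEN deriving (Eq, Show)

    -- Feature values: atoms, or literal sets of atoms (lists standing
    -- in for ZF sets).
    data Val = Atom Case | SetVal [Case] deriving (Eq, Show)

    wCase :: Val
    wCase = SetVal [ACC, GEN]           -- (w CASE) = {ACC,GEN}

    p :: Val -> Bool                    -- P = lambda x ((w CASE) = x)
    p x = wCase == x

    main :: IO ()
    main = do
      print (p wCase)                    -- True: left-hand side
      print (all (p . Atom) [ACC, GEN])  -- False: right-hand side

The two printed values differ, which is exactly the inconsistency.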
Likewise, on p. 772 (discussion around (55) and (56)) the logic again
seems inconsistent if what you are talking about are real set-theoretic
sets. We have
(f SUBJ CLASS) \in {7/8,9/10}.
So by (35) (p. 767)
either (f SUBJ CLASS) = 7/8
or (f SUBJ CLASS) = 9/10
Let s denote the set value of (f SUBJ). Then in the former case, we
have
(s CLASS) = 7/8.
Let P be the property
P \def= lambda x ((x CLASS) = 7/8)
Now by (44) (p. 769),
P(s) iff \forall f \in s (P(f)).
But s = {h,e}, so
(s CLASS) = 7/8 iff
(h CLASS) = 7/8 and (e CLASS) = 7/8.
But (e CLASS) = 9/10, so (s CLASS) \neq 7/8.
By a similar argument, (s CLASS) \neq 9/10. Contradiction.
>
And the f-structures are naive tabular
functions, finite sets of ordered pairs
>>
I understand that the domains of these functions are sets of attribute
names. What are their codomains? (If you were specific about
answering this, then f-structures would be members of (labelled)
cartesian products, and the attributes themselves would be the
projections associated with the products. If you are not specific
about answering this, then I remain a little bit unclear about exactly
what an f-structure is.)
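Just so you can see what I am asking for, here is the kind of thing I
have in mind, sketched in Haskell with invented attributes and codomains
(an illustration only, not a proposal about LFG's actual inventory):

    -- Hypothetical codomains for three attributes.
    data CaseVal = Nom | Acc | Gen
    data NumVal  = Sg | Pl
    newtype PredVal = PredVal String

    -- An f-structure as a labelled cartesian product: the fields are
    -- the attributes, the field selectors are the projections.
    data FStructure = FStructure
      { caseAttr :: CaseVal
      , numAttr  :: NumVal
      , predAttr :: PredVal
      }

    -- The attribute CASE, literally the projection onto its codomain.
    caseOf :: FStructure -> CaseVal
    caseOf = caseAttr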
>
I think several years ago, when Ivan and I were doing the LFG/HPSG
comparison course, we had a discussion about one difference in the
underlying models, noting that in LFG equivalent structures are
considered to be equal, whether or not they are asserted to be equal,
whereas this is not true of HPSG models.
>>
That is true, although I also recall from that discussion that there
were non-public ways in LFG to make equivalent-looking structures
unequal, essentially by `uniquifying' their semantic forms (i.e.
the semantic forms are unequal but structurally identical). I don't
see this as an important issue now, since the really crucial use HPSG
makes of having structurally isomorphic but distinct entities is with
`nominal indices' (roughly, logical variables in semantic
representations, which can either be operator-bound or else free and
`anchored' by the context of interpretation).
On the approach I am pursuing now, this sort of stuff would no longer
be in the syntactic representation, so there is no obstacle to
treating at least the morphosyntactic features via labelled
products.
So this is a move toward being more like LFG (at least how I think LFG
would be if the question I raised above were answered). However, on
this approach the grammatical-function attributes are treated via the
exponential constructor, not as projections of products, a move away
from both LFG and HPSG and toward categorial grammar. (Except that the
type theory is in the Curry-Church-Henkin tradition, not Lambek's
syntactic calculus.) Set values are just treated in terms of
powertypes (2^{A}), which is just as you (I mean you, Ron) want it.
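In a Haskell-like notation (placeholder types, all names mine), the
three moves look like this:

    data NP = NP   -- placeholder saturated categories
    data S  = S

    -- Morphosyntactic features via a labelled product.
    data Agr = Agr { person :: Int, number :: Int }

    -- A grammatical function via the exponential constructor: a VP is
    -- interpreted as a literal function from (here) NP subjects to S.
    type VP = NP -> S

    -- A set value as a powertype 2^{A}: characteristic functions on A.
    type Power a = a -> Bool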
>
I'm not sure whether I should be happy or sad if this means that we are
interpreting in a Henkin model.
>>
I think you should be happy if you can really get by with a Henkin
model, because it means you are sticking with your original goal of
doing everything with just sets: the labelled products are just
labelled cartesian products of sets and the exponentials (if you make
recourse to them) are just subsets of familiar set-theoretic `function
spaces' (sets of the form A^{B}). For example, `functor' (unsaturated)
categories like VP are interpreted LITERALLY as functions, not just as
arrows (categorical generalizations of functions) in some
`nonconcrete' category. It is not that models of higher-order logic
that are NOT Henkin models are inherently inferior, but they are much
harder to sell to the typical linguistic consumer who is not a
formalism freak.
>
It may be that we can get by with
simpler models (I presume the Henkin models are simpler than the
alternatives)
>>
Again, not necessarily simpler, just more familiar. In some respects
the topoi freely generated by the theoretical primitives are mathematically
simpler than Henkin models (their logics have simpler axioms), but
you don't have time in a twenty-minute paper to explain that to
a group of empirically-minded folk who are just trying to understand
how your latest theory of feature indeterminacy and coordination of
unlikes works.
Actually I can't tell whether you are REALLY using Henkin models
unless you express LFG metatheory (at least the functional part) in a
typed lambda calculus (i.e. give a formal syntax for grammars
corresponding to the informal lambda terms I used in my comments above
on Dalrymple and Kaplan). Especially, as I said, I'm unclear about how
the codomains of the attributes are specified, and I'm unsure what
brand of type theory would be truest to your intentions in terms of
how minimal models come into it.
>
for feature structures because we've divided the
descriptive work in different ways--some recursion is in the
c-structure, for example.
>>
This is a fundamental issue that I am dealing with right now: how much
goes into the syntactic type theory and how much goes into the
labelled deductive system that provides the interface between the
category system and phonology and semantics.
>
But maybe I just don't understand the
problem. Can you say more about this?
>>
I have tried to say some about it above, but let me say a bit more
here. [NON-(FORMALISM-HACKERS) ARE ADVISED TO STOP READING HERE IF
YOU HAVE NOT ALREADY DONE SO.] The following is based on in-progress
unpublished joint research with Drew Moshier.
If you write the syntactic/functional/whatever-you-call-it part
of your grammar of English in a typed lambda calculus that contains
constants and type-constructors corresponding to theoretical
primitives, then it is not known in advance whether the canonical
model of that theory IS a Henkin model. (For the reasons I mentioned
above, we HOPE it is, but you still have to prove it.) The problem,
roughly, is that the type theory generates an infinite number of
types, and we would like to be able to derive from the grammar a
simultaneous recursive definition of an infinite number of sets, one
for each type, a nontrivial task already; but then on top of that
there is the further question of whether the model based on these sets
really is canonical or whether the theory of it contains formulas
not deducible from the grammar.
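For concreteness, the freely generated type theory itself is just an
inductive definition, as in this Haskell sketch (constructor names
mine); writing it down is trivial, and the nontrivial part is the
simultaneous recursive definition of the sets its infinitely many
types denote:

    -- The types of the grammar logic, freely generated from primitives
    -- by finite product, finite coproduct, and exponential.
    data Ty
      = Prim String    -- e.g. Prim "S", Prim "NP"
      | Prod Ty Ty     -- A x B
      | Coprod Ty Ty   -- A + B
      | Exp Ty Ty      -- Exp a b, read as B^{A} (functions from A to B)
      deriving (Eq, Show)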
If you are just dealing with finite named products (e.g. for
describing the system of parts of speech and morphosyntactic features)
then there is no serious problem, but once you treat grammatical
functions via exponentials (as in categorial grammar) rather than as
projections (as in HPSG/LFG), things can get tricky pretty
quickly. For example, imagine a toy language whose only (fully
specified) saturated categories are S, NP, and expletive IT and
THERE. Suppose there are no complementizers, so that all of S, NP, IT,
and THERE could be subjects. Then how do we represent the concept VP
in this type system, assuming that finite product, finite coproduct,
and exponential are the only type constructors? Well, as a first pass
we could say: let SUBJECT be the coproduct (in a Henkin model, just
disjoint union)
(1) SUBJECT \def= S + NP + IT + THERE
and then define VP as the exponential type
VP \def= S^{SUBJECT}.
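As a Haskell-style sketch (placeholder categories, constructor names
mine), this first pass is unproblematic:

    data S = S; data NP = NP; data It = It; data There = There

    -- (1): SUBJECT as a coproduct (in a Henkin model, a disjoint union).
    data Subject = SubjS S | SubjNP NP | SubjIt It | SubjThere There

    -- VP as the exponential S^{SUBJECT}.
    type VP = Subject -> S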
But what if VPs can also be subjects? Then evidently we can't
just *DEFINE* SUBJECT as a coproduct type; we would have to introduce
SUBJECT as a primitive type and make the type equation
(2) SUBJECT = S + NP + IT + THERE + S^{SUBJECT}
part of the grammar. This cannot be handled directly in Church-Henkin-Gallin
style higher-order logic, but it can in the HOL of Lambek-Scott 1986,
which does not require that the type theory be freely generated by the
type constructors over the primitive types.
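(A language like Haskell, for what it is worth, accepts the analogue
of (2) directly, precisely because its types are domain-theoretic
rather than freely generated ZF sets; the sketch below is in that
spirit, and the Lambek-Scott setting is what licenses the same move
in HOL.)

    data S = S; data NP = NP; data It = It; data There = There

    -- (2): SUBJECT recurs under the exponential. Legal here because
    -- these types are solved recursive (domain) equations, not
    -- inductively generated sets.
    data Subject
      = SubjS S
      | SubjNP NP
      | SubjIt It
      | SubjThere There
      | SubjVP (Subject -> S)   -- the S^{SUBJECT} cofactor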
It gets worse. Consider the following toy `data' from this language,
which has no morphosyntactic features, but does have expletives and
raising to subject. (For familiarity suppose the language is SVO, but
I'm not going to say anything about how the syntax-phonology interface
works.)
(3)a. Kim rambles.
b.*It rambles.
c.*There rambles.
d.*Kim rambles rambles.
e.*rambles rambles.
(4)a. Kim bothers Sandy/*it/*there/*Chris rambles/*rambles.
b.*It bothers Sandy. (Remember IT is expletive, not a definite
pronoun.)
c.*There bothers Sandy.
d. Kim rambles bothers Sandy. (English translation: That Kim rambles
bothers Sandy.)
e. Rambles bothers Sandy. (English: To ramble bothers Sandy.)
(5)a.*Kim bothers Sandy Chris rambles.
b. It bothers Sandy Chris rambles. (Remember, no complementizers.)
c.*There bothers Sandy Chris rambles.
d.*Kim rambles bothers Sandy Chris rambles.
e.*Rambles bothers Sandy Chris rambles.
f. It bothers Sandy rambles. (English: It bothers Sandy to ramble.)
None of these verb (phrases) actually has the category VP. Rather:
(6)a. rambles: S^{NP}
b. bothers: S^{(NP+S+VP) x NP}
c. bothers: S^{IT x NP x (S+VP)}
That is, RAMBLES is an ordinary intransitive that takes an NP subject,
and BOTHERS has two subcategorizations, one as a transitive whose
subject can be either an NP or an S or a VP, and another with an
IT subject that extraposes an S or VP argument.
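In the same Haskell-flavored notation (dummy bodies, invented names;
only the types are the point):

    data S = S; data NP = NP; data It = It
    type VP = NP -> S   -- crude stand-in; see below for why no single
                        -- VP type is really adequate

    rambles :: NP -> S                              -- (6)a: S^{NP}
    rambles _ = S

    bothers1 :: (Either NP (Either S VP), NP) -> S  -- (6)b: S^{(NP+S+VP) x NP}
    bothers1 _ = S

    bothers2 :: (It, NP, Either S VP) -> S          -- (6)c: S^{IT x NP x (S+VP)}
    bothers2 _ = S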
But now consider:
(7)a. It bothers Kim that Sandy rambles bothers Dana. (English:
It bothering Kim that Sandy rambles bothers Dana.)
b.*Bothers Kim that Sandy rambles bothers Dana.
c. There is unicorns rambles bothers Dana. (English: There being
unicorns rambling bothers Dana.)
d.*Is unicorns rambles bothers Dana.
The point is that not just any VP can be the theme argument of BOTHERS;
it has to be one whose missing subject is not an expletive. So the
lexical entry in (6)b is wrong, since the VP cofactor should really
have been
(8) S^{NP+S+VP}.
But even that isn't quite right, because the `VP' occurrence in (8)
really shouldn't have been VP either, and for the same reason. Once
you start running into this kind of recursion it gets harder to give
recursive definitions of the sets corresponding to the types (the
homsets Hom(1,A)) or even to write the type specifications for
particular lexical entries. I'm not saying it is impossible, but the
math is no longer simple country boy math (to use a phrase of yours).
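To see the shape of the recursion, here is one way the repair might be
sketched in Haskell (names mine): the theme cofactor has to be,
recursively, a VP over non-expletive subjects.

    data S = S; data NP = NP

    -- Non-expletive subjects: NP + S + (VPs over non-expletive subjects).
    data GoodSubj = GsNP NP | GsS S | GsVP GoodVP

    -- (8) and all its recursive refinements at once: S^{NP+S+GoodVP}.
    type GoodVP = GoodSubj -> S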
For one last example of the kinds of complexities that can arise,
consider a raising-to-subject verb like TENDS which in HPSG1 has the
category (ignoring morphosyntactic features)
(9) tends: [HEAD verb, SUBCAT <[1], VP[SUBCAT <[1]>]>]
If we try to imitate this naively in higher-order logic we get
(10) S^{SUBJECT x S^{SUBJECT}}
but this fails to capture the analytic insight that the matrix and
embedded VP have the SAME subject.
We can get around this problem by
positing a syntactic entity TENDS1 (that is not provided with a
phonological interpretation by the syntax-phonology interface)
(11) TENDS1 \def= S^{S}
and then give the type of the word tends as
(12) \Pi_{X\in A} \lambda <x\in X, y\in S^{X}>.TENDS1'(y(x))
where, naively, A is the set of all coprime (not decomposable
into a nontrivial coproduct) types which occur as cofactors
when we expand out the type equation
(2) SUBJECT = S + NP + IT + THERE + S^{SUBJECT}
as
(2') SUBJECT = S + NP + IT + THERE + S^{S} + S^{NP} + ...
but (12) is a dependent product, which is not supported by standard
higher-order logic. Hopefully there is a minimal extension of
higher-order logic including such products that does not mean moving
into a foundation with full parametric types (which would be bad,
because the models of such type theories are not set-theoretic, at
least not as long as you keep the logic classical).
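For comparison, in a language with parametric polymorphism the content
of (12) is just a polymorphic type, with the shared type variable
enforcing the same-subject insight; but full parametricity is exactly
the foundational move I said I would rather avoid:

    data S = S

    tends1' :: S -> S   -- TENDS1 : S^{S}; dummy body, only the type matters
    tends1' s = s

    -- (12), polymorphically: x is shared between the subject and the
    -- complement VP, so they are forced to match.
    tends :: (x -> S, x) -> S
    tends (y, x) = tends1' (y x)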
It would not have worked to give the word tends the simple-looking
type
(13) \lambda <x\in SUBJECT, y\in S^{SUBJECT}>.TENDS1'(y(x))
because then the only VPs that could be complements of
raising-to-subject verbs would be ones that could take ANY kind of
subject. (And there is no way to coerce a verb that is fussy about its
subject into one that isn't, because the exponential functor is
contravariant in its `antecedent' (for categorial grammarians, `below
the slash') argument.)
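(A short Haskell illustration of the contravariance, with placeholder
types: a permissive verb restricts to a fussy one by precomposition
with the injection, but nothing goes the other way.)

    data S = S; data NP = NP; data It = It
    data Subject = SubjNP NP | SubjIt It

    -- From S^{SUBJECT} to S^{NP}: precompose with the injection.
    restrict :: (Subject -> S) -> (NP -> S)
    restrict vp = vp . SubjNP

    -- There is no total map (NP -> S) -> (Subject -> S): a verb that
    -- demands an NP subject has nothing to say about SubjIt.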
This is doubtless more than you wanted to hear, and perhaps the way
LFG factors the recursion is such that the problems I raise here do
not come up when you try to formalize LFG syntax type-theoretically.
But I have become convinced that linguistically it is better to
treat valence as exponentials rather than projections, so I am
stuck with these problems for the moment.
Carl