Prevailing approaches do not have a computational lexicon
Mark Johnson
Mark_Johnson at Brown.edu
Thu Oct 10 13:46:13 UTC 2002
Hi Carl,
Thanks for the quick, detailed response! Somebody should write a short
monograph recording all of these ideas!
If I can try to summarize the idea at a high level, I think it's that
the formal mechanisms of the grammar must somehow distinguish substrings
(phrases) that are neutral or "overspecified" with respect to some
feature (like "Frauen") from substrings that are syntactically ambiguous.
One way to do this is to keep the feature mechanisms separate from the
combinatory syntactic mechanisms, which is what LFG does. In LFG one
can't combine the features from two different parses of a substring, so
there is simply no way for whatever mechanism is used to handle
featurally neutral or overspecified forms (be it set-valued features in Ron
and Mary's analysis, or features as resources as in R-LFG) to be
confused with syntactic ambiguity. That is, the problem simply does not
arise in theories like LFG.
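To make that concrete, here is a minimal sketch in Python. It is a toy model, not LFG or R-LFG themselves: each parse of a string carries its own feature bundle, CASE is modelled as a set of values, and the names `satisfies` and `string_satisfies`, as well as the ACC/DAT values assigned to "Frauen" and to the made-up homophone, are assumptions for illustration only.

```python
# Toy model (hypothetical): each parse of a string has its own feature bundle.
# CASE is a set: a singleton for an ordinary form, a larger set for a neutral one.

def satisfies(parse, required_cases):
    """One parse meets the requirements only if its own CASE set
    contains every required value."""
    return set(required_cases) <= parse["CASE"]

def string_satisfies(parses, required_cases):
    """Requirements are checked one parse at a time; features from
    different parses are never pooled."""
    return any(satisfies(p, required_cases) for p in parses)

# Assumed for illustration: a neutral form such as "Frauen" has a single
# parse whose CASE value covers both ACC and DAT.
neutral = [{"CASE": {"acc", "dat"}}]

# A made-up ambiguous string with two unrelated parses, one ACC and one DAT.
ambiguous = [{"CASE": {"acc"}}, {"CASE": {"dat"}}]

print(string_satisfies(neutral, ["acc", "dat"]))    # True
print(string_satisfies(ambiguous, ["acc", "dat"]))  # False
print(string_satisfies(ambiguous, ["acc"]))         # True
```

Because the check never looks at more than one parse at a time, the ambiguous string cannot pass for a neutral one, which is the point about keeping the two mechanisms apart.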
Of course, one can argue that it is intellectually cleaner to have a
single logical system that incorporates all of the grammar, rather than
having distinct but coupled mechanisms the way that LFG does. (These
days I'm agnostic on this -- I think that the proof of the pudding is in
the eating -- I'm in favour of whatever leads to the greatest insights
into language and processing, and I think that there's little to be
gained by arguing that one approach is a priori better than another).
Anyway, in such a "single logic approach", the fact that neutral or
overspecified phrases behave differently to ambiguous substrings shows
that they must be given different analyses somehow. There are several
ways of doing this, but the ones you and I have discussed here all boil
down to saying that there are different kinds of binary "conjunction"
operators; one is used to describe string ambiguity (i.e., this string
belongs both to category C1 and C2), and a different one is used to
describe neutral or overspecified categories (this phrase can
simultaneously satisfy requirements for features F1 and F2). I think
this is just an instance of what Moortgat and Oehrle would call a
multi-modal logic (here, a logic with several kinds of Boolean
connective, each of which behaves like conjunction).
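The reason a second operator is needed can be sketched in a few lines of Python. This is a toy under stated assumptions, not anyone's actual formalization: categories are atoms or tuples of the form `("meet", X, Y)`, the single conjunction is read as intersection of string sets, and the helper names `atoms` and `denotations` are invented for the example.

```python
def atoms(category):
    """Atomic conjuncts of a category; ("meet", X, Y) is flattened."""
    if isinstance(category, str):
        return {category}
    op, left, right = category
    assert op == "meet"
    return atoms(left) | atoms(right)

def denotations(lexicon):
    """Smallest assignment of string sets to atoms that makes every
    lexical entry true when "meet" is read as intersection."""
    model = {}
    for string, category in lexicon:
        for atom in atoms(category):
            model.setdefault(atom, set()).add(string)
    return model

# Ambiguity: two separate lexical entries for the same string.
ambiguous_lexicon = [("foo", "A"), ("foo", "B")]

# Neutrality: one lexical entry with a conjoined category.
neutral_lexicon = [("foo", ("meet", "A", "B"))]

# With only one intersection-like conjunction the two lexicons
# denote exactly the same string sets.
print(denotations(ambiguous_lexicon) == denotations(neutral_lexicon))  # True
```

Since the two lexicons come out identical, a single intersection-style conjunction cannot tell ambiguity from neutrality; some further distinction (a second connective, or finer-grained prosodic objects) has to carry it.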
Best,
Mark
Carl Pollard wrote:
>Hi Mark,
>
>
>
>>Hi Carl,
>>
>>Is the problem as follows? Suppose a substring is ambiguous; it has one
>>analysis as an NP ACC, and another completely different one as an NP
>>DAT. In a type-logical system or similar in which the feature structure
>>logic (specifically, disjunction and conjunction) is tightly integrated
>>with the c-structure (so to speak) so that there's only really one
>>logic, one would then be able to prove that the string also derives the
>>"over-specified" category NP DAT /\ ACC. But unfortunately ambiguous
>>strings don't behave like the lexically neutral forms "Frauen", showing
>>that the "/\" cannot be regular logical conjunction. (But it still
>>could be a different binary operator in a multi-modal logic, couldn't it?)
>
>
>
>Let me restate the problem. Suppose you are using a logic of syntactic
>types with type constructors \meet and \join such that the "prosodic
>interpretation" of these constructors is intersection and union in a
>set of possible prosodic entities (usually this set is taken to be a
>free monoid whose generators are thought of as prosodic words, but
>this is inessential). Suppose also that you try to distinguish
>ambiguity from neutrality/syncretism in the following way: your way of
>saying that the prosodic entity foo is ambiguous between being an A
>and being a B is to give two lexical entries <foo, A> and <foo, B>;
>whereas your way of saying that foo is neutral between being an A and
>being a B is to give one lexical entry <foo, A \meet B>.
>
>The problem is that these have the same prosodic interpretation, namely
>that foo is in the intersection X \intersect Y where X is the prosodic
>interpretation of A and Y is the prosodic interpretation of B.
>
>The Whitman/Morrill solution is to say that in the case of ambiguity,
>they weren't really both foo after all, but one was foo1 and the other
>foo2, that is, the identity criterion for prosodic structures is
>stronger than mere homophony.
>
>
>
>>The reason why I think that LFG and R-LFG don't suffer from this is that
>>each distinct c-structure is associated with its own f-structure; an
>>f-structure is formed from the constraints from exactly one c-structure,
>>so this merging of features from different c-structures simply cannot occur.
>>
>>Have I got it right (more or less)?
>
>
>
>That depends how you analyze neutrality. How does that work in R-LFG?
>If I remember right how Mary and Ron did it, if you had a homophonous
>pronoun that was ambiguous (but not neutral) between nominative and
>accusative, then the lexical entries would have distinct F-structures
>with different CASE values, but if you had a pronoun that was neutral
>between accusative and genitive, then there would be just one lexical
>entry whose F-structure was a set (or maybe just its case value is a
>set). That sounds pretty close to what you said above, doesn't it?
>
>
>
>>in fact I think that LFG
>>and R-LFG don't suffer from it precisely because c(onstituent)-structure
>>isn't integrated into the logic of features (i.e., there isn't a single
>>"logic" of all of LFG, but instead it consists of a heterogeneous
>>collection of different but coupled "logics").
>
>
>
>That sounds right too. In standard TLG the problem is that the
>prosodic logic is too much like the syntactic logic: they are
>connected by an algebraic homomorphism (it is a residuated lattice
>homomorphism, where the two residuated lattices are the Lindenbaum
>algebra of the type logic (the domain) and the powerset of a free
>monoid (the codomain)).
>
>The original critique of this setup goes all the way back to a 1961
>paper by Haskell Curry. His view was essentially that Lambek was
>conflating two levels of structure that Curry thought should be
>clearly distinguished: what he called phenogrammar and tectogrammar.
>David Dowty borrowed these terms into his version of categorial
>grammar in his paper given at the 1989 Tilburg conference on
>discontinuous constituency, and I think both Mike Reape and Andreas
>Kathol also used these terms in their systems (Mike's was more like
>CG, Andreas' a kind of HPSG). I remember suggesting to Ron once that
>the distinction in LFG between c-structure and f-structure was similar
>to Curry's distinction, but as I recall he didn't think so.
>
>Sometimes you see favorable citations of Curry's paper in the TLG
>literature, as if he were an advocate of something like TLG, but I
>think that is a misreading. However, Curry's own proposal still used
>type theory for the tectogrammar -- traditional (i.e. Curry's) type
>theory of course, not Lambek's.
>
>There's a simple way to apply these ideas to get a type theory that is
>reminiscent of LFG. I'd like to go back and look at R-LFG and see if
>it is similar in this respect. The basic idea is that the analogs
>of TLG's functional types are types of the form
>
> [F1 [], ..., Fn []] => [G1 [], ..., Gm []]
>
>where the Fi are things like SUBJ and the Gj are things like CASE. The
>things on both sides of the arrow are labelled products (i.e. static
>record types) and the arrow is the standard cartesian (not linear)
>exponential (intuitionistic implication under the Curry-Howard
>isomorphism, so grammatical functions are contravariant but inherent
>features are covariant). These formulas are types; the analogs of
>actual f-structures are terms (proof encodings) of these types. You
>also bring in coproduct for disjunctive selection, and a primitive
>Bool type, for a type-logical analog of "set values" (that is, the
>analog of "set of A's" is the powertype Pow(A) \def= A => Bool). The
>resulting logic is essentially Lambek and Scott's (1986) higher-order
>intuitionistic logic. (So the models are toposes.)
>
>This kind of type logic has all lambda-definable subtypes, so
>in particular analogs of intersection and union are definable
>(but the powertypes are only Heyting algebras, not Boolean unless
>you add a Boolean axiom). In this setup, a term that is ambiguous
>between A and B is of type A x B, but a term that is neutral between
>A and B is of type A \intersect B (the intersection is actually
>defined as a subtype of the coproduct A + B). A term that
>selects ambiguously for A or B has an implicative type whose
>antecedent is Pow(A) + Pow(B) (coproduct of powertypes); whereas
>a term that selects neutrally for A or B has a type whose
>antecedent is Pow(A + B) (powertype of coproducts).
>
>I suspect what Mary and Ron did is closely related to what I just
>described, implemented in sets, but the logic of it is much easier to
>grasp if you use the => to divide f-structures into their covariant
>and contravariant parts.
>
>So the tectogrammar is based on a type logic, but not a resource
>sensitive one. The appearance of resource sensitivity comes from the
>way the grammar (think of something like a Montague grammar or a CCG)
>specifies which triples <p, s, m> are "in", which is a kind of
>labelled deductive system. Here, as expected, p is a prosodic entity
>and m is a meaning (or a term in a semantic lambda calculus); however
>s is not a syntactic type, but rather a syntactic term.
>
>Carl
>
>
>