Glue language notation thoughts

Richard S. Crouch crouch at parc.xerox.com
Sat Nov 16 02:17:55 UTC 2002


Hi Avery,

It's good to know that people are looking into glue implementations.

One of the problems with notation is that, unless it is really
difficult to use, you tend to prefer what you are used to.
In my experience of both implementing glue provers and writing
lexical entries, I have a strong preference for something as close as
possible to the `new', Curry-Howard glue notation.  It forces you to
remain conceptually clear about the strict separation between the
meaning language and the linear logic.  This makes
it much easier to debug the lexicon when for some reason you don't
get the semantic derivations you expected.  It also makes it much
easier to switch to a new meaning language (through use of macros) while
preserving the linear logic resource management.  I also find it
pretty readable.

Here are some examples, making liberal use of macros

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%
% VERBS
%%%%%%%%%%%%%%

macro( vtrans(P),
       % This defines a macro for constructors for transitive verbs.
       %  P          is the 2-place predicate
       %  :          separates the meaning language from the linear logic
       %  sig(^ obj) refers to the semantic projection of up's object
       %  $e         is an explicit typing on this resource
       %                e = entity,  t = truth-value
       %  -o         is linear implication
       %  sig(^)     refers to up's semantic projection

   P:  sig(^ obj)$e  -o  sig(^ subj)$e  -o  sig(^)$t
).


% application of vtrans macro to various semantic predicates.
% @ marks a macro call.
% Typically, these calls would be embedded in a larger lexical entry;
% here they are shown pulled out.

@vtrans(like).
@vtrans(hate).
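To make the macro idea concrete, here is a minimal Python sketch (my own
illustration, not the implementation discussed here) in which a meaning
constructor is represented as a (meaning, glue) pair and the vtrans macro
becomes a function that builds such a pair:

```python
# Illustrative sketch only: a meaning constructor as a (meaning, glue)
# pair.  The glue side is kept as a plain string; a real implementation
# would use structured terms for projections like sig(^ obj).

def vtrans(pred):
    """Expand the vtrans macro for a 2-place predicate `pred`."""
    meaning = pred
    glue = "sig(^ obj)$e -o sig(^ subj)$e -o sig(^)$t"
    return (meaning, glue)

# the macro calls from above:
like = vtrans("like")
hate = vtrans("hate")
```

The point of the pair representation is that the meaning side and the
glue side stay strictly separated, which is exactly what the notation
is meant to enforce.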

%%%%%%%%%%%%%%
% COMMON NOUNS
%%%%%%%%%%%%%%

macro( cn(P),
       % macro for common nouns
       % (sig(^) var)$e is the var of up's semantic projection,
       %                declared to be of type e

   P:  (sig(^) var)$e  -o  (sig(^) restr)$t
).

@cn(dog).
@cn(man).


%%%%%%%%%%%%%%
% DETERMINERS
%%%%%%%%%%%%%%

% This example shows how you might give alternative representations of
% quantifiers.   Note that the linear logic component remains constant

macro( det_glue,
     % macro for linear logic part of determiner entries
     %  &(H$t, <body>) indicates universal quantification over some
     %                 type t resource, H

     &(H$t,
           (( (sig(^) var)$e  -o  (sig(^) restr)$t )
             -o
             ((sig(^)$e -o H$t) -o H$t)
           )
      )
).

macro( gen_quant_det(Q),
     % macro for determiners under a generalized quantifier meaning lg
     %  &(H$t, <body>) indicates universal quantification over some
     %                 type t resource, H
     % This calls the det_glue macro to plug in the linear logic formula

     Q:  @det_glue
).


macro( fol_quant_det(Q),
     % Alternative meaning language for quantifiers, making bound
     % variable X explicit
     %  \R<expr>  indicates a lambda abstraction of R over <expr>

     \R\B (Q, X, R(X), B(X)) :  @det_glue
).

@gen_quant_det(most).
@fol_quant_det(forall).
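The same point in miniature, as a Python sketch (the function names and
the schematic glue string are my own invention, not the implementation):
two meaning languages for determiners can share one glue formula, so
switching meaning language leaves resource management untouched.

```python
# Illustrative sketch only: a meaning constructor as a (meaning, glue)
# pair, with the determiner glue kept as one schematic string.

DET_GLUE = "forall H$t. (restr-clause) -o ((sig(^)$e -o H$t) -o H$t)"

def gen_quant_det(q):
    """Generalized-quantifier meaning language: the meaning is just q."""
    return (q, DET_GLUE)

def fol_quant_det(q):
    """First-order style meaning, with the bound variable X explicit."""
    return ("\\R\\B. {}(X, R(X), B(X))".format(q), DET_GLUE)

# Swapping the meaning language leaves the glue side untouched:
assert gen_quant_det("most")[1] == fol_quant_det("forall")[1]
```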


%%%%%%%%%%%%%%
% ADJECTIVES
%%%%%%%%%%%%%%

macro( mod_glue(G),  G -o G).
       % general linear logic form of a modifier

macro( mfyee,  sig(adjunct e ^) ).
       % pick out what is being modified by some adjunct (the modifyee)


macro( intersective_adj(A),
  % this macro introduces a set of two meaning constructors
  % see Dalrymple (2000:266) for why adjectives lead to two constructors

  A: (sig(^) var)$e -o sig(^)$t



  \P\Q\X  P(X) and Q(X):
     ((sig(^) var)$e -o sig(^)$t)
     -o
     @mod_glue( (@mfyee var)$e  -o  (@mfyee restr)$t )
).

@intersective_adj(swedish).


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

I'm quite willing to concede that readability is in part a matter of
what you are used to.  But I don't think the above notation is too
bad. (To come clean, it's an idealization of what we currently have
implemented, which has various historical lumps and crevices).


I have a few questions and comments about your proposed notation.

First, it seems to have dispensed with the explicit typing of
resources as either type e or type t.  This information is actually
necessary, otherwise you'll find quantifiers scoping in all sorts of
places where they shouldn't, and taking type e resources as their
body. This is of course readily fixed, and I assume was left out for
presentational reasons.

Second, I take it that the principal motivation behind your notation was
to deal with `new' glue's "correlation between an arbitrary order of
lambda abstractions on the left, and the order of implications on the
right."  This correlated separation is in fact an important component
of the new glue.  For one thing, it makes it clear that there is
nothing magical about argument order, and that it really is arbitrary
whether, e.g., you consume subject arguments before object arguments
in a glue derivation, or vice versa.  (This is not to say that there
are not things like grammatical role hierarchies, only that they are
not significant to how lexical entries get written or used in a glue
derivation).

I think there are two dangers with notation like

   like(@f_sig, @g_sig) : h

(1) It blurs the separation between the meaning language and the
    linear logic glue.  This separation allows one to look at glue
    derivations as abstract objects in their own right, independently
    of the details of any meaning language.  This not only leads to
    more efficient and general implementations, but Ash Asudeh and I
    have also been looking at the structure of these derivations as a
    way of assessing semantic parallelism in ellipsis and coordination
    independently of particular meanings.


(2) The notation also tempts one into thinking that there really is
    some significance to the ordering of, say, subjects and objects in
    glue derivations.


The business of arbitrary argument ordering maybe needs spelling out
--- it certainly had me confused at first.  Let's take three possible
constructors for a transitive verb such as "likes", where we will use
subj and obj to represent the subject and object resources:

   like:  obj -o (subj -o f)

   \Y\X like(X,Y): obj -o (subj -o f)

   \X\Y like(X,Y): subj -o (obj -o f)

These three constructors are all inter-derivable given the standard
rules of inference for linear logic.  In other words, it doesn't
matter which one you write down.   To get the third constructor from the
second, the derivation goes as follows:

\Y\X like(X,Y): obj -o (subj -o f)     [y:obj]^1
----------------------------------------------
         \X like(X,y): (subj -o f)            [x:subj]^2
         ---------------------------------------------
                 like(x,y): f
          ------------------------ discharge assumption 1
          \y like(x,y): obj -o f
     ---------------------------------- discharge assumption 2
     \x\y like(x,y): subj -o (obj -o f)



To get 2 from 1

like:  obj -o (subj -o f)  [y:obj]^1
------------------------------------
     like(y): (subj -o f)           [x:subj]^2
     -----------------------------------------
            like(y)(x): f
       ---------------------------  discharge 2
        \x like(y)(x): (subj -o f)
   -------------------------------------  discharge 1
    \y \x like(y)(x): obj -o (subj -o f)

subject to the (standard) notational convention about functional
application that p(y)(x) can also be written as p(x,y).
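For what it's worth, both the reordering and the p(y)(x) = p(x,y)
convention can be mimicked on the meaning side with ordinary lambdas;
a small Python sketch (the entity names "john" and "mary" are just
illustrative):

```python
# Constructor 2 consumes the object first; constructor 3 the subject
# first.  Both yield the same meaning term once fully applied.
like2 = lambda y: lambda x: ("like", x, y)   # \Y\X like(X,Y)
like3 = lambda x: lambda y: ("like", x, y)   # \X\Y like(X,Y)

# The natural-deduction derivation above corresponds to a "flip":
flip = lambda f: lambda x: lambda y: f(y)(x)

# And the convention that p(y)(x) may be written p(x,y) is uncurrying:
def uncurry(p):
    return lambda x, y: p(y)(x)

assert flip(like2)("john")("mary") == like3("john")("mary")
assert uncurry(like2)("john", "mary") == like2("mary")("john")
```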


What all this reordering means is that semantic derivations can be
liberated from any concerns about where arguments fit into the meaning
representation. This makes things conceptually and implementationally
much cleaner.


The notation you give does not of course prevent this kind of
separation and re-ordering.  But it obscures it, without (I find) a
real increase in clarity.  In particular, if you were to perform glue
derivations on your lexical entries, you would probably find yourself
(i) separating them out into meaning and glue, (ii) doing
propositional linear logic inference on the glue side, and (iii)
assembling the meaning terms via the Curry-Howard isomorphism applied
to the glue derivation.  For debugging purposes, this would make it
hard to relate the actual (separated) derivations to the (mixed)
lexical entries that the grammar writer has provided.   And for
anything other than toy examples, debugging is important...
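If it helps to see steps (i)-(iii) in miniature, here is a deliberately
naive Python sketch (my own toy encoding, quantifier-free, and nothing
like a full glue prover): glue formulas are either atoms (strings) or
implications ((antecedent, consequent) tuples), propositional linear
inference is modus ponens that consumes both resources, and the meaning
side is rebuilt by functional application via Curry-Howard.

```python
def derive(premises):
    """Exhaustively combine an implication with a matching antecedent,
    consuming both premises (linear modus ponens).  Curry-Howard then
    says the meaning side combines by functional application."""
    agenda = list(premises)
    changed = True
    while changed:
        changed = False
        for i, (m1, g1) in enumerate(agenda):
            if not isinstance(g1, tuple):
                continue            # only implications can be applied
            ant, cons = g1
            for j, (m2, g2) in enumerate(agenda):
                if i != j and g2 == ant:
                    combined = ("{}({})".format(m1, m2), cons)
                    # consume both resources, add the combination
                    agenda = [p for k, p in enumerate(agenda)
                              if k not in (i, j)]
                    agenda.append(combined)
                    changed = True
                    break
            if changed:
                break
    return agenda

# "John likes Mary":  like : obj -o (subj -o f)
premises = [("like", ("obj", ("subj", "f"))),
            ("mary", "obj"),
            ("john", "subj")]
result = derive(premises)
```

Running this consumes all three resources and leaves the single
conclusion like(mary)(john) : f, with the meaning term assembled purely
from the shape of the glue derivation.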

Hope these comments (some of which are born of bitter experience) are
of some use.

Regards,

Dick Crouch

--------------------------------------------------------------------
Avery Andrews wrote:

Playing around with glue semantics notation (for the ultimate purpose of
including glue semantics in a new version of my Windows LFG system),
I'm wondering if people have looked into notational alternatives that
might be a bit more readable.  In particular, the 'new glue' notation has
a visually hard-to-parse correlation between an arbitrary order of
lambda abstractions on the left, and the order of implications on the
right; I think it would be an improvement to replace this with direct
reference to the semantic projections inside the meaning, so that for
example the constructor for 'like' might look like this, using
underscripting:

 like(X,    Y) : h
    f_sig  g_sig


Or in a typographically linear format, maybe something like this:

 like(@f_sig, @g_sig) : h

Or when we need to make multiple use of one f-structure reference:

 shave(@f_sig:X, X)

(I'm not sure if this would ever actually be useful).

The benefit is that the irrelevant linear order is eliminated from
the notation, and the relationship between the variable positions
in the semantic structure and the f-structure locations and semantic
types of their fillers is represented by the immediate and direct
one of typographical co-location.

An uninstantiated common noun meaning would look like this:

  dog(@(^_sig var)) : @(^_sig restr)

Quantificational determiners will still look horrible, but could
be improved by the use of macros:

 CN    = (^_sig var -o ^_sig restr)
 Q[V]  = (^_sig -o V)

So we would get:

 every(X, @CN(X), @Q[H_sig](X)) : H_sig

or:

 every(@CN, @Q[H_sig]) : H_sig


which aren't too bad.

And perhaps the explicit parameterization of the Q macro could be eliminated
with the use of a symbol, say, '*', designating whatever appears to the
right of the ':'.  Then we could improve quantifiers somewhat, and
do an operator adjective this way:

  alleged(@*) : f_sig var -o f_sig restr

The first meaning constructor for an intersective adjective
would look like this:

  Swedish(@^_sem var) : @^_sem

While the second one would look like this:

 lm X.@(^_sem var -o ^_sem)(X) and @*(X)
   : ((adj e ^)_sig var) -o ((adj e ^)_sig cond)

which isn't wonderful, but I think a bit more readable than (33) in
Dalrymple (2000:266).

The lambda abstraction could be dispensed with if there is appropriate
polymorphism for 'and', so that it can apply to type e->t as well as t in the
natural way, and the use of iofu could be eliminated if the constructor
were introduced by the c-structure rules rather than the lexicon:

  NP  ->    .... AP  .....

   @(v_sem var -o v_sem) and @*:@CN


I find it sort of intriguing how the * notation and the macro fit
together.

Well anyway I'd appreciate hearing if people working on implementations
have already figured out something to this general effect.

 - Avery Andrews
