Question formal status of trees in H&GPSG

Ken Shan ken at digitas.harvard.edu
Fri Jun 27 20:25:48 UTC 2003


In this message, I explain the following slogan, then use it to clarify
(I hope) the formal status of trees in H&GPSG.

    Formal objects, such as trees in linguistics or numbers in
    mathematics, are individuated through and only through the
    operations available on them.

To take a non-linguistic example first, consider complex numbers in
mathematics.  I quote Reynolds (1983):

    Once upon a time, there was a university with a peculiar tenure
    policy.  All faculty were tenured, and could only be dismissed for
    moral turpitude.  What was peculiar was the definition of moral
    turpitude: making a false statement in class.  Needless to say,
    the university did not teach computer science.  However, it had a
    renowned department of mathematics.

    One semester, there was such a large enrollment in complex variables
    that two sections were scheduled.  In one section, Professor
    Descartes announced that a complex number was an ordered pair
    of reals, and that two complex numbers were equal when their
    corresponding components were equal.  He went on to explain how
    to convert reals into complex numbers, what "i" was, how to add,
    multiply, and conjugate complex numbers, and how to find their
    magnitude.

    In the other section, Professor Bessel announced that a complex
    number was an ordered pair of reals the first of which was
    nonnegative, and that two complex numbers were equal if their first
    components were equal and either the first components were zero or
    the second components differed by a multiple of 2pi.  He then told
    an entirely different story about converting reals, "i", addition,
    multiplication, conjugation, and magnitude.

    Then, after the first classes, an unfortunate mistake in the
    registrar's office caused the two sections to be interchanged.
    Despite this, neither Descartes nor Bessel ever committed moral
    turpitude, even though each was judged by the other's definitions.
    The reason was that they both had an intuitive understanding of
    type.  Having defined complex numbers and the primitive operations
    upon them, thereafter they spoke at a level of abstraction that
    encompassed both of their definitions.

The moral of this fable, for my purposes, is that the effective content
of a formal object is determined exactly by what operations on it a
theory makes available.  In the case of complex numbers, the following
operations are available (among others).  (I assume that we already know
what real numbers and truth values are.)

    OPERATION       INPUT                   OUTPUT
    ---------       -----                   ------
    equal           <complex>, <complex>    <truth value>
    convert from    <real>                  <complex>
    "i"             none                    <complex>
    add             <complex>, <complex>    <complex>
    multiply        <complex>, <complex>    <complex>
    conjugate       <complex>               <complex>
    magnitude       <complex>               <real>

Any set of objects with these operations on them (satisfying certain
laws) is as good as any other as an answer to the question "what are
complex numbers?".  If two such "complex numbers" behave identically
with respect to all of these operations -- for example, they have the
same magnitude -- then they -are- the same complex number.  It makes as
much sense to speak of "the first component" of a complex number, or of
a complex number "as an ordered pair", as it does to speak of the length
of a time zone or the weight of the French government.  When it comes to
formal objects, we care only about what is observable (using available
operations).
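
To make the fable concrete, here is a minimal Python sketch of the two
professors' definitions.  It is an illustration only, not taken from
Reynolds: the class names, the helper functions, and the tolerance TOL
(needed because floating-point numbers merely approximate reals) are
all assumptions of the sketch.

```python
import math

TOL = 1e-9  # assumed tolerance for comparing floating-point "reals"


class Cartesian:
    """Professor Descartes: an ordered pair (re, im) of reals."""

    def __init__(self, re, im):
        self.re, self.im = re, im

    @classmethod
    def from_real(cls, x):
        return cls(x, 0.0)

    @classmethod
    def i(cls):
        return cls(0.0, 1.0)

    def equal(self, other):
        return abs(self.re - other.re) <= TOL and abs(self.im - other.im) <= TOL

    def add(self, other):
        return Cartesian(self.re + other.re, self.im + other.im)

    def multiply(self, other):
        return Cartesian(self.re * other.re - self.im * other.im,
                         self.re * other.im + self.im * other.re)

    def conjugate(self):
        return Cartesian(self.re, -self.im)

    def magnitude(self):
        return math.hypot(self.re, self.im)


class Polar:
    """Professor Bessel: a pair (r, theta) of reals with r nonnegative."""

    def __init__(self, r, theta):
        self.r, self.theta = r, theta

    @classmethod
    def from_real(cls, x):
        return cls(abs(x), 0.0 if x >= 0 else math.pi)

    @classmethod
    def i(cls):
        return cls(1.0, math.pi / 2)

    def equal(self, other):
        if abs(self.r - other.r) > TOL:
            return False
        if self.r <= TOL:
            return True  # at zero, the angle is irrelevant
        d = (self.theta - other.theta) % (2 * math.pi)
        return min(d, 2 * math.pi - d) <= TOL  # angles agree modulo 2pi

    def add(self, other):
        # Addition is awkward in polar form, so pass through x and y.
        x = self.r * math.cos(self.theta) + other.r * math.cos(other.theta)
        y = self.r * math.sin(self.theta) + other.r * math.sin(other.theta)
        return Polar(math.hypot(x, y), math.atan2(y, x))

    def multiply(self, other):
        return Polar(self.r * other.r, self.theta + other.theta)

    def conjugate(self):
        return Polar(self.r, -self.theta)

    def magnitude(self):
        return self.r


def i_squared_plus_one(C):
    """Uses only the listed operations, so it works for either class."""
    return C.i().multiply(C.i()).add(C.from_real(1.0))


def three_plus_four_i(C):
    return C.from_real(3.0).add(C.from_real(4.0).multiply(C.i()))
```

A client that sticks to the listed operations, as the two helper
functions do, cannot tell which class it was handed: in both,
i_squared_plus_one(C) is equal to C.from_real(0.0), and the magnitude
of three_plus_four_i(C) is 5 up to TOL.  Only code that reaches for
.re or .theta directly would notice the difference, and that is
exactly the kind of question the operations do not license.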

This "algebraic" view of formal objects is a standard one in mathematics
and computer science.  Turning now to linguistics, I claim that to
specify the formal status of trees is precisely to specify what
operations are available on phrases.  For instance, context-free grammar
makes available on phrases the following operations and no others.  (I
assume that we already know what strings and categories are.)

    OPERATION       INPUT                   OUTPUT
    ---------       -----                   ------
    convert from    <category>, <string>    <phrase>
    string          <phrase>                <string>
    category        <phrase>                <category>

Taking these operations as primitives, we can define a concatenation
operation on phrases as follows: Given a finite sequence of phrases

    <phrase1>, <phrase2>, ...

and a category

    <category>,

we define the concatenation to be the phrase

    convert-from( <category>,
        concatenate(string(<phrase1>), string(<phrase2>), ...) ),

in which "concatenate" means string concatenation.  Given a phrase built
this way or otherwise, the only observations we can make of it are its
string ("yield"; "frontier") and its category label.  It makes as much
sense for a grammatical rule in this system to ask what the constituents
of a phrase are as it does for a telephone company to charge a call to
555-MOVE differently from a call to 555-NOTE.
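
The three primitives and the derived concatenation can be sketched in
Python as follows.  This is a rough illustration under stated
assumptions: strings are Python strings with words joined by single
spaces, categories are plain labels such as "NP" and "S", and the
function names are made up for the sketch.

```python
class Phrase:
    """A phrase exposes nothing beyond the operations defined below."""

    __slots__ = ("_category", "_string")

    def __init__(self, category, string):
        self._category = category
        self._string = string


def convert_from(category, string):
    """<category>, <string> -> <phrase>"""
    return Phrase(category, string)


def string(phrase):
    """<phrase> -> <string>  (the yield, or frontier)"""
    return phrase._string


def category(phrase):
    """<phrase> -> <category>"""
    return phrase._category


def concatenate_phrases(cat, *phrases):
    """Derived operation, defined purely in terms of the primitives."""
    # Assumption: string concatenation joins words with a single space.
    return convert_from(cat, " ".join(string(p) for p in phrases))


def observations(phrase):
    """Everything a grammatical rule can find out about a phrase."""
    return (category(phrase), string(phrase))


np = convert_from("NP", "the dog")
vp = convert_from("VP", "barks")
built = concatenate_phrases("S", np, vp)
direct = convert_from("S", "the dog barks")
```

Here built and direct yield the same observations, ("S", "the dog
barks"), so as far as any rule written against these operations is
concerned they are the same phrase.  Nothing records that built once
had an NP and a VP inside it.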

What operations on phrases are available to the syntactic rules of
"early transformational grammars"?  As indicated in your original
message ("trees were viewed as shorthands for derivational history"),
the operations available there are still the three operations listed
above.  The difference between "early transformational grammars" and
context-free grammars is that the former make available a further
operation on strings -- the grammar not only gets to concatenate strings
as in the formula displayed above, but also can inspect and rearrange
strings by matching them against certain patterns.  "Sometime later",
transformations started to operate on trees rather than strings; in
other words, additional operations on phrases started to be available
that made it meaningful for a grammatical rule to ask, for instance,
what the constituents of a phrase are.
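
A small Python sketch of what changes once such an operation exists.
Everything in it is hypothetical (the names combine and constituents,
the category labels, the example phrase); the point is only that one
extra observer separates phrases that string and category alone cannot
tell apart.

```python
class Phrase:
    """A phrase that also remembers what it was built from."""

    def __init__(self, category, string, parts=()):
        self._category = category
        self._string = string
        self._parts = tuple(parts)


def convert_from(category, string):
    return Phrase(category, string)


def combine(category, *phrases):
    # Assumption: words are joined with single spaces.
    joined = " ".join(string(p) for p in phrases)
    return Phrase(category, joined, phrases)


def string(phrase):
    return phrase._string


def category(phrase):
    return phrase._category


def constituents(phrase):
    """The additional operation: <phrase> -> sequence of <phrase>."""
    return phrase._parts


old = convert_from("A", "old")
men = convert_from("N", "men")
conj = convert_from("Conj", "and")
women = convert_from("N", "women")

# [[old men] and women] versus [old [men and women]]
a = combine("NP", combine("NP", old, men), conj, women)
b = combine("NP", old, combine("N", men, conj, women))
```

With only string and category, a and b are indistinguishable: both
yield "old men and women" with category "NP".  Once constituents is
available, a has three immediate constituents and b has two, so a rule
may now treat them differently.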

What operations on phrases are available in GPSG and HPSG?  I hope
I have explained what I mean by an operation on phrases well enough
that others more knowledgeable will be able to answer this question
in more detail, but let me give a first-cut approximation, sure to be
inaccurate.  HPSG phrases are a special case of HPSG feature structures,
so let us consider the operations on feature structures that HPSG makes
available to grammatical rules.  (I assume we know what attributes are
-- they are pretty much atomic entities on which the only operation
available is comparison for equality.)

    OPERATION       INPUT                   OUTPUT
    ---------       -----                   ------
    unify           <fs>, <fs>              <fs> or failure
    embed           <attribute>, <fs>       <fs>

These operations are pretty powerful.  For example, if I want to check
if a certain feature structure <fs> has the value SINGULAR under its
NUMBER attribute, I can simply check whether

    unify(<fs>, embed(NUMBER, SINGULAR))

succeeds or fails.
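
As a very rough Python sketch of these two operations: feature
structures are modelled here as atoms (strings) or nested dictionaries
from attributes to feature structures, and failure is modelled as
None.  These choices are assumptions of the sketch, which leaves out
structure sharing (reentrancy) and the type hierarchy and so falls well
short of real HPSG feature structures.

```python
FAILURE = None  # assumed encoding of "failure"


def embed(attribute, fs):
    """<attribute>, <fs> -> <fs>: wrap fs under a single attribute."""
    return {attribute: fs}


def unify(a, b):
    """<fs>, <fs> -> <fs> or failure."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for attribute, value in b.items():
            if attribute in result:
                merged = unify(result[attribute], value)
                if merged is FAILURE:
                    return FAILURE  # conflicting values somewhere below
                result[attribute] = merged
            else:
                result[attribute] = value
        return result
    # At least one side is an atom: they unify only if identical.
    return a if a == b else FAILURE


def is_singular(fs):
    """The check from the text, phrased with unify and embed only."""
    return unify(fs, embed("NUMBER", "SINGULAR")) is not FAILURE


agr = unify(embed("NUMBER", "SINGULAR"), embed("PERSON", "THIRD"))
plural = embed("NUMBER", "PLURAL")
```

In this sketch is_singular(agr) succeeds and is_singular(plural) fails.
Note that a feature structure with no NUMBER attribute at all, such as
embed("PERSON", "THIRD"), also unifies successfully, so what the check
really observes is compatibility with NUMBER SINGULAR rather than the
presence of that value.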

In the above, I have glossed over some consequential details, such
as what it means to observe a truth value, or to observe an "<fs> or
failure".  Anyway, let me return to your question, which distinguishes
between the following options for a notion of domination.

On 2003-06-26T16:45:42-0700, Andrew Carnie wrote:
> 	-X  *replace* Y and Z (as in  TG); meaning that the representation
> 		contains only Y and Z; X is a historical artifact (or vice
> 		versa, the representation contains only X, and Y and Z are
> 		historical artifacts).
> 	- or does X *contain* Y and Z (as in MP), there is one object in
> 		the representation (X), which contains all the material
> 		formerly in Y and Z. Y and Z no longer exist except
> 		derivationally
> 	- or does it *represent* Y and Z (as in GB), X, Y and Z are all
> 		identifiable objects in the representation (and
> 		derivation). They are related through structural relations.
> 	-  or something entirely different?

To clarify what you mean by "replace", "contain", and "represent",
let me ask you the following question: Can you give an example of a
grammatical rule that would "make sense" (i.e., use only available
operations on phrases) if X contains Y and Z, but not if X represents Y
and Z, or vice versa?  It would clarify for me the difference between
domination in MP and in GB.

Cheers,
	Ken

John C. Reynolds. 1983.  Types, abstraction and parametric polymorphism.
In _Information processing 83: Proceedings of the IFIP 9th world
computer congress_, ed. R. E. A. Mason, 513-523.  Amsterdam: Elsevier
Science.  ftp://ftp.cs.cmu.edu/user/jcr/typesabpara.pdf
