Relations that are seldom or never signaled

Daniel Marcu marcu at ISI.EDU
Tue Jan 11 17:34:37 UTC 2000


> At least, there is a certain subset of relations which are
> seldom signalled _using connectives_; there may be other cohesive methods
> of signalling them in surface text, as Simon Corston pointed out in an
> earlier message.
>

I wholeheartedly subscribe to this view as well.

> I've been interested for several years in a methodological approach
> whereby the set of linguistic mechanisms available for signalling
> discourse relations in a language is used as a source of evidence about
> how those relations should be defined. To take a simple example, there are
> several RST relations that can be signalled using the connective `but';
> this suggests that these relations have something in common, and we would
> like this commonality to be reflected in the formal definitions of the
> relations concerned.

Hmmm... The word "bank" means different things in different contexts.
And semanticists do not suggest that the different senses should have
anything in common. Why is "but" different in that respect? Why are
connectives different, in general? Isn't it possible that a connective
has multiple "senses" the same way an open-class word does?


>
> Before going further, I think that one important distinction to make is
> between relations which are _not often_ signalled by connectives, and
> relations which _cannot_ be signalled by connectives. Some of the
> relations in Bill's list (e.g. EVIDENCE, JUSTIFY, INTERPRETATION,
> RESTATEMENT, LIST) are sometimes signalled by connectives. My feeling is
> that there are only a couple of relations that are never signallable by
> connectives: one is ELABORATION (or rather, the subtype of elaboration
> called OBJECT-ATTRIBUTE ELABORATION), and the other is BACKGROUND. [I'd
> be interested in other RST analysts' opinions on this, though.] So a
> prediction made by the methodology outlined above is that these two
> relations fall into a different class than the others, and that
> this will be reflected in their definitions.
>

I agree with you here. Elaboration is rarely marked explicitly. The
psycholinguistic experiments of Segal et al. [91] suggest that readers have a
preference for interpreting unmarked textual units as continuations of
the topics of the units that precede them. This means that if no marker
is used, it is reasonable to take Elaboration as a default relation. In
general, it may be a good idea to use Elaborations as defaults. In our
own experiments, we've found Elaboration to be the most frequently used
relation: depending on text genre, Elaborations accounted for between
13.8% and 21.6% of the relations we've annotated in a corpus of 90
discourse trees.
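
To make the default idea concrete, here is a minimal sketch in Python. It
is only an illustration under stated assumptions: the cue-phrase list and
the function name default_relation are hypothetical, not taken from any
annotated corpus or existing system, and a real analyzer would look at far
more than the first word of a unit.

```python
# Hypothetical list of unit-initial cue phrases; an illustrative assumption,
# not a lexicon derived from annotated data.
CUE_PHRASES = ("but", "because", "however", "for example")

DEFAULT_RELATION = "ELABORATION"


def default_relation(unit):
    """Return ELABORATION when a unit opens with no known cue phrase.

    Returns None when a cue phrase is found, meaning the relation has to be
    decided from the marker (which may itself be ambiguous) and not by default.
    """
    text = unit.strip().lower()
    for cue in CUE_PHRASES:
        if text.startswith(cue):
            rest = text[len(cue):]
            # Require a word boundary so that "butter" does not match "but".
            if rest == "" or not rest[0].isalpha():
                return None
    return DEFAULT_RELATION
```

With this sketch, an unmarked unit such as "He's 20 years old." falls back
to ELABORATION, while a unit opening with "But" is left undecided.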


> I think there is a good case to be made for these relations being
> different from the others. My suggestion is that these relations are in
> fact better modelled using the metaphor of `focus', rather than the
> metaphor of `relations between propositions'. The idea is that what's
> described as an OBJECT-ATTRIBUTE ELABORATION could equally well be
> described as the maintenance of a given focus entity across two text
> spans, or a legal shift of focus from one entity to another. In fact, I
> would say that the focus-based description seems more appropriate. For
> instance, consider the text `I have a brother called Bill. He's 20 years
> old'. We could model the coherence of this text by noting that they are
> linked by an ELABORATION relation. But what we have here is only a
> `relation between two propositions' in a very derivative sense: the
> relationship only holds in virtue of the fact that both propositions
> include reference to a particular entity (Bill). In other words, an
> account of which entities are in focus seems to be primary in this case.
> (I think that a similar point can often be made for the BACKGROUND
> relation, although the case is not as clearcut.)
>
> Both `entities-in-focus' and `relations-between-propositions' have been
> used extensively to model the phenomenon of discourse coherence, and I
> think most theorists would agree that both metaphors are needed for a
> complete account of coherence. Often, relations-between-propositions and
> focus are construed as multiple simultaneous constraints on coherence, as
> for instance in Grosz and Sidner's model, or Hovy and McCoy's `Focussing
> your RST' paper.  However, if we decide that OBJECT-ATTRIBUTE ELABORATION
> and BACKGROUND are better thought of as reflections of focussing
> constraints, and we accept that an account of the focus structure of a
> text is something which is needed on independent grounds (for instance to
> model the pattern of anaphora in a text), then it seems redundant to
> continue to treat them as ordinary RST relations: the work they are doing
> is already being done by a different, and more appropriate metaphor.
>
> Of course, leaving OBJECT-ATTRIBUTE ELABORATION and BACKGROUND out of the
> set of RST relations has serious consequences: it means that it's no
> longer possible to build a complete tree of relations for every coherent
> text. In fact, since these relations are amongst the most common in RST
> analyses, it means that it will seldom be possible to build a complete
> tree. The picture that remains is one where a coherent text is described
> by one or more RST trees, with RST trees being linked by legal focussing
> moves in the case where there is more than one.
>
> Comments welcome on this subversive view! I should say that this model of
> text coherence is one that has been used with reasonable success in a text
> planning system (the ILEX system---for more details, see
> http://cirrus.dai.ed.ac.uk:8000/ilex/ ), though I should also say that I
> don't think that this kind of demonstration-by-implementation counts for
> very much.
>

Alistair, I agree with you that focusing plays a major role in text
coherence. But I don't think it can be used in order to define
relations. Consider the examples below, which are taken from the Times
and Scientific American respectively. My intuition is that in both
examples, the first span of text surrounded by square brackets is the
satellite of a Background or Justification relation whose nucleus is
given by the second span surrounded by square brackets.

I don't think focus helps in determining these relations. In the second
example, though, note that in the third paragraph, the writer switches
from using the past tense to using the present tense. This is a more
subtle form of marking, but is nevertheless marking!

My feeling is that linguistic work to date has focused primarily on
characterizing rhetorical relations that hold between relatively small
text spans. I feel that there is a lot to be done if we want to scale up
our theories to larger texts.

Daniel

---------- Example 1 --------


FLU STOPPER

A NEW COMPOUND IS SET FOR HUMAN TESTING THIS YEAR

[Running nose. Raging fever. Aching joints. Splitting headache. Are
there any poor souls suffering from the flu this winter who haven't
longed for a pill to make it all go away? Relief may be in
sight.] [Researchers at Gilead Sciences, a pharmaceutical company in
Foster City, California, reported last week in the Journal of the
American Chemical Society that they have discovered a compound that
can stop the influenza virus from spreading in animals. Tests on
humans are set for later this year.]


-------- End Example 1 ----------


------- Example 2 ----------

                        Training the Olympic Athlete


        [The lore of ancient Greece recalls an Olympic athlete who was
determined to become the strongest person in the world. Every day
Milon of Croton would pick up a calf, raise it above his head and
carry it around a stable. As the calf grew, so did Milon's strength,
until eventually he was able to lift the full-grown cow.

        Milon, who won the wrestling contest five times,
intuitively grasped one of the basic tenets of contemporary sports
science.] [Progressive resistance training - the stressing of muscles
with steadily increasing loads - is something well understood by the more
than 10,000 athletes from 197 countries who will go to Atlanta, Ga.,
next month for the centennial of the modern Olympic Games.

        During the past half century, however, sports science has
refined the basic principles of training beyond the understanding of
the Greeks. Exercise physiologists and coaches draw on new scientific
knowledge to help athletes develop a balance of muscular and metabolic
fitness for each of the 29 sports in the Olympic Games.  Biomechanical
experts employ computers, video and specialized sensors to study the
dynamics of movement. Design engineers incorporate advances in
materials and aerodynamics to fashion streamlined bobsleds or racing
bicycles. Sport psychologists build confidence through mental-training
techniques. The integration of these approaches affords the small
gains in performance that can translate into victory.]

-------- End Example 2 ----------




--
Daniel Marcu
Information Sciences Institute of University of Southern California
4676 Admiralty Way, Suite 1001; Marina del Rey, CA 90292-6601
Voice: 310-448-8726; Fax: 310-822-0751; www.isi.edu/~marcu/


