RST and dialogue

Thu Nov 23 14:28:46 UTC 2000

To the RST discussion list members:

I frequently encounter comments, questions or projects concerned with the
application of RST to dialogue.  Most recently, we saw a message about
application of RST ideas to computed facial expressions in interaction.
After the message went to the list, Andrew Marshall and I exchanged
additional messages, and now I think it might be useful to share some of
those ideas.

Speaking of a pilot project involving a computational agent, Andrew noted
"What struck me about RST is the surprisingly simple mapping between the
initial RST categories and all of our agent's major utterances.  ...
[Possibly] prosody markup and didactic gestures could be derived from the
RST structure of [the agent's] longer utterances." [edited.]

I find this suggestion interesting, but I am impressed with the difficulty
of trying to find simple extensions of RST to make it cover dialogue.
(This, of course, is not what Andrew's project is attempting.)

RST lacks many different sorts of things that a dialogue approach needs.
Some of them are pretty fundamental.  In addition, if
someone maps RST with extensions onto dialogue, that mapping needs to be
empirically defensible.

RST is defined in a way that makes it a descriptive approach rather than an
explanatory approach.  While RST helps in identifying things that go on in
coherent monologue text, it does not say how or why they occur.  (An
unpublished paper on this is at www-rcf.usc.edu/~billmann/; look for the
phrase "Recent Papers," and then for a paper about views of RST.)

The need for an empirically defensible extension constrains how the
extensions are made and eventually tested.  A direct extension for dialogue
would extend the observers' framework.

What does RST lack for dialogue?

There is no notion of negotiation of focus or topic of discussion.

There is no notion of speaker's role, status or identity.

There is no notion of complementary intentions.

There is no notion of conventionalized complementary intentions, which are
needed to account for recurrent patterns, e.g. in tutoring.

There is no notion of mutual pursuit of intentions, that is, of parties
working on the same general goals at the same time, and so also no notion of
how the mutual pursuit of intentions terminates in a coordinated way.

There is no well developed notion of dialogue coherence, and the monologue
notion of coherence used with RST is too strong.

There is no notion of any negotiation of world view, assumptions, doubts or
counterarguments, other than a minor provision for Concession.

There is no notion of explicit negotiations of status, role, or right to
speak.

There is no notion of negotiating the right to be heard or obeyed or
respected or to receive the language of honor.

There are no distinctions between interruption and abandonment of lines of
talk.

There is no provision for the so-called "repair" exchanges (of
Conversation Analysis).

There is no provision for overlapping speech.

(There is more.  I have extracted these from a larger list.)

Even with all of that, we have not mentioned the visual and auditory parts
of interaction, nor made any provision for more than two parties.  The move
from two parties to more than two is very demanding.

So, I am more hopeful for an approach to dialogue that studies RST
carefully, sets it aside and starts over.  I personally feel that fresh
work on describing
dialogue is very timely.  Good work has been done, and more is needed.  I
suspect that any strong descriptive work on dialogue will greatly benefit
explanatory work and computational developments.

Bill Mann