coherence relations between large spans of text
Gisela Redeker
g.redeker at RUG.NL
Thu Apr 16 17:12:44 UTC 2009
I agree with Ed, but would like to add what I had answered Anna (privately,
as she had proposed to post a summary), namely that schemata focus on the
function of a text's parts in the generic structure, whereas RST relations
foreground the relatedness (or, in the case of Joint, lack thereof) of text
spans. Here is what I wrote last Friday:
-----Original Message-----
From: Gisela Redeker [mailto:g.redeker at rug.nl]
Sent: Friday, April 10, 2009 9:33 PM
To: 'Anna Kazantseva'
Subject: RE: [RST-LIST] coherence relations between large spans of text
Dear Anna,
Global relations are indeed often harder to agree on than local ones. There
are various alternatives to RST-relations at the top levels of textual
structure. What e.g. Daniel Marcu has used are 'schemata'. Other researchers
(not as far as I know in Computational Linguistics) have proposed 'moves' as
top-level elements (e.g. Biber, Connor & Upton 2007: "Discourse on the Move:
Using Corpus Analysis to Describe Discourse Structure." Amsterdam:
Benjamins). What these 'genre structure' approaches have in common is that
they identify units with a certain content and/or function, not relations. In a
research program I am coordinating (http://www.let.rug.nl/mto), we are
combining move analysis with RST (both done manually up to now to create a
reference corpus). The mapping of top-level RST units onto moves usually is
very straightforward. But I must note that our texts are quite short (a few
hundred words).
For the purpose of automatic or computer-assisted analysis of long
narratives, some sort of segmentation might be useful. To determine episode
boundaries, a segmentation algorithm based on cohesion analysis might help
(see, e.g., N. Stokes 2004: "Applications of Lexical Cohesion Analysis in
the Topic Detection and Tracking Domain" Diss. Dublin). For a
linguistic/discourse-analytic account of topic structure see Goutsos 1997:
"Modelling discourse topic: sequential relations and strategies in
expository text." Benjamins.
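To give a rough idea of what cohesion-based segmentation could look like,
here is a minimal Python sketch. It is not the method from Stokes' thesis,
just a simple word-overlap heuristic: compare the vocabulary in a small
window of sentences before and after each gap, and propose a boundary where
the similarity is low. The stopword list, the window size, the threshold and
all function names are assumptions made up for this illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "were",
             "it", "he", "she", "his", "her", "on", "at"}

def bag_of_words(sentences):
    """Count content words (lowercased, stopwords removed) in some sentences."""
    words = []
    for s in sentences:
        words.extend(w for w in re.findall(r"[a-z]+", s.lower())
                     if w not in STOPWORDS)
    return Counter(words)

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def gap_scores(sentences, window=2):
    """Lexical similarity across the gap before sentence 1, 2, ..., n-1."""
    scores = []
    for i in range(1, len(sentences)):
        left = bag_of_words(sentences[max(0, i - window):i])
        right = bag_of_words(sentences[i:i + window])
        scores.append(cosine(left, right))
    return scores

def boundaries(sentences, window=2, threshold=0.1):
    """Indices i where a candidate episode boundary falls before sentence i."""
    scores = gap_scores(sentences, window)
    # scores[k] belongs to the gap before sentence k + 1.
    return [k + 1 for k, score in enumerate(scores) if score < threshold]

sentences = [
    "The ship left the harbour at dawn.",
    "The ship sailed past the harbour lights.",
    "The crew watched the harbour fade behind the ship.",
    "Maria opened the old letter.",
    "The letter was from her brother.",
    "Her brother wrote the letter in spring.",
]
print(boundaries(sentences))  # [3]: a boundary before "Maria opened ..."
```

On real narratives one would at least want stemming or lemmatisation, a
proper stopword list, and boundaries at local minima of the score curve
rather than at a fixed threshold.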
Best wishes,
Gisela
Gisela Redeker, Professor of Communication Studies
Department of Communication and Information Sciences
University of Groningen
P.O. Box 716, NL-9700 AS Groningen, The Netherlands
g.redeker at rug.nl tel: +31-50-3635973 fax: +31-50-3636855
http://www.let.rug.nl/~redeker
> -----Original Message-----
> From: RST Discussion List [mailto:rstlist at LISTSERV.LINGUISTLIST.ORG] On
> Behalf Of Eduard Hovy
> Sent: Thursday, April 16, 2009 6:17 PM
> To: RSTLIST at LISTSERV.LINGUISTLIST.ORG
> Subject: Re: [RST-LIST] coherence relations between large spans of text
>
> Hello Ken,
>
> At 8:44 AM +0600 4/14/09, Ken Keyes wrote:
> >I am intrigued (again) by Mick's comment about modeling the schemas for
> >texts.
> >...So, mononuclear relations may not be a good way of modeling. Mick,
> >are you saying that multinuclear relations are more appropriate?
>
> It's probably helpful to remember the difference between schemas and
> relations:
>
> Schemas are generally larger frameworks, consisting of several
> elements in [often fixed] sequence, in which each element has a
> function that contributes [equally] to the overall whole. The
> functions may themselves be structures (hence, filled by other
> schemas).
>
> Relations are smaller, linking [generally] only two pieces of text
> together. Generally, one of the two is primary (in RST called the
> Nucleus). But a few RST relations (notably Joint) were defined as
> multi-nuclear: not two, but many, components, all co-equal. One
> doesn't have to care about the fact that one of the two branches of a
> relation is dominant, but taking this into account allows you to do
> things like shorten the text (dropping out the non-dominant parts),
> identify the most 'important' parts of a text, etc.
>
> Note that one can often 'decompose' a schema into its constituent
> (tree of) relations. In this view a schema is nothing more than a
> frozen stereotypical structure of relations.
>
> Regards,
> E
>
> --
> Eduard Hovy
> email: hovy at isi.edu USC Information Sciences Institute
> tel: 310-448-8731 4676 Admiralty Way
> fax: 310-823-6714 Marina del Rey, CA 90292-6695
> http://www.isi.edu/natural-language/nlp-at-isi.html
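Ed's remark above about shortening a text by dropping the non-dominant parts
can be illustrated with a small Python sketch. It assumes a toy tree
representation in which each relation node lists which of its children are
nuclei; a multinuclear relation such as Joint simply marks all children as
nuclei. The class names, the field names and the example sentences are
invented for this illustration and do not come from any RST tool.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Span:
    """An elementary text span (a leaf of the tree)."""
    text: str

@dataclass
class Relation:
    """A relation node; `nuclei` holds the indices of the nuclear children."""
    name: str
    children: List[Union["Relation", Span]]
    nuclei: List[int]

def nuclear_text(node):
    """Keep only the text reachable through nuclei, dropping every satellite."""
    if isinstance(node, Span):
        return [node.text]
    kept = []
    for i in node.nuclei:
        kept.extend(nuclear_text(node.children[i]))
    return kept

# Multinuclear: both spans are co-equal nuclei.
joint = Relation(
    "Joint",
    [Span("Fog covered the runway."), Span("Winds were strong.")],
    nuclei=[0, 1],
)

# Mononuclear: the first child is the nucleus, the Joint is its satellite.
tree = Relation(
    "Elaboration",
    [Span("The flight was cancelled."), joint],
    nuclei=[0],
)

print(nuclear_text(tree))   # ['The flight was cancelled.']
print(nuclear_text(joint))  # ['Fog covered the runway.', 'Winds were strong.']
```

A schema in Ed's sense could then be stored as a fixed tree of such nodes
with empty slots for the spans, but that is beyond this sketch.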