14.1476, Diss: Pragmatics/Comp: Creswell: "Syntactic form..."
    LINGUIST List <linguist at linguistlist.org>
    Thu May 22 14:43:34 UTC 2003

LINGUIST List:  Vol-14-1476. Thu May 22 2003. ISSN: 1068-4875.
Subject: 14.1476, Diss: Pragmatics/Comp: Creswell: "Syntactic form..."
Moderators: Anthony Aristar, Wayne State U.<aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona
Home Page:  http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Naomi Fox <fox at linguistlist.org>
==========================================================================
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.
=================================Directory=================================
1)
Date:  Wed, 21 May 2003 15:13:16 +0000
From:  creswell at BABEL.ling.upenn.edu
Subject:  Syntactic form and discourse function...
-------------------------------- Message 1 -------------------------------
Date:  Wed, 21 May 2003 15:13:16 +0000
From:  creswell at BABEL.ling.upenn.edu
Subject:  Syntactic form and discourse function...
Institution: University of Pennsylvania
Program: Department of Linguistics
Dissertation Status: Completed
Degree Date: 2003
Author: Cassandre Yvonne Creswell
Dissertation Title: Syntactic form and discourse function in natural
language generation
Dissertation URL: www.ling.upenn.edu/~creswell/diss.pdf
Linguistic Field: Pragmatics
		  Computational Linguistics
Subject Language: English (code: ENG)
Dissertation Director 1: Ellen F. Prince
Dissertation Director 2: Aravind K. Joshi
Dissertation Abstract:
Previous research has shown that certain discourse conditions are
necessary for the felicitous use of four non-canonical syntactic
constructions in English: topicalizations, left-dislocations,
wh-clefts, and it-clefts.  However, the distribution of these forms
does not correlate one-to-one with the presence of these necessary
conditions. Speakers must therefore choose to use these constructions
for other reasons as well. Additionally, a natural language generation
(NLG) algorithm that selects these statistically rare forms based only
on these conditions
will overgenerate. If it selects clausal word order based only on
frequency, however, these forms will never be selected or will be used
in meaningless ways.  The purpose of this dissertation is to devise a
more complete model of when human speakers generate these
constructions in order to further understanding of syntactic form
selection and to better characterize these forms' conditions of use
for purposes of NLG.  The model of syntactic choice presented
explicitly ties the goals of the communicative agent to the linguistic
forms selected to achieve those goals. Three types of communicative
goals that speakers achieve through the use of non-canonical syntax
are argued for: (1) attention marking, (2) discourse relation, and (3)
information-structure focus disambiguation.  The evidence supporting
the model is based on naturally occurring tokens from a corpus of
spontaneous oral discourse.  This same corpus, annotated with
low-level properties of the discourse context surrounding utterances
with non-canonical word order, is then used to train a statistical
model that can approximate some aspects of the theoretical model.  The
statistical model supports the claim that communicative goals of
signaling discourse relations do correlate significantly with the use
of particular non-canonical forms.  The statistical model is also used
as a probabilistic classifier, which could be used as a stochastic
method for selecting syntactic form based on discourse context as part
of a natural language generation system.  The probabilistic classifier
shows improvement over a naive classifier when applied to training
data.  The probabilistic classifier is a first attempt to base
syntactic form selection on more than frequency counts alone, by
incorporating aspects of the semantic content of the surrounding
discourse context.
---------------------------------------------------------------------------
LINGUIST List: Vol-14-1476
    
    