17.3141, Diss: Computational Ling/Ling Theories: Buch-Kromann: 'Discontinuou...'
LINGUIST Network
linguist at LINGUISTLIST.ORG
Thu Oct 26 17:05:09 UTC 2006
LINGUIST List: Vol-17-3141. Thu Oct 26 2006. ISSN: 1068 - 4875.
Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Laura Welcher, Rosetta Project / Long Now Foundation
<reviews at linguistlist.org>
Homepage: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Hannah Morales <hannah at linguistlist.org>
===========================Directory==============================
1)
Date: 26-Oct-2006
From: Matthias Buch-Kromann < mtk.id at cbs.dk >
Subject: Discontinuous Grammar: A dependency-based model of human parsing and language learning
-------------------------Message 1 ----------------------------------
Date: Thu, 26 Oct 2006 13:03:43
From: Matthias Buch-Kromann < mtk.id at cbs.dk >
Subject: Discontinuous Grammar: A dependency-based model of human parsing and language learning
Institution: Copenhagen Business School
Program: Department of Computational Linguistics
Dissertation Status: Completed
Degree Date: 2006
Author: Matthias Buch-Kromann
Dissertation Title: Discontinuous Grammar: A dependency-based model of human
parsing and language learning
Dissertation URL: http://www.id.cbs.dk/~mtk/thesis
Linguistic Field(s): Computational Linguistics; Linguistic Theories
Dissertation Director(s): Sabine Kirchmeier-Andersen; Carl Vikner
Dissertation Abstract:
In the dissertation, Matthias Buch-Kromann presents his dependency-based
grammar formalism, Discontinuous Grammar. The dissertation argues that
grammars should not only distinguish between grammatical and ungrammatical
linguistic analyses, but should also assign a number (a cost) to the
individual words in both grammatical and ungrammatical analyses. The cost
measures the syntactic, semantic, and pragmatic well-formedness of each
word, so that the grammar can be used to localize linguistic errors in the
analysis precisely. In this setting, parsing, generation, and machine
translation can be viewed as optimization problems where the goal is to
find the cheapest analysis that satisfies a given side condition, e.g.,
that the analysis corresponds to a given text (parsing), semantic
representation (generation), or source text (machine translation).
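The optimization view described above can be illustrated with a minimal
Python sketch. It is not taken from the dissertation: the Analysis class,
the cheapest function, the candidate analyses, and all cost values are
hypothetical, and a real grammar would compute the costs rather than list
them by hand. The sketch only shows the general idea of filtering
candidates by a side condition (here: the analysis must cover a given
text) and then picking the one with the lowest summed per-word cost.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Analysis:
    """A toy dependency analysis; every field here is hypothetical."""
    words: Tuple[str, ...]
    heads: Tuple[int, ...]         # index of each word's head, -1 for the root
    word_costs: Tuple[float, ...]  # invented per-word costs, lower is better

    def total_cost(self) -> float:
        # The cost of an analysis is taken to be the sum of its word costs.
        return sum(self.word_costs)

    def costliest_word(self) -> str:
        # Per-word costs make it possible to point at the likeliest error.
        worst = max(range(len(self.words)), key=lambda i: self.word_costs[i])
        return self.words[worst]


def cheapest(candidates: Iterable[Analysis],
             side_condition: Callable[[Analysis], bool]) -> Analysis:
    """Return the lowest-cost candidate that satisfies the side condition."""
    admissible = [a for a in candidates if side_condition(a)]
    if not admissible:
        raise ValueError("no candidate satisfies the side condition")
    return min(admissible, key=Analysis.total_cost)


# Parsing as optimization: the side condition is "covers this text".
text = ("the", "dogs", "barks")
candidates = [
    Analysis(text, (1, 2, -1), (0.0, 0.5, 2.0)),   # agreement error on the verb
    Analysis(text, (2, 2, -1), (3.0, 0.5, 2.0)),   # determiner attached to verb
    Analysis(("the", "dog", "barks"), (1, 2, -1), (0.0, 0.0, 0.0)),  # other text
]
best = cheapest(candidates, lambda a: a.words == text)
print(best.heads, best.total_cost(), best.costliest_word())
```

In this toy setup the third candidate is cheaper overall but is excluded
because it does not correspond to the given text. Generation or
translation would use the same selection step with a different side
condition.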
The dissertation demonstrates how the proposed formalism deals with a wide
range of linguistic phenomena, including the complement-adjunct
distinction; discontinuous word orders and island constraints; control
constructions, relatives, and parasitic gaps; elliptic coordinations;
anaphora and discourse structure; punctuation; and inflectional and
derivational morphology. The dissertation also describes how these analyses
have formed the theoretical basis for the construction of the Danish
Dependency Treebank, a general-purpose corpus for Danish with 100,000 words
equipped with complete dependency analyses.
The dissertation also proposes two methods, HPM and XHPM, for the
statistical estimation of hierarchically classifiable data such as words in
dependency relations, which can be classified according to word class and
ontological class. The dissertation moreover proposes a statistical
language model based on the proposed grammar formalism and estimation
method. Finally, the dissertation proposes a parsing algorithm, local
optimality parsing, which can be used in combination with a manually written or
statistically induced grammar to segment and parse an entire discourse. The
dissertation argues that the parsing algorithm has a number of theoretical
advantages compared with other parsing algorithms, such as its speed (it
has an almost-linear time complexity) and its potential as a plausible
model of human parsing.