LINGUIST List: Vol-13-1535. Tue May 28 2002. ISSN: 1068-4875.
Subject: 13.1535, Diss: Computational Ling: Eisner "Smoothing..."
Moderators: Anthony Aristar, Wayne State U.<aristar at linguistlist.org>
Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
Reviews (reviews at linguistlist.org):
Simin Karimi, U. of Arizona
Terence Langendoen, U. of Arizona
Consulting Editor:
Andrew Carnie, U. of Arizona <carnie at linguistlist.org>
Editors (linguist at linguistlist.org):
Karen Milligan, WSU
Naomi Ogasawara, EMU
James Yuells, EMU
Marie Klopfenstein, WSU
Michael Appleby, EMU
Heather Taylor, EMU
Ljuba Veselinova, Stockholm U.
Richard John Harvey, EMU
Dina Kapetangianni, EMU
Renee Galvis, WSU
Karolina Owczarzak, EMU
Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>
Home Page: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Karolina Owczarzak <karolina at linguistlist.org>
=================================Directory=================================
1)
Date: Sat, 25 May 2002 02:07:18 +0000
From: jason at cs.jhu.edu
Subject: Computational Ling: Eisner "Smoothing a Probabilistic Lexicon..."
-------------------------------- Message 1 -------------------------------
Date: Sat, 25 May 2002 02:07:18 +0000
From: jason at cs.jhu.edu
Subject: Computational Ling: Eisner "Smoothing a Probabilistic Lexicon..."
New Dissertation Abstract
Institution: University of Pennsylvania
Program: Computer and Information Science
Dissertation Status: Completed
Degree Date: 2001
Author: Jason Michael Eisner
Dissertation Title:
Smoothing a Probabilistic Lexicon Via Syntactic Transformations
Dissertation URL: http://cs.jhu.edu/~jason/papers/#thesis01
Linguistic Field: Computational Linguistics
Dissertation Director 1: Mitchell P. Marcus
Dissertation Abstract:
Probabilistic parsing requires a lexicon that specifies each word's
syntactic preferences in terms of probabilities. To estimate these
probabilities for words that were poorly observed during training,
this thesis assumes the existence of arbitrarily powerful
transformations (also known to linguists as lexical redundancy rules
or metarules) that can add, delete, retype or reorder the argument and
adjunct positions specified by a lexical entry. In a given language,
some transformations apply frequently and others rarely. We describe
how to estimate the rates of the transformations from a sample of
lexical entries. More deeply, we learn which properties of a
transformation increase or decrease its rate in the language. As a
result, we can smooth the probabilities of lexical entries. Given
enough direct evidence about a lexical entry's probability, our
Bayesian approach trusts the evidence; but when little or no evidence
is available, it relies more on the transformations' rates
to guess how often the entry will be derived from related
entries.
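
To make the smoothing idea concrete, here is a minimal Python sketch:
it blends a lexical entry's observed relative frequency with a prior
probability that, in the thesis, would come from the transformation
rates. The Dirichlet-style pseudo-count scheme, the function name, and
all numbers are illustrative assumptions, not details taken from the
dissertation.

def smoothed_prob(entry_count, word_count, prior_prob, prior_strength=10.0):
    # Shrink the entry's relative frequency toward the transformation-derived
    # prior; the more often the word was observed, the less the prior matters.
    return (entry_count + prior_strength * prior_prob) / (word_count + prior_strength)

# A well-observed entry stays close to its relative frequency ...
print(smoothed_prob(entry_count=80, word_count=100, prior_prob=0.1))  # ~0.74
# ... while a barely observed word leans on the transformation prior.
print(smoothed_prob(entry_count=0, word_count=2, prior_prob=0.1))     # ~0.08
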
Abstractly, the proposed "transformation models" are probability
distributions that arise from graph random walks with a log-linear
parameterization. A domain expert constructs the parameterized graph,
and a vertex is likely according to whether random walks tend to halt
at it. Transformation models are suited to any domain where "related"
events (as defined by the graph) may have positively covarying
probabilities. Such models admit a natural prior that favors simple
regular relationships over stipulative exceptions. The model
parameters can be locally optimized by gradient-based methods or by
Expectation-Maximization. Exact algorithms (matrix inversion) and
approximate ones (relaxation) are provided, with
optimizations. Variations on the idea are also discussed. We
compare the new technique empirically to previous techniques from the
probabilistic parsing literature, using comparable features, and
obtain a 20% perplexity reduction (similar to doubling the amount of
training data). Some of this reduction is shown to stem from the
transformation model's ability to match observed probabilities, and
some from its ability to generalize. Model averaging yields a final
24% perplexity reduction.
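
A minimal sketch of the random-walk view, assuming a toy three-vertex
graph with invented feature vectors and parameters: edge scores are
log-linear in the parameters, each vertex may halt the walk, and the
exact halting distribution is computed by matrix inversion, as the
abstract describes. The specific graph, features, and values are
hypothetical, not taken from the thesis.

import numpy as np

def transition_matrix(features, theta, halt_score=1.0):
    # Log-linear edge scores: score(u, v) = exp(theta . f(u, v)).
    scores = np.exp(features @ theta)
    # Each vertex also has a "halt" action; normalizing over edges plus halt
    # makes the edge matrix sub-stochastic.
    z = scores.sum(axis=1) + halt_score
    T = scores / z[:, None]       # probability of continuing along each edge
    halt = halt_score / z         # probability of halting at each vertex
    return T, halt

def halting_distribution(start, T, halt):
    # Expected visits to each vertex: start @ (I - T)^{-1}
    # (the exact algorithm by matrix inversion mentioned above).
    n = T.shape[0]
    visits = start @ np.linalg.inv(np.eye(n) - T)
    return visits * halt          # probability of halting at each vertex

# Toy 3-vertex graph with two invented binary features per directed edge.
rng = np.random.default_rng(0)
features = rng.integers(0, 2, size=(3, 3, 2)).astype(float)
theta = np.array([0.5, -1.0])      # hypothetical log-linear parameters
start = np.array([1.0, 0.0, 0.0])  # the walk starts at vertex 0

T, halt = transition_matrix(features, theta)
p = halting_distribution(start, T, halt)
print(p, p.sum())                  # a proper distribution: sums to 1

The halting probabilities sum to one because on each visit the walk
either continues along a scored edge or halts, so the model defines a
proper distribution over vertices.
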
---------------------------------------------------------------------------
LINGUIST List: Vol-13-1535