14.17, Diss: Computational Ling: Davis "Stone Soup..."

LINGUIST List linguist at linguistlist.org
Tue Jan 7 18:26:18 UTC 2003


LINGUIST List:  Vol-14-17. Tue Jan 7 2003. ISSN: 1068-4875.

Subject: 14.17, Diss: Computational Ling: Davis "Stone Soup..."

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.



Editor for this issue: Karolina Owczarzak <karolina at linguistlist.org>

=================================Directory=================================

1)
Date:  Wed, 01 Jan 2003 15:05:17 +0000
From:  pcdavis at julius.ling.ohio-state.edu
Subject:  Computational Ling: Davis "Stone Soup Translation..."

-------------------------------- Message 1 -------------------------------


New Dissertation Abstract

Institution: Ohio State University
Program: Department of Linguistics
Dissertation Status: Completed
Degree Date: 2002

Author: Paul C. Davis

Dissertation Title:
Stone Soup Translation: The Linked Automata Model

Dissertation URL: http://www.ling.ohio-state.edu/~pcdavis/papers/diss.html

Linguistic Field: Computational Linguistics

Dissertation Director 1: Chris Brew
Dissertation Director 2: Detmar Meurers
Dissertation Director 3: Robert Kasper
Dissertation Director 4: Erhard Hinrichs


Dissertation Abstract:

The automated translation of one natural language into another, known
as machine translation (MT), typically requires successful modeling of
the grammars of the languages and the relationship between
them. Rather than hand-coding these grammars and relationships, some
machine translation efforts employ data-driven methods, where the goal
is to learn from a large number of training examples of accurate
translations. One such data-driven approach is statistical MT, where
language and alignment models are automatically induced from parallel
corpora. This work has also been extended to probabilistic
finite-state approaches, most often via transducers.

This dissertation introduces and begins an investigation of an MT
model consisting of a novel combination of finite-state devices. The
model proposed is more flexible than transducer models, giving
increased ability to handle word order differences between languages,
as well as crossing and discontinuous alignments between words. The
linked automata MT model consists of a source language automaton, a
target language automaton, and an alignment table---a function which
probabilistically links sequences of source and target language
transitions. It is this augmentation to the finite-state base which
gives the linked automata model its flexibility.
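
As a rough illustration of the three components named above (a source
automaton, a target automaton, and an alignment table), the following
Python sketch may help. It is not taken from the dissertation: the
class names, the toy English/Spanish data, the probabilities, and the
greedy translate() procedure are all assumptions made for this
example, and the table is only consulted for single-transition
sequences. It is meant to show how a table linking transitions can
express a crossing alignment (white/blanca, house/casa) while the
target automaton alone fixes the output word order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    src: int      # state the transition leaves
    label: str    # word emitted or consumed
    dst: int      # state the transition enters


@dataclass
class Automaton:
    start: int
    finals: set
    transitions: list

    def accepts_path(self, path):
        """True if `path` is a contiguous start-to-final run of known transitions."""
        state = self.start
        for t in path:
            if t not in self.transitions or t.src != state:
                return False
            state = t.dst
        return state in self.finals


# Hypothetical toy data: "the white house" -> "la casa blanca".
s1, s2, s3 = Transition(0, "the", 1), Transition(1, "white", 2), Transition(2, "house", 3)
t1, t2 = Transition(0, "la", 1), Transition(1, "casa", 2)
t3, t3b = Transition(2, "blanca", 3), Transition(2, "alba", 3)

source = Automaton(0, {3}, [s1, s2, s3])
target = Automaton(0, {3}, [t1, t2, t3, t3b])

# Alignment table: source transition sequence -> {target transition sequence: probability}.
# The probabilities are invented for illustration.
table = {
    (s1,): {(t1,): 1.0},
    (s2,): {(t3,): 0.9, (t3b,): 0.1},   # "white" links to a LATER target transition
    (s3,): {(t2,): 1.0},                # "house" links to an EARLIER one (crossing)
}


def translate(source_path, target, table):
    """Greedy sketch: pick the most probable linked target transitions for each
    source transition, then let the target automaton determine their order."""
    chosen, score = [], 1.0
    for s in source_path:
        best, p = max(table[(s,)].items(), key=lambda kv: kv[1])
        chosen.extend(best)
        score *= p
    remaining, state, words = set(chosen), target.start, []
    while remaining:
        candidates = [t for t in remaining if t.src == state]
        if not candidates:
            return None, 0.0            # chosen transitions do not form a path
        t = candidates[0]
        words.append(t.label)
        remaining.remove(t)
        state = t.dst
    if state not in target.finals:
        return None, 0.0
    return words, score


assert source.accepts_path([s1, s2, s3])
words, score = translate([s1, s2, s3], target, table)
print(words, score)
```

In this sketch the word order of the output comes entirely from the
target automaton, so the source and target transitions need not be
linked in the same left-to-right order. A real system would search
over many candidate paths and multi-transition links rather than
choosing greedily.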

The dissertation describes the linked automata model from the ground
up, beginning with a description of some of the relevant MT history
and empirical MT literature, and the preparatory steps for building
the model, including a detailed discussion of word alignment and the
introduction of a new technique for word alignment
evaluation. Discussion then centers on the description of the model
and its use of probabilities, including algorithms for its
construction from word-aligned bitexts and for the translation
process. The focus next moves to expanding the linked automata
approach, first through generalization and techniques for extracting
partial results, and then by increasing the coverage, both in terms of
using additional linguistic information and using more complex
alignments. The dissertation presents preliminary results for a test
corpus of English-to-Spanish translations, and suggests ways in which
the model can be further expanded as the foundation of a more powerful
MT system.

---------------------------------------------------------------------------
LINGUIST List: Vol-14-17
