13.875, Diss: Computational Ling: Bond "Determiners..."
LINGUIST List
linguist at linguistlist.org
Fri Mar 29 19:43:52 UTC 2002
LINGUIST List: Vol-13-875. Fri Mar 29 2002. ISSN: 1068-4875.
Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
Andrew Carnie, U. of Arizona <carnie at linguistlist.org>
Reviews (reviews at linguistlist.org):
Simin Karimi, U. of Arizona
Terence Langendoen, U. of Arizona
Editors (linguist at linguistlist.org):
Karen Milligan, WSU               Naomi Ogasawara, EMU
James Yuells, EMU                 Marie Klopfenstein, WSU
Michael Appleby, EMU              Heather Taylor-Loring, EMU
Ljuba Veselinova, Stockholm U.    Richard John Harvey, EMU
Dina Kapetangianni, EMU           Renee Galvis, WSU
Karolina Owczarzak, EMU
Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>
Home Page: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Karolina Owczarzak <karolina at linguistlist.org>
=================================Directory=================================
1)
Date: Fri, 29 Mar 2002 00:58:18 +0000
From: bond at cslab.kecl.ntt.co.jp
Subject: Computational Ling: Bond "Determiners and Number in English..."
-------------------------------- Message 1 -------------------------------
New Dissertation Abstract
Institution: University of Queensland
Program: Department of English
Dissertation Status: Completed
Degree Date: 2001
Author: Francis Charles Bond
Dissertation Title:
Determiners and Number in English contrasted with Japanese, as
exemplified in Machine Translation
Dissertation URL:
http://www.kecl.ntt.co.jp/icl/mtg/members/bond/pubs/2001-phd.html
Linguistic Field: Translation, Computational Linguistics
Subject Language: Japanese, English
Dissertation Director 1: Roland Sussex
Dissertation Director 2: Rodney Huddleston
Dissertation Abstract:
The fact that concepts are grammaticalized differently in different
languages is a major problem for translation, especially for machine
translation. Two major examples of this are syntactic number and the
use of (in)definite articles (a, some, the). In languages such as
English, nouns are marked for number and the choice of article (or of
no article) must be made for every noun phrase. In contrast, for
languages such as Japanese, number distinctions are not normally made,
and there are no articles. This means that whenever a noun phrase is
translated from Japanese to English, even if the denotation is
perfectly understood and a good translation equivalent found,
generating the noun phrase still requires two difficult choices:
should the head noun be singular or plural, and which article, if any,
should be generated?
This thesis proposes a semantic representation and a series of three
heuristic algorithms that make possible the appropriate generation of
articles and number when translating from Japanese to English. The
semantic representation provides a tractable set of features to
represent (1) the referential use of a noun phrase, as
referential, generic, ascriptive or idiomatic; (2) the interpretation
of the noun phrase's referent as either a countable individual or a
mass, with seven detailed subtypes; (3) the definiteness of the noun
phrase, as definite, indefinite, definite and extensive, or
possessed. The three algorithms automatically acquire values for these
features from the analysis of the Japanese text and the lexical
properties of the English translation equivalents, and then use them
to generate English. The first algorithm determines the referential
use of Japanese noun phrases, based on a defeasible hierarchy of
pragmatic rules that are applied top-down, from the clause to the noun
phrase. The second algorithm determines the appropriate interpretation
for English noun phrases, while the third determines which determiner,
if any, should be generated. These algorithms use rules based on the
different referential uses of the noun phrase.
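As an illustration only (not the implementation described in the thesis),
the feature set and the three-step flow above might be sketched in Python
as follows. All class and function names, the context keys such as
clause_is_generic, and the rules themselves are hypothetical. The seven
subtypes and the handling of "definite and extensive" and "possessed"
noun phrases are not spelled out in the abstract, so the sketch leaves
them out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class Use(Enum):
    REFERENTIAL = "referential"
    GENERIC = "generic"
    ASCRIPTIVE = "ascriptive"
    IDIOMATIC = "idiomatic"


class Interpretation(Enum):
    # Only the top-level split is modelled; the seven subtypes
    # mentioned in the abstract are not listed there.
    INDIVIDUAL = "countable individual"
    MASS = "mass"


class Definiteness(Enum):
    DEFINITE = "definite"
    INDEFINITE = "indefinite"
    DEFINITE_EXTENSIVE = "definite and extensive"
    POSSESSED = "possessed"


@dataclass
class NounPhrase:
    head: str                  # English translation equivalent, singular form
    plural: str                # its plural form
    use: Optional[Use] = None
    interpretation: Optional[Interpretation] = None
    definiteness: Optional[Definiteness] = None
    plural_referent: bool = False


Rule = Tuple[Callable[[Dict[str, bool]], bool], Use]

# Hypothetical rules, ordered from the clause level down to the noun phrase.
# The first entry is a default that any later matching rule defeats.
USE_RULES: List[Rule] = [
    (lambda ctx: True, Use.REFERENTIAL),
    (lambda ctx: ctx.get("clause_is_generic", False), Use.GENERIC),
    (lambda ctx: ctx.get("np_is_copula_complement", False), Use.ASCRIPTIVE),
    (lambda ctx: ctx.get("np_in_idiom", False), Use.IDIOMATIC),
]


def referential_use(ctx: Dict[str, bool]) -> Use:
    """Step 1 (toy): the last matching rule wins, overriding earlier ones."""
    result = Use.REFERENTIAL
    for condition, value in USE_RULES:
        if condition(ctx):
            result = value
    return result


def generate(np: NounPhrase) -> str:
    """Steps 2 and 3 (toy): choose number, then the article, if any."""
    mass = np.interpretation is Interpretation.MASS
    plural = np.plural_referent and not mass
    noun = np.plural if plural else np.head
    if np.definiteness is Definiteness.DEFINITE:
        return f"the {noun}"
    if np.definiteness is Definiteness.INDEFINITE:
        if mass or plural:
            return noun  # bare mass noun or bare plural
        # Crude a/an choice based on spelling, for illustration only.
        article = "an" if noun[0].lower() in "aeiou" else "a"
        return f"{article} {noun}"
    raise NotImplementedError("not covered by this sketch")


dog = NounPhrase("dog", "dogs",
                 interpretation=Interpretation.INDIVIDUAL,
                 definiteness=Definiteness.INDEFINITE)
water = NounPhrase("water", "waters",
                   interpretation=Interpretation.MASS,
                   definiteness=Definiteness.INDEFINITE)
print(generate(dog))
print(generate(water))
```

With these toy rules, an indefinite countable singular is generated with
an article and an indefinite mass noun is generated bare. Keeping the
three features as separate fields means each step only has to fill in or
read its own part of the representation.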
The proposed algorithms are implemented in a Japanese-to-English
machine translation system, and the detailed lexical information is
entered into its lexicon. Using the algorithms raises the share of noun
phrases generated with correct articles and number from 65% to 85%.
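To put the reported figures in perspective, the short Python sketch below
works out what a move from 65% to 85% means in terms of errors. Only the
two percentages come from the abstract; the function name and the way the
gain is summarised are assumptions made for this illustration.

```python
def accuracy_gain(before: float, after: float) -> dict:
    """Summarise an accuracy change as an absolute and a relative figure."""
    error_before = 1.0 - before
    error_after = 1.0 - after
    return {
        # Difference in accuracy, as a fraction (0.20 = 20 percentage points).
        "absolute_gain": after - before,
        # Fraction of the original errors that no longer occur.
        "relative_error_reduction": (error_before - error_after) / error_before,
    }


gain = accuracy_gain(0.65, 0.85)
print(f"{gain['absolute_gain'] * 100:.0f} percentage points")
print(f"{gain['relative_error_reduction'] * 100:.0f}% fewer incorrect noun phrases")
```

On these numbers the error rate falls from 35% to 15% of noun phrases,
that is, a gain of 20 percentage points and roughly 57% fewer errors.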