29.2841, Books: Data-driven Machine Translation using Semantic Tree Alignment: Vanallemeersch

The LINGUIST List linguist at listserv.linguistlist.org
Mon Jul 9 12:35:57 EDT 2018

LINGUIST List: Vol-29-2841. Mon Jul 09 2018. ISSN: 1069 - 4875.

Subject: 29.2841, Books: Data-driven Machine Translation using Semantic Tree Alignment: Vanallemeersch

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté)
Homepage: https://linguistlist.org

Editor for this issue: Jeremy Coburn <jecoburn at linguistlist.org>

Date: Mon, 09 Jul 2018 12:35:46
From: Karijn Hootsen [gw.uilots.lot at uu.nl]
Subject: Data-driven Machine Translation using Semantic Tree Alignment: Vanallemeersch


Title: Data-driven Machine Translation using Semantic Tree Alignment 
Series Title: LOT Dissertation Series  

Publication Year: 2018 
Publisher: Netherlands Graduate School of Linguistics / Landelijke Onderzoekschool Taalwetenschap (LOT)

Book URL: https://www.lotpublications.nl/data-driven-machine-translation-using-semantic-tree-alignment 

Author: Tom Vanallemeersch

Paperback: ISBN:  9789460932755 Pages: 235 Price: Europe EURO 32.00


This dissertation deals with the improvement of systems for machine
translation (MT) using semantic information. Such information tends to remain
constant during translation, while the syntactic structure of sentences often
changes, as a result of linguistic necessities or translators' choices. These
changes make it difficult to derive syntactic rules automatically when
building a statistical MT system (a type of data-driven system) using a
substantial number of sentences and their translations. For instance, the verb
in a subordinate clause must be moved after the direct object when
translating from English to Dutch. Another example relates to the verb like:
when translating it to bevallen ('please') in Dutch, the direct object becomes
the subject. Constructing a syntax-based statistical MT system involves the
automated alignment of words, the creation of a phrase table with the
translation of words and word groups, and the derivation of translation rules
based on syntactic trees produced by a parser.

In this dissertation, we investigate whether a semantic analysis of sentences
and their translation facilitates the creation of translation rules and
improves the quality of rules. We focus on shallow semantics, in the form of
predicates and roles, and experiment with a four-step approach which requires
a minimum amount of manual intervention.  The first step consists of enriching
parse trees with predicate and role labels. As tools which perform such
labeling are scarce, we design a method which supports the creation of a new
tool on the basis of semantic information in another language. This method
makes use of word alignment and creates mappings between syntax and semantics.
The second step consists of aligning parse trees via semantic labels. The
third step consists of deriving translation rules based on semantic alignment.
The final step extends a statistical MT system with semantic translation
rules.

We implemented our four-step approach in order to evaluate it. The results
indicate that enriching parse trees with semantic predicate and role labels
leads to more precise tree alignment results, and that combining a phrase
table with semantic translation rules helps in improving translation quality.
While we perform tests on the language pair English-to-Dutch, our approach is
sufficiently generic for tests on other language pairs and for contexts other
than MT. For instance, it can be applied for detecting specific structures in
aligned parse trees in the context of translation studies.

Linguistic Field(s): Computational Linguistics

Written In: English  (eng)

