Book: Tiedemann, Bitext Alignment

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Fri Aug 12 18:54:37 UTC 2011

Date: Fri, 12 Aug 2011 11:34:12 -0400
From: Graeme Hirst <gh at>
Message-Id: <05BB6EBA-B0CB-46CB-A964-32BCB3552E4D at>


Bitext Alignment
Jörg Tiedemann

This book provides an overview of various techniques for the alignment
of bitexts. It describes general concepts and strategies that can be
applied to map corresponding parts in parallel documents on various
levels of granularity. Bitexts are valuable linguistic resources for
many different research fields and practical applications. The
predominant application is machine translation, in particular
statistical machine translation. However, various other lines of
research can also draw on the rich linguistic knowledge implicitly
stored in parallel resources. Bitexts have been explored in
lexicography, word sense disambiguation, terminology extraction,
computer-aided language learning and translation studies, to name just
a few. The book covers the essential tasks that
have to be carried out when building parallel corpora starting from the
collection of translated documents up to sub-sentential alignments. In
particular, it describes various approaches to document alignment,
sentence alignment, word alignment and tree structure alignment. It also
includes a list of resources and a comprehensive review of the
literature on alignment techniques.

Table of Contents: Introduction / Basic Concepts and Terminology /
Building Parallel Corpora / Sentence Alignment / Word Alignment / Phrase
and Tree Alignment / Concluding Remarks

This title is available online free of charge to members of institutions
that have licensed the content through the Synthesis Digital
Library of Engineering and Computer Science or the Synthesis Lectures on
Human Language Technologies.

Use of this book as a course text is encouraged, and the text may be
downloaded without restriction at licensing institutions, or after a
one-time fee of $30 USD at non-licensing schools. To find out whether
your institution is a subscriber, visit
<>, or follow the links above
and attempt to download the PDF. Additional information about Synthesis
can be found through the following links, or by contacting me directly. 

Available titles and subject areas:

Information for librarians, including pricing and license:

A review of Synthesis in ISTL:

This book can also be purchased in print directly from the Morgan &
Claypool Bookstore for $45.00 USD, from, and other
booksellers worldwide.

Message distributed by the Langage Naturel list <LN at>
Information, subscription:
English version       : 
Archives                 :

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership:
