[Corpora-List] New book: Tiedemann: Bitext Alignment

Graeme Hirst gh at cs.toronto.edu
Fri Aug 12 15:34:10 UTC 2011


NEW BOOK

Bitext Alignment
Jörg Tiedemann
2011
PDF (3997 KB) | PDF Plus (2585 KB) 

Abstract
This book provides an overview of various techniques for the alignment of bitexts. It describes general concepts and strategies that can be applied to map corresponding parts in parallel documents at various levels of granularity. Bitexts are valuable linguistic resources for many different research fields and practical applications. The predominant application is machine translation, in particular statistical machine translation. However, there are various other threads that can be followed which may be supported by the rich linguistic knowledge implicitly stored in parallel resources. Bitexts have been explored in lexicography, word sense disambiguation, terminology extraction, computer-aided language learning and translation studies, to name just a few. The book covers the essential tasks that have to be carried out when building parallel corpora, starting from the collection of translated documents up to sub-sentential alignments. In particular, it describes various approaches to document alignment, sentence alignment, word alignment and tree structure alignment. It also includes a list of resources and a comprehensive review of the literature on alignment techniques.

Table of Contents: Introduction / Basic Concepts and Terminology / Building Parallel Corpora / Sentence Alignment / Word Alignment / Phrase and Tree Alignment / Concluding Remarks

This title is available online free of charge to members of institutions that have licensed the content through the Synthesis Digital Library of Engineering and Computer Science or the Synthesis Lectures on Human Language Technologies.

Use of this book as a course text is encouraged, and the text may be downloaded without restriction at licensing institutions, or after a one-time fee of $30 USD at non-licensing schools. To find out whether your institution is a subscriber, visit <http://www.morganclaypool.com/page/licensed>, or follow the links above and attempt to download the PDF. Additional information about Synthesis can be found through the following links, or by contacting me directly.
Available titles and subject areas: http://www.morganclaypool.com/page/ForthcomingSynthesisLectures
Information for librarians, including pricing and license: http://www.morganclaypool.com/page/librarian_info
A review of Synthesis in ISTL: http://www.istl.org/09-winter/electronic.html

This book can also be purchased in print directly from the Morgan & Claypool Bookstore for $45.00 USD, from Amazon.com, and from other booksellers worldwide.