Book: Nie, Cross-Language Information Retrieval

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Tue Jun 22 19:51:53 UTC 2010

Date: Mon, 21 Jun 2010 17:22:40 -0400
From: Graeme Hirst <gh at>
Message-Id: <0ABB5A4E-D5CE-47A5-8BB9-338A9F4A77EC at>


Cross-Language Information Retrieval

Jian-Yun Nie (University of Montreal)

Synthesis Lectures on Human Language Technologies #8 (Morgan &
Claypool Publishers), 2010, 125 pages


Search for information is no longer limited to the native language of
the user, but increasingly extends to other languages. This gives rise
to the problem of cross-language information retrieval (CLIR), whose
goal is to find relevant information written in a language different
from that of the query. In addition to the problems of monolingual
information retrieval (IR), translation is the key problem in CLIR:
one should translate either the query or the documents from one
language to another. However, this translation
problem is not identical to full-text machine translation (MT): the
goal is not to produce a human-readable translation, but a translation
suitable for finding relevant documents. Specific translation methods
are thus required.

The goal of this book is to provide a comprehensive description of the
specific problems arising in CLIR, the solutions proposed in this
area, as well as the remaining problems. The book starts with a
general description of the monolingual IR and CLIR problems. Different
classes of approaches to translation are then presented: approaches
using an MT system, dictionary-based translation and approaches based
on parallel and comparable corpora. In addition, the typical retrieval
effectiveness using different approaches is compared. It will be shown
that translation approaches specifically designed for CLIR can rival
and outperform high-quality MT systems. Finally, the book offers a
look into the future that draws a strong parallel between query
expansion in monolingual IR and query translation in CLIR, suggesting
that many approaches developed in monolingual IR can be adapted to
CLIR.

The book can be used as an introduction to CLIR. Advanced readers can
also find more technical details and discussions of the remaining
research challenges. It is suitable for new researchers who intend to
carry out research on CLIR.

Table of Contents: Preface / Introduction / Using Manually Constructed
Translation Systems and Resources for CLIR / Translation Based on
Parallel and Comparable Corpora / Other Methods to Improve CLIR / A
Look into the Future: Toward a Unified View of Monolingual IR and
CLIR? / References

This title is available online without charge to members of
institutions that have licensed the Synthesis Digital Library of
Engineering and Computer Science.  Members of licensing institutions
have unlimited access to download, save, and print the PDF without
restriction; use of the book as a course text is encouraged.  To find
out whether your institution is a subscriber, visit, or just click on the
book's URL above from an institutional IP address and attempt to
download the PDF.  Others may purchase the book from this URL as a PDF
download for US$30 or in print for US$40.  Printed copies are also
available from Amazon and from booksellers worldwide at approximately
US$40 or local currency equivalent.

Message distributed by the Langage Naturel list <LN at>

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)