[Corpora-List] New Book: Zhai: Statistical Language Models for Information Retrieval

Tue Dec 1 20:15:29 UTC 2009

BOOK ANNOUNCEMENT

Statistical Language Models for Information Retrieval

ChengXiang Zhai (University of Illinois, Urbana-Champaign)

Synthesis Lectures on Human Language Technologies #1 (Morgan &  
Claypool Publishers), 2009, 141 pages

As online information grows dramatically, search engines such as  
Google are playing a more and more important role in our lives.  
Critical to all search engines is the problem of designing an  
effective retrieval model that can rank documents accurately for a  
given query. This has been a central research problem in information  
retrieval for several decades. In the past ten years, a new generation  
of retrieval models, often referred to as statistical language models,  
has been successfully applied to solve many different information  
retrieval problems. Compared with the traditional models such as the  
vector space model, these new models have a sounder statistical  
foundation and can leverage statistical estimation to optimize  
retrieval parameters. They can also be more easily adapted to model  
non-traditional and complex retrieval problems. Empirically, they tend  
to perform as well as or better than traditional models,  
with less effort on parameter tuning. This book systematically reviews  
the large body of literature on applying statistical language models  
to information retrieval with an emphasis on the underlying  
principles, empirically effective language models, and language models  
developed for non-traditional retrieval tasks. All the relevant  
literature has been synthesized to make it easy for a reader to digest  
the research progress achieved so far and see the frontier of research  
in this area. The book also offers practitioners an informative  
introduction to a set of practically useful language models that can  
effectively solve a variety of retrieval problems. No prior knowledge  
about information retrieval is required, but some basic knowledge  
about probability and statistics would be useful for fully digesting  
all the details.

Table of Contents: Introduction / Overview of Information Retrieval  
Models / Simple Query Likelihood Retrieval Model / Complex Query  
Likelihood Model / Probabilistic Distance Retrieval Model / Language  
Models for Special Retrieval Tasks / Language Models for Latent Topic  
Analysis / Conclusions

http://dx.doi.org/10.2200/S00158ED1V01Y200811HLT001

This title is available online without charge to members of  
institutions that have licensed the Synthesis Digital Library of  
Engineering and Computer Science.  Members of licensing institutions  
have unlimited access to download, save, and print the PDF without  
restriction; use of the book as a course text is encouraged.  To find  
out whether your institution is a subscriber, visit  
<http://www.morganclaypool.com/page/licensed>, or just click on the book's URL above from an institutional IP  
address and attempt to download the PDF.  Others may purchase the book  
from this URL as a PDF download for US$30 or in print for US$40.   
Printed copies are also available from Amazon and from booksellers  
worldwide at approximately US$40 or local currency equivalent.
