Book: Linguistic Annotation and Text Analytics

Fri Dec 11 21:10:33 UTC 2009

Date: Wed, 9 Dec 2009 15:49:47 -0500
From: Graeme Hirst <gh at cs.toronto.edu>
Message-Id: <95A7CD24-33C0-4527-A26E-E1F36D9E919F at cs.toronto.edu>

BOOK ANNOUNCEMENT

Introduction to Linguistic Annotation and Text Analytics

Graham Wilcock (University of Helsinki)

Synthesis Lectures on Human Language Technologies #3 (Morgan &
Claypool Publishers), 2009, 159 pages

Linguistic annotation and text analytics are active areas of research  
and development, with academic conferences and industry events such as  
the Linguistic Annotation Workshops and the annual Text Analytics  
Summits. This book provides a basic introduction to both fields, and  
aims to show that good linguistic annotations are the essential  
foundation for good text analytics. After briefly reviewing the basics  
of XML, with practical exercises illustrating in-line and stand-off  
annotations, a chapter is devoted to explaining the different levels  
of linguistic annotations. The reader is encouraged to create example  
annotations using the WordFreak linguistic annotation tool. The next  
chapter shows how annotations can be created automatically using  
statistical NLP tools, and compares two sets of tools, the OpenNLP and  
Stanford NLP tools. The second half of the book describes different  
annotation formats and gives practical examples of how to interchange  
annotations between different formats using XSLT transformations. The  
two main text analytics architectures, GATE and UIMA, are then  
described and compared, with practical exercises showing how to  
configure and customize them. The final chapter is an introduction to  
text analytics, describing the main applications and functions  
including named entity recognition, coreference resolution and  
information extraction, with practical examples using both open-source
and commercial tools. Copies of the example files, scripts, and  
stylesheets used in the book are available from the companion website,  
located at http://sites.morganclaypool.com/wilcock.

Table of Contents: Working with XML / Linguistic Annotation / Using  
Statistical NLP Tools / Annotation Interchange / Annotation  
Architectures / Text Analytics

http://dx.doi.org/10.2200/S00194ED1V01Y200905HLT003

This title is available online without charge to members of
institutions that have licensed the Synthesis Digital Library of
Engineering and Computer Science.  Members of licensing institutions
have unlimited access to download, save, and print the PDF without
restriction; use of the book as a course text is encouraged.  To find
out whether your institution is a subscriber, visit
http://www.morganclaypool.com/page/licensed , or just click on the
book's URL above from an institutional IP address and attempt to
download the PDF.  Others may purchase the book from this URL as a PDF
download for US$30 or in print for US$40.  Printed copies are also
available from Amazon and from booksellers worldwide at approximately
US$40 or local currency equivalent.
-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription : http://www.atala.org/article.php3?id_article=48
Archives                  : http://listserv.linguistlist.org/archives/ln.html
                            http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------