25.3094, Review: Computational Linguistics

Wed Jul 30 21:55:34 UTC 2014

LINGUIST List: Vol-25-3094. Wed Jul 30 2014. ISSN: 1069 - 4875.

Moderators: Damir Cavar, Eastern Michigan U <damir at linguistlist.org>
            Malgorzata E. Cavar, Eastern Michigan U <gosia at linguistlist.org>

Reviews: reviews at linguistlist.org
Anthony Aristar <aristar at linguistlist.org>
Helen Aristar-Dry <hdry at linguistlist.org>
Mateja Schuck, U of Wisconsin Madison

Homepage: http://linguistlist.org

Editor for this issue: Malgorzata Cavar <gosia at linguistlist.org>
================================================================  

Date: Wed, 30 Jul 2014 17:54:39
From: Mauro Costantino [costantino.mauro at gmail.com]
Subject: The Handbook of Computational Linguistics and Natural Language Processing

Book announced at http://linguistlist.org/issues/24/24-284.html

EDITOR: Alexander Clark
EDITOR: Chris Fox
EDITOR: Shalom Lappin
TITLE: The Handbook of Computational Linguistics and Natural Language Processing
SERIES TITLE: Blackwell Handbooks in Linguistics
PUBLISHER: Wiley-Blackwell
YEAR: 2012

REVIEWER: Mauro Costantino, Universidad Mayor de San Andrés

SUMMARY

“The Handbook of Computational Linguistics and Natural Language Processing,”
edited by Alexander Clark, Chris Fox and Shalom Lappin, is a large collection
of 22 works that covers the field of Computational Linguistics (CL) and
Natural Language Processing (NLP), ranging from the theoretical aspects
(formal language theory, language models, among others) to the most concrete
applications (machine translation, question answering). The coverage is so
broad that the work can be considered a fundamental volume collecting a
comprehensive view of applications, methodologies and base theories in the
field of CL and NLP.

For the same reason, it can serve as a reference manual for a wide audience,
even though not all readers will be interested in, or specialists in, every
aspect it covers; each chapter gives the theoretical basis, the historical
background and an overview of the state of the art for its topic. In what
follows, the organization of the handbook is detailed chapter by chapter.

In the introduction, the structure of the manual is presented in order to
offer the reader a clear map (section by section and chapter by chapter) of
the specific contents.  The chapter ends by presenting the goals and aims of
the manual, and explaining the reasons for its particular organization, the
choice of topics and the development of each chapter.

Chapter 1: “Formal Language Theory” (by Shuly Wintner).

The chapter starts with a basic introduction to formal languages that does
not assume prior knowledge of the topic; it is nonetheless advisable, in order
to follow it, to have some familiarity with the basic notation and operations
involved (of a mathematical and logical nature). The chapter then covers
Regular Languages, Finite State Automata, Transducers, Context Free Languages
and the Chomsky Hierarchy.

Chapter 2: “Computational Complexity in Natural Language” (by Ian
Pratt-Hartmann).

Pratt-Hartmann starts with a review of Complexity theory, stating its goals
and presenting the basic methodology. Solid knowledge of mathematics and
logic is advisable, even though the chapter provides a step-by-step
introduction to the topic. Turing machines, decision problems, parsing and
recognition, and semantics are presented through the analysis of theorems and
definitions, each one with detailed examples.

Chapter 3: “Statistical Language Modeling” (by Ciprian Chelba).

The third chapter starts with the very first steps of Language Modeling (LM),
presenting the chain rule and n-grams, and then discusses perplexity, before
moving smoothly on to the Structured Language Model and its applications in
Speech Recognition.
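
To make the notions named above concrete, here is a minimal sketch (not taken
from the handbook) of a bigram model scored by perplexity. The chain rule
factors a sentence probability into a product of conditional word
probabilities, and the bigram assumption truncates each history to one word.
The toy corpus, the function names and the use of add-one smoothing are all
assumptions made for illustration; test sentences are assumed to contain only
words seen in training.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams, padding each sentence with <s> and </s>."""
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = {"</s>"}  # every symbol that can be predicted
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(sent)
        for prev, cur in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return history_counts, bigram_counts, vocab


def bigram_prob(prev, cur, history_counts, bigram_counts, vocab):
    """Add-one (Laplace) smoothed estimate of P(cur | prev)."""
    return (bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + len(vocab))


def perplexity(sentences, history_counts, bigram_counts, vocab):
    """exp of the average negative log-probability per predicted token."""
    log_prob = 0.0
    n_tokens = 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            p = bigram_prob(prev, cur, history_counts, bigram_counts, vocab)
            log_prob += math.log(p)
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)


# Hypothetical toy corpus, for illustration only.
train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
model = train_bigram(train)
pp_seen = perplexity([["the", "cat", "sat"]], *model)
pp_shuffled = perplexity([["sat", "cat", "the"]], *model)
print(pp_seen, pp_shuffled)
```

A word order seen in training receives a lower perplexity than the same words
shuffled, which is the sense in which perplexity measures how well a model
predicts held-out text.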

Chapter 4: “Theory of Parsing” (by Mark-Jan Nederhof and Giorgio Satta).

The fourth chapter presents the theoretical bases of Parsing, phrase structure
and dependency structure, Probabilistic Parsing and (Lexicalized) Context Free
Grammars, leading into a discussion, detailed and rich in examples, of some
basics of the most common applications like Translation.

Chapter 5: “Maximum Entropy Models” (by Robert Malouf).

Robert Malouf presents, before entering into a discussion about practical
applications, the theoretical basis for the Maximum Entropy Model (MaxEnt).
The chapter deals with the theoretical development of MaxEnt, moving from
Shannon through probabilities in order to bring the reader to the
applications: Parameter Estimation, Regularization, Classification and
Parsing, among others.

Chapter 6: “Memory-Based Learning” (by Walter Daelemans and Antal van den
Bosch).

Memory-Based Learning (MBL) is presented alongside other methods (MaxEnt,
Decision Trees, Artificial Neural Networks) for supervised
classification-based learning in the sixth chapter. The chapter follows up
with a discussion of some NLP applications like Morpho-phonology,
Syntacto-semantics, Text analysis, Translation, and Computational
Psycholinguistics.

Chapter 7: “Decision Trees” (by Helmut Schmid).

Schmid presents another method for annotating linguistic entities through
classification: decision trees. The chapter explains through examples how
Decision Trees are induced from training data, and then moves to the
applications, like Grapheme-to-phoneme conversion and POS-tagging. The
chapter ends with a discussion of the advantages and disadvantages of Decision
Trees.

Chapter 8: “Unsupervised Learning and Grammar Induction” (by Alexander Clark
and Shalom Lappin).

This chapter addresses two main aspects of Unsupervised Learning: the
advantages and disadvantages of unsupervised learning applications to large
corpora, and the possible relevance of unsupervised learning for the debate
about the cognitive basis of human language acquisition.  The topics are
presented in an accurate manner, discussing the comparison between supervised
and unsupervised learning. Examples in classification tasks and parsing are
presented. The last section of the first part of the chapter compares
supervised, unsupervised, and semi-supervised learning, taking into account
the “accuracy vs. cost” dichotomy, and also discussing the possibilities for
future developments of the field.

The second part of the chapter, discussing the new insights that unsupervised
learning has brought to human language acquisition studies, presents a broad
vision of the state of the art in human language acquisition.

Chapter 9: “Artificial Neural Networks” (by James B. Henderson).

The chapter starts with an introductory background section that presents
Artificial Neural Networks (ANN) and Multi-Layered Perceptrons (MLP), the most
commonly-used type of ANN in NLP, and statistical modeling. It then moves on
to contemporary research in NLP like the improvement of large n-grams, parsing
(constituency, dependency, functional and semantic role parsing), and tagging,
discussing advantages and disadvantages of ANN and SLM.

Chapter 10: “Linguistic Annotation” (by Martha Palmer and Nianwen Xue).

This chapter presents Linguistic Annotation starting from the early times of
the Penn Treebank and SemCor, to the British National Corpus, and to
present-day work in annotation; the discussion also touches on different
schemes, presenting “a representative set of widely used resources” (p. 239)
such as Syntactic structure, Independent semantic classification, Semantic
relation labeling, Discourse relation, Temporal relation, Coreference, and
Opinion tagging.

The second part of the chapter deals with the annotation process, analyzing it
step by step from the choice of the target corpus to the study of efficiency
and consistency of annotation, to the presentation of the possible
infrastructures and the available tools, and concluding with evaluation and
pre-processing.

Chapter 11: “Evaluation of NLP Systems” (by Philip Resnik and Jimmy Lin).

The chapter starts with a broad discussion presenting some fundamental
concepts of NLP system evaluation (Automatic/manual evaluation,
Formative/summative evaluation, Intrinsic/extrinsic evaluation,
Component/end-to-end evaluation, Inter-annotator agreement and upper bounds),
then moves on to the partitioning of data and the advantages of
cross-validation, and closes the section with a summary of the evaluation
metrics and a comparison of their performance. The following part of the
chapter offers an introduction to the three NLP evaluation categories (a
single correct output, several possible outputs, outputs on a scale). The
chapter ends with two case studies, both well explained and detailed, that
give the reader a quick and concrete reference for the previously explained
theory.

Chapter 12: “Speech Recognition” (by Steve Renals and Thomas Hain).

The chapter deals with Automatic Speech Transcription, starting from
statistical frameworks and the usage of corpora for the development and
evaluation of the algorithm. After the statistical section, the authors focus
on the Acoustic generative modeling of p(X|W) and approach modeling through
Hidden Markov Models. The last section deals with the decoding issue (search)
and the maximization of the computed probability through the Viterbi
algorithm. The chapter ends with the analysis of a case study and the study of
the performance of present-day systems, their strengths and their weaknesses.
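
As an illustration of the decoding step mentioned above, the following is a
minimal sketch of the Viterbi algorithm for a small discrete Hidden Markov
Model. It is not the chapter's own formulation: real recognizers use
continuous acoustic models and far larger search spaces, and the two states,
the three observation symbols and all probabilities below are invented for the
example. All probabilities are assumed to be non-zero so that logarithms can
be taken.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, log_prob) for a discrete HMM, computed in log space."""
    # best[t][s]: log-probability of the best state sequence ending in s at time t
    best = [{
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }]
    back = [{}]  # back[t][s]: predecessor of s on that best sequence
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Pick the best final state, then follow the back-pointers to the start.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model: two states and three quantized energy levels.
states = ["sil", "speech"]
start_p = {"sil": 0.6, "speech": 0.4}
trans_p = {
    "sil": {"sil": 0.7, "speech": 0.3},
    "speech": {"sil": 0.4, "speech": 0.6},
}
emit_p = {
    "sil": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "speech": {"low": 0.1, "mid": 0.3, "high": 0.6},
}
path, log_prob = viterbi(["low", "mid", "high"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

Working in log space avoids numerical underflow on long observation
sequences, which is why sums of logarithms replace products of probabilities
here.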

Chapter 13: “Statistical Parsing” (by Stephen Clark).

The chapter starts by introducing some baseline questions about the grammar,
the algorithm, the model and the choice of the best parses from a theoretical
point of view. It then presents an historical review of the topic (beginning
with the very first attempts in Sampson 1986, down to present-day works).

The author focuses next on Generative (with special attention to Collins
models) and Discriminative parsing models. The author then analyzes in detail
Transition based approaches presenting various examples in the literature, and
concludes the study of Statistical Parsing with Combinatory Categorial
Grammar.

Chapter 14: “Segmentation and Morphology” (by John A. Goldsmith).

Goldsmith starts by presenting the basic definitions of morphophonology,
morphosyntax, and morphological decomposition as a brief overview. The chapter
goes on with more technical NLP insights, discussing Unsupervised Learning of
Words and “four major approaches” (p. 373), namely Olivier, MK10, Sequitur and
MDL. The following section presents Unsupervised Learning of Morphology from
the beginning of the studies in the 1950s with Zellig Harris to present-day
works. The chapter ends with a discussion about the Implementation of
Computational Morphologies, the usage of Finite State Transducers and the case
of morphophonology.

Chapter 15: “Computational Semantics” (by Chris Fox).

The chapter, after stating the difference between formal semantics and
computational semantics, moves on to formal theory and logical grammar, in
order to present background on the computability of semantics and different
approaches. The author goes on by presenting the state of the art as
propaedeutical material for the next section about research issues such as
intentionality, non-indicatives, and expressiveness, among others. The chapter
ends with a less theoretical topic, namely corpus-based and Machine learning
methods in computational semantics, thus putting some distance between the
more classical strictly formal logic approach and computational semantics.

Chapter 16: “Computational Models of Dialogue” (by Jonathan Ginzburg and
Raquel Fernández).

This chapter starts with a discussion of the basic characteristics of dialogue
and its structural peculiarities, in order to define the methodological
challenges of computational modeling of dialogue. Once the theoretical
questions are settled, the authors present approaches to Dialog System Design
and evaluation through comparison (query and assertion,
meta-communication, fragment understanding benchmarks). The second part of the
chapter is dedicated to Interaction and Meaning (Coherence, Cohesion,
Illocutionary interaction, query and assertion, etc.) and to the models for
automatic learning of dialogue management (based on Markov Decision
Processes). It presents “the underlying logical framework [...] [that]
provides the formalism to build a semantic ontology and write conversational
and grammar rules” (p. 453). The chapter ends with “Extensions”, offering
suggestions for further development of the topics treated that could not find
space in the manual.

Chapter 17: “Computational Psycholinguistics” (by Matthew W. Crocker).

The chapter opens with an introduction to the topic that establishes the
scope and limits of the very term “computational psycholinguistics”, in order
to lay the basis for the entire chapter. A discussion of Symbolic Models
follows, starting from the first examples of computational parsing models in
the 1980s, then continuing into a section dealing with Probabilistic Models
(touching on lexical and semantic ambiguity, syntactic processing, and
disambiguation issues, among others). The Sentence Processing section presents
the application of Artificial Neural Networks (here called Connectionist
networks), discussing advantages and criticism and leading into a presentation
of Hybrid Models.

Chapter 18: “Information Extraction” (by Ralph Grishman).

The first chapter of the “Applications” part deals with Information
Extraction (IE), and, after a short historical overview, presents its four
main tasks: name extraction, entity extraction, relation extraction, and event
extraction. For each one of the four sections, the discussion starts from the
analysis of some of the first approaches to IE with hand-written rules and
with Named Entity tagged corpora for supervised learning, and reaches the
presentation of the state-of-the-art methodological approaches in IE.

Chapter 19: “Machine Translation” (by Andy Way).

As stated in the introductory remarks, the chapter is divided into two parts,
one presenting the “state of the art in Machine Translation (MT)” and the
second presenting research in hybrid MT (p. 531). The first part jumps, in
fact, directly into current MT, avoiding the historical background, and
directly addressing the Phrase-Based Statistical Machine Translation (PB-SMT),
thus presenting all the steps for the development of a corpus-based system
(pre-processing data, clean-up, segmentation, tokenization, word/phrase
alignment, language models, decoding, among others). This thorough section
ends by discussing various approaches to evaluation in MT. The next section
discusses some of the currently developed (or under development) alternatives
to PB-SMT, such as Hierarchical Models, Tree-based Models, Example-based MT,
Rule-based MT and hybrid methods. The second part of the chapter details
research at Dublin City University (DCU) in the field of MT, presenting work
done in many directions, combining syntax-driven SMT, hybrid statistical and
EBMT, tree-based MT, rule based and much more.

Chapter 20: “Natural Language Generation” (by Ehud Reiter).

The twentieth chapter starts with a brief introduction to Natural Language
Generation (NLG) and choice making. The subsequent section discusses the
problem through the analysis of two NLG systems: SumTime and SkillSum. After a
review of some other alternatives to these two, the chapter continues by
breaking the task of NLG down into its basic steps: document planning (choice
making issues), microplanning (lexical choice, reference, syntactic choice,
aggregation) and realization. The chapter ends with a detailed discussion
of evaluation for NLG systems and an overview of research topics currently
under development (statistical NLG, affective NLG). The closing section lists
some of the resources available in NLG, such as software, data resources and
further readings.

Chapter 21: “Discourse Processing” (by Ruslan Mitkov).

Mitkov starts with a practical approach to the basic notion of discourse, by
presenting an example-based discussion of the coherence-cohesion dichotomy and
the different types of discourse. The second section deals with Discourse
Structure: organization and a segmentation algorithm (TextTiling). The
subsequent part of the chapter goes into details, analyzing Hobbs' theory of
coherence, Mann and Thompson's Rhetorical Structure Theory (Mann & Thompson,
1988) and Centering (Grosz et al. 1995). The fourth section deals with
anaphora resolution, starting from the basic definition of anaphora and
reference, then moving to the computational problem of anaphora resolution and
the related algorithm (full parsing, partial parsing and their comparison).
The chapter ends with a panoramic view of applications in discourse processing
(in discourse segmentation, discourse coherence and anaphora resolution). A
“further reading” section closes the chapter with a rich presentation of
possible extensions and developments, from the point of view of both the
statistical approach and the corpus-based approach.

Chapter 22: “Question Answering” (by Bonnie Webber and Nick Webb).

The authors start with a review of Question Answering (QA) systems from their
first steps until state-of-the-art implementations. The discussion analyzes
the different steps of question typing, query construction, text retrieval and
text processing for answer candidates, and evaluation through examples; it
goes on with a theoretical development of the topics. The second part of the
chapter considers the developments QA is currently addressing. One of the
topics is corpus-related research aimed at improving performance on the
“understanding the question” problem; the subsequent sections focus instead on
improving the choice of answers through user analysis, by analyzing how
different users might judge different answers as correct, or by resolving the
semantic ambiguity of the questions. The chapter
closes with a discussion on QA systems evaluation, concentrating on the
possible need for new and better evaluation methods for QA systems.

EVALUATION

The book is a wonderful work both from the point of view of content and form.
Compared to other manuals, it probably covers the broadest panorama in
state-of-the-art NLP and CL, thus becoming (one of) the most complete manuals
on these areas.

Because of the aim of covering such a broad field as NLP and CL, some chapters
might seem a bit loosely related to one another. This is inevitable in a work
organized in 22 chapters that covers something of such breadth as (almost all
of) NLP.

Even though the topics of the chapters range from Formal Language Theory to
Machine Learning, to Morphophonology, to Parsing, the structure of the manual
itself is solid and the work is well organized. In some cases, a stronger set
of cross references could have added to usability, even though the direct
linkage between the main topics across the chapters is always present.

Among the book’s qualities, besides its completeness and the wide range of
topics treated, other points of strength should be mentioned: the way the
chapters consistently develop theory and practice in parallel is definitely a
plus, and for the majority of the topics it makes for a smooth “crescendo”.
Nonetheless, it could be noted for some chapters that the jump from theory to
applications might be somewhat rough or abrupt for somebody who is unfamiliar
with the topic.

It is difficult to find negative points in the work; something that might be
observed, more from an editorial point of view, is that the reference section
is condensed at the end of the book, resulting in a somewhat cumbersome 86
double-column pages. Taking into account the broad coverage of the 22
chapters, it would be easier for the user to search through references if they
appeared at the end of each chapter, thus limiting the search to the topic
the reader is interested in. It is nonetheless understandable that this choice
would have led to considerable redundancy in some cases, and therefore
it might simply be a space issue.

For the same reason, the editors omit some potentially useful tools, like a
list of formulas and equations (maybe also containing algorithms), and even a
list of acronyms, which might have greatly increased the usability of the
manual. It should be kept in mind, in fact, that it will probably serve as a
reference manual, not necessarily a book to be read from beginning to end. On
the plus side, the manual provides a complete List of Figures, a List of
Tables, an Author Index and an even more useful Subject Index to compensate
for the unavoidable density of the chapters.

The last observation concerns some differences between the individual
chapters, whose structure sometimes varies a little. A slightly firmer
template, with overview, state-of-the-art and further reading sections for
all chapters, could have helped to give a more uniform micro-structure, thus
improving usability and decreasing search time. Again, this is not a
content issue, since each chapter presents all this information; it is just
organized (or named) in a slightly different way.

The overall evaluation is therefore definitely very good: the work is solid,
complete and definitely an important reference for NLP and CL.

REFERENCES

Grosz, Barbara J., Aravind K. Joshi & Scott Weinstein. 1995. Centering: a
framework for modelling the local coherence of discourse. Computational
Linguistics 21(2):203-225.

Mann, William C. & Sandra A. Thompson. 1988. Rhetorical Structure Theory:
towards a functional theory of text organization. Text 8(3):243-281.

Sampson, Geoffrey. 1986. A Stochastic Approach to Parsing, in Proceedings of
the 11th International Conference on Computational Linguistics, 151-5.

ABOUT THE REVIEWER

Mauro Costantino is an invited professor at the Universidad Mayor de San
Andrés (UMSA) of La Paz and at the Universidad Pública de El Alto (UPEA). His
main interests range from Second Language Acquisition, comparing the
acquisition of the Italian verb system by speakers of different languages, to
Translation Studies, to corpus linguistics (focusing on learner corpora). He
teaches Italian, a translation seminar and an introduction to computational
and corpus linguistics at UMSA, as well as organizing new introductory
experimental seminars in computational and corpus linguistics at UPEA. Besides
actively cooperating with the VALICO (www.valico.org) and VALERE
(www.valere.org) projects of the University of Torino (Italy), he is working
on various projects (one in translation and one in corpus implementation) with
the Literature Department at UMSA. In his “free time” he is a translator and
general secretary of the Società Dante Alighieri of La Paz, Bolivia.
