35.200, Review: Explainable Natural Language Processing: Søgaard (2021)



LINGUIST List: Vol-35-200. Tue Jan 16 2024. ISSN: 1069 - 4875.

Subject: 35.200, Review: Explainable Natural Language Processing: Søgaard (2021)

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org


Editor for this issue: Justin Fuller <justin at linguistlist.org>
================================================================


Date: 16-Jan-2024
From: Viatcheslav Yatsko [iatsko at gmail.com]
Subject: Computational Linguistics: Søgaard (2021)


Book announced at https://linguistlist.org/issues/32.3671

AUTHOR: Anders Søgaard
TITLE: Explainable Natural Language Processing
SERIES TITLE: Synthesis Lectures on Human Language Technologies
PUBLISHER: Morgan & Claypool Publishers
YEAR: 2021

REVIEWER: Viatcheslav Yatsko

SUMMARY

This book focuses on a new subfield of NLP that has developed rapidly
over the last decade. The wide use of neural networks and deep
learning models, which are essentially black boxes, gave rise to
Explainable Natural Language Processing, which aims to explain to the
user how a system arrived at a given result. Such explanations are
important both for building trust in the system's functioning and for
obtaining feedback that can improve its quality [1]. The author
presents a taxonomy of approaches to explainable NLP in order to
accelerate progress in this emerging subfield.

The book comprises an Introduction (which the author also calls
"Chapter 1") and twelve further chapters, the last of which collects
links to programming resources (Chapter 13), as well as an extensive
bibliography of work on the subfield's problems.

In the Introduction the author shows the importance of taxonomies for
structuring the newly formed subfield and also reviews existing
approaches to Explainable NLP. He draws two main distinctions among
explanation methods: local vs. global and forward vs. backward. Local
methods are typically interested in behavior on specific samples;
global methods typically train new parameters on larger samples to
evaluate the learned representations globally. Local methods are often
used to explain the motivation behind critical decisions (e.g., why a
customer was assessed as high risk), whereas global methods are used
to characterize biases in models and evaluate their robustness.

Some methods rely on forward passes over the parameters, while others
rely on backward passes. An interpretability method is said
to be backward if it relies solely on quantities derived from one or
more backward passes through the instances; otherwise, if it relies on
quantities from forward passes, it is said to be forward. Highlighting
which parts of an input are most responsible for a prediction clearly
falls into the class of local approaches that rely on backward passes.
In contrast, attention head pruning falls into the class of global
approaches that focus on forward passes.
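
To make the distinction concrete, here is a minimal Python sketch (my
own illustration, not from the book; the toy model and input are
hypothetical) of the two kinds of quantities such methods inspect:

    import torch
    import torch.nn as nn

    # A hypothetical toy classifier; only the kinds of quantities matter here.
    model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
    x = torch.randn(1, 10, requires_grad=True)

    # Forward quantities: activations computed on the way to a prediction,
    # e.g., the intermediate representation after the first layer.
    hidden = model[1](model[0](x))
    logits = model[2](hidden)

    # Backward quantities: gradients from a backward pass through the
    # network, here the gradient of the top score w.r.t. the input.
    logits[0, logits.argmax()].backward()
    saliency = x.grad
    print(saliency)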

Chapter 2 "A Framework for Explainable NLP" first focuses on standard
architectures in NLP that underlie the explanation process and then
introduces a framework for the taxonomy of explanation methods. The
author briefly describes linear classification, nonlinear
classification, recurrent models, and transformers to concentrate on
the specific features of global and local explanations that can rely
on backward or forward passes through the neural networks they seek to
explain. Local-forward methods are further subdivided according to
whether they explain a model through (a) its intermediate
representations, (b) its continuous output, or (c) its discrete
output; global-forward methods are subdivided in the same way.
Together with the local-backward and global-backward classes, this
yields eight categories in total. The chapter comprises six sections:
NLP Architectures, Local and Global Explanations, Backward Methods,
Forward Explaining by Intermediate Representations, Forward Explaining
by Continuous Outputs, and Forward Explaining by Discrete Outputs.
The next chapters of the book (Chapters 3-10) describe the eight
categories in detail.

Chapter 3 focuses on local-backward explanations that use training
signals or training dynamics to directly explain model decisions. The
explanations are direct, in the sense that they do not require the
induction of additional parameters, and local, in that they do not aim
to generalize across representative samples of data: each data point
is explained on its own terms. The chapter has six sections: Vanilla
Gradients, Guided Back Propagation, Layer-Wise Relevance Propagation,
Deep Taylor Decomposition, Integrated Gradients, and DeepLift.
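
All six methods build on the same core operation: a backward pass from
a single prediction. As a minimal illustration (mine, not the book's),
a Python sketch of vanilla gradient saliency, with a hypothetical toy
classifier and embedding layer standing in for a real model:

    import torch
    import torch.nn as nn

    # Hypothetical sentiment classifier over word embeddings.
    emb = nn.Embedding(5000, 50)
    clf = nn.Sequential(nn.Flatten(), nn.Linear(6 * 50, 2))

    tokens = torch.tensor([[12, 7, 431, 9, 88, 3]])   # one six-token input
    vectors = emb(tokens).detach().requires_grad_(True)
    logits = clf(vectors)

    # Vanilla gradients: d(predicted score)/d(input embeddings).
    logits[0, logits.argmax()].backward()

    # Per-token relevance: L2 norm of each token's gradient vector.
    relevance = vectors.grad.norm(dim=-1)             # shape (1, 6)
    print(relevance)  # higher norm = token more responsible for prediction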
Chapter 4 describes global-backward explanations, concentrating on
pruning methods that remove gates, attention heads, or input tokens.
The author distinguishes between two types of pruning methods:
unstructured pruning methods, which prune weights one at a time,
disregarding the overall structure of the network, and structured
pruning methods, which prune weights in groups defined by the neural
network architecture. Attention head pruning is an example of the
latter. While local methods can be used to identify candidate weights
to prune, all pruning methods are global, since they change the set of
model parameters. Pruning methods also differ in when weights are
pruned (before, during, or after training), whether multiple
iterations of pruning are performed, and whether candidate weights are
identified by raw magnitudes, by gradients, or are somehow learned.
The chapter has four sections: Post-hoc Unstructured Pruning, Lottery
Tickets, Dynamic Sparse Training, and Binary Networks and Sparse
Coding.
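
As a rough illustration of the first of these sections (my own sketch,
not the book's code), post-hoc unstructured magnitude pruning in
PyTorch on a hypothetical trained layer:

    import torch
    import torch.nn as nn

    def magnitude_prune(layer: nn.Linear, sparsity: float) -> None:
        """Post-hoc unstructured pruning: zero the smallest-magnitude weights."""
        w = layer.weight.data
        k = max(1, int(sparsity * w.numel()))
        threshold = w.abs().flatten().kthvalue(k).values
        w[w.abs() <= threshold] = 0.0

    # Hypothetical trained layer; in practice this runs after training.
    layer = nn.Linear(100, 100)
    magnitude_prune(layer, sparsity=0.9)
    print((layer.weight == 0).float().mean())  # ~0.9 of weights removed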
Chapter 5 deals with local-forward explanations of intermediate
representations, focusing on methods to visualize or interpret gates
and attention in recurrent architectures, as well as attention in
transformer architectures. The chapter contains five sections: Gates,
Attention, Attention Roll-out and Attention Flow, Layer-wise Attention
Tracing, and Attention Decoding.
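
A minimal sketch (mine, not the book's) of extracting the attention
weights that such methods visualize, using an untrained single
attention layer as a hypothetical stand-in for a trained model:

    import torch
    import torch.nn as nn

    tokens = ["the", "movie", "was", "surprisingly", "good"]
    mha = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
    x = torch.randn(1, len(tokens), 32)   # hypothetical token representations

    # need_weights=True returns the attention distribution over tokens,
    # averaged over heads; shape (1, 5, 5).
    _, attn = mha(x, x, x, need_weights=True)

    # Row i: how much token i attends to every other token.
    for i, tok in enumerate(tokens):
        print(tok, attn[0, i].tolist())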
Chapter 6 describes global-forward explanations of intermediate
representations, concentrating on pruning strategies, which are
classified into (a) methods for obtaining simpler models that are more
likely to be comprehended holistically, and (b) ways to evaluate local
explainability methods based on training dynamics. The chapter
includes two sections: Gate Pruning and Attention Head Pruning.
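
To illustrate attention head pruning (a sketch under my own
assumptions, not the book's implementation): zeroing the
output-projection slice of one head removes that head from the model
entirely, which is why such pruning counts as global:

    import torch
    import torch.nn as nn

    # A hypothetical one-layer setting; in practice heads are pruned in a
    # trained transformer after measuring their importance.
    mha = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
    x = torch.randn(1, 5, 32)
    head, head_dim = 2, 32 // 4

    with torch.no_grad():
        out_before, _ = mha(x, x, x)
        # Structured pruning: zero the output-projection columns of one
        # head, removing that head's contribution to every prediction.
        mha.out_proj.weight[:, head * head_dim:(head + 1) * head_dim] = 0.0
        out_after, _ = mha(x, x, x)

    # The change is global: the model itself is altered, for all inputs.
    print((out_before - out_after).abs().max())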
Chapter 7 concerns local-forward explanations of continuous output,
presenting different techniques for interpreting neural networks at
the level of input encodings or continuous output vectors. Neural
networks produce text encodings, i.e., continuous output vectors,
which can be used directly for a range of tasks, e.g., synonymy
detection, word alignment, bilingual dictionary induction, sentence
retrieval, and document retrieval. The chapter has three sections:
Word Association Norms, Word Analogies, and Time Step Dynamics.
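
The word-analogy probe in particular has a very compact form; a toy
sketch (hypothetical two-dimensional vectors stand in for a model's
learned embeddings):

    import numpy as np

    # Toy embedding table; in practice these are a model's learned vectors.
    vocab = {"king": [0.9, 0.8], "queen": [0.9, 0.2],
             "man":  [0.1, 0.8], "woman": [0.1, 0.2]}
    E = {w: np.array(v) for w, v in vocab.items()}

    # Analogy probe: king - man + woman should land nearest to queen.
    target = E["king"] - E["man"] + E["woman"]

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    best = max((w for w in E if w not in {"king", "man", "woman"}),
               key=lambda w: cosine(E[w], target))
    print(best)  # "queen" if the space encodes the analogy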
Chapter 8 covers global-forward explanations of continuous output.
Neural networks produce vectors, or representations; on a sample of
input examples, they produce distributions of vectors. Such point
clouds of vectors can also be interpreted. The author shows that,
given two such clouds, it is possible (1) to quantify the extent to
which they are structurally similar, (2) to learn clusters of vectors
and analyze the clusters manually, and (3) to use these to compute
functions that extract influential data points for the test examples.
The chapter has five sections: Correlation of Representations,
Clustering, Probing Classifiers, Concept Activation, and Influential
Examples.
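
Among these, probing classifiers are perhaps the easiest to sketch:
train a simple classifier on frozen representations and read its
accuracy as evidence that a property is encoded. A minimal sketch with
synthetic stand-in vectors (my assumption, not the book's data):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Hypothetical frozen representations; a real probe uses vectors from
    # a trained encoder, labeled for some linguistic property (e.g., tense).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 64))
    y = (X[:, :8].sum(axis=1) > 0).astype(int)   # property partly encoded in X

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    # High held-out accuracy suggests the representations encode the property.
    print(probe.score(X_te, y_te))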
Chapter 9 discusses local-forward explanations of discrete output,
including the local explanation method LIME and its descendants. These
methods perform uptraining on perturbations of a single example; this
approximates the decision boundary locally, but differs from standard
uptraining in that it does not rely on random samples. The chapter has
three sections: Challenge Datasets, Local Uptraining, and Influential
Examples.
Chapter 10 describes global-forward explanations of discrete output.
It discusses a strategy for learning simple approximations of neural
networks called uptraining: training a simple model on the output of a
more complex model "h" in order to obtain enough supervision to learn
a good approximation of "h". The chapter contains three sections:
Uptraining, Meta-Analysis, and Downstream Evaluation.
In Chapter 11, "Evaluating Explanations", the author discusses the
different evaluation methodologies that have been proposed in the
field and suggests using heuristics, human annotations, and human
experiments for evaluation. The chapter includes four sections:
Flavors of Explanations, Heuristics, Human Annotations, and Human
Experiments.

Chapter 12, "Perspectives", contains eleven 'Observations', i.e.,
axiomatic statements such as "Local methods can be applied globally,
whereas global methods cannot be applied locally", which give general
information about entire classes of methods, including methods
practitioners may not yet be aware of or that have yet to be proposed.

I was unpleasantly surprised to find that the bibliography items are
sorted in order of their appearance rather than alphabetically. A
reader using the paper version is sure to have difficulty finding the
items in which he or she is interested.



EVALUATION

Anders Søgaard's book is a valuable resource that can serve as a guide
to the emerging field of Explainable NLP. The work has a logical and
consistent structure and is illustrated with tables and examples. I
cannot help but agree with the author about the importance of
taxonomies, as I have myself made several attempts to classify NLP
technologies; see, e.g., [3]. Nevertheless, the author never gives a
definition of explainable NLP. Based on the author's work and on
[2, 3], I will take the liberty of formulating the following
definition: Explainable natural language processing is a branch of
computational linguistics that focuses on the interpretation of
black-box techniques in order to provide important feedback to the
user. "Black-box techniques" in this definition mostly involve the use
of deep learning models and language embeddings as features;
"interpretation" usually consists in assessing the efficiency of the
techniques so as to help users build trust in NLP systems and obtain
important feedback to improve their quality.


REFERENCES
1. Danilevsky, M., Qian, K., Aharonov, R., Katsis, Y., Kawas, B., and
Sen, P. (2020) A survey of the state of explainable AI for natural
language processing. In: Proceedings of the 1st Conference of the
Asia-Pacific Chapter of the Association for Computational Linguistics
and the 10th International Joint Conference on Natural Language
Processing, pages 447-459.
2. Tiya, V. (2022) Explainable NLP: Why do we need this? URL:
https://www.linkedin.com/pulse/explainable-nlp-why-do-we-need-tiya-vaj
3. Yatsko, V.A. (2020) The criteria for classification of linguistic
technologies. In: Nauchno-technicheskaya informatsia, n. 8, pages
30-38. URL:
https://www.researchgate.net/publication/349329351_Kriterii_klassifikacii_lingvisticeskih_tehnologijTHE_CRITERIA_FOR_CLASSIFICATION_OF_LINGUISTIC_TECHNOLOGIES

ABOUT THE REVIEWER

Viatcheslav Yatsko, ScD, is an independent, non-affiliated researcher.


