Interpreting Neural Networks With Nearest Neighbors
Local model interpretation methods explain individual predictions by
assigning an importance value to each input feature. This value is often
determined by measuring the change in confidence when a feature is removed.
However, the confidence of neural networks is not a robust measure of model
uncertainty. This issue makes reliably judging the importance of the input
features difficult. We address this by changing the test-time behavior of
neural networks using Deep k-Nearest Neighbors. Without harming text
classification accuracy, this algorithm provides a more robust uncertainty
metric which we use to generate feature importance values. The resulting
interpretations better align with human perception than baseline methods.
Finally, we use our interpretation method to analyze model predictions on
dataset annotation artifacts.
Comment: EMNLP 2018 BlackboxNLP
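The core recipe described above (importance of a feature = drop in a kNN-based uncertainty score when that feature is removed) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: a single-layer kNN conformity score stands in for the full Deep k-Nearest Neighbors procedure, and the helper names (embed, knn_conformity) are hypothetical.

```python
# Leave-one-out word importance using a kNN conformity score as the
# uncertainty measure (illustrative sketch; `embed` maps text to a fixed-size
# hidden representation of the classifier).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_conformity(embed, train_reps, train_labels, text, label, k=10):
    """Fraction of the k nearest training representations that share the
    predicted label -- a simple stand-in for DkNN credibility."""
    rep = embed(text)
    nn = NearestNeighbors(n_neighbors=k).fit(train_reps)
    _, idx = nn.kneighbors(rep[None, :])
    return np.mean(train_labels[idx[0]] == label)

def word_importances(embed, train_reps, train_labels, words, label):
    """Importance of each word = drop in conformity when that word is removed."""
    base = knn_conformity(embed, train_reps, train_labels, " ".join(words), label)
    scores = []
    for i in range(len(words)):
        reduced = " ".join(words[:i] + words[i + 1:])
        scores.append(base - knn_conformity(embed, train_reps, train_labels, reduced, label))
    return scores
```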
When Explanations Lie: Why Many Modified BP Attributions Fail
Attribution methods aim to explain a neural network's prediction by
highlighting the most relevant image areas. A popular approach is to
backpropagate (BP) a custom relevance score using modified rules, rather than
the gradient. We analyze an extensive set of modified BP methods: Deep Taylor
Decomposition, Layer-wise Relevance Propagation (LRP), Excitation BP,
PatternAttribution, DeepLIFT, Deconv, RectGrad, and Guided BP. We find
empirically that the explanations of all mentioned methods, except for
DeepLIFT, are independent of the parameters of later layers. We provide
theoretical insights for this surprising behavior and also analyze why DeepLIFT
does not suffer from this limitation. Empirically, we measure how information
of later layers is ignored by using our new metric, cosine similarity
convergence (CSC). The paper provides a framework to assess the faithfulness of
new and existing modified BP methods theoretically and empirically. For code
see: https://github.com/berleon/when-explanations-lie
Comment: Published in ICML 2020, Camera Ready Version
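The kind of check discussed above can be illustrated with a small sketch: re-initialize the parameters of one late layer and compare the attribution map before and after via cosine similarity; a similarity that stays near 1 means the explanation ignores those parameters. This is only an illustration of the idea, not necessarily the paper's exact CSC metric, and plain gradient attribution stands in for the modified BP rules.

```python
# Compare an attribution before/after randomizing one late layer.
import copy
import torch
import torch.nn.functional as F

def gradient_attribution(model, x, target):
    """Plain input gradient for the target class (stand-in for modified BP)."""
    x = x.clone().requires_grad_(True)
    model(x)[0, target].backward()
    return x.grad.detach().flatten()

def later_layer_sensitivity(model, x, target, layer_name):
    """Cosine similarity between attributions of the original model and a copy
    whose named layer has been re-initialized."""
    attr = gradient_attribution(model, x, target)
    randomized = copy.deepcopy(model)
    for name, module in randomized.named_modules():
        if name == layer_name and hasattr(module, "reset_parameters"):
            module.reset_parameters()
    attr_rand = gradient_attribution(randomized, x, target)
    return F.cosine_similarity(attr, attr_rand, dim=0).item()
```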
Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values
Explaining the output of a complicated machine learning model like a deep
neural network (DNN) is a central challenge in machine learning. Several
proposed local explanation methods address this issue by identifying what
dimensions of a single input are most responsible for a DNN's output. The goal
of this work is to assess the sensitivity of local explanations to DNN
parameter values. Somewhat surprisingly, we find that DNNs with
randomly-initialized weights produce explanations that are both visually and
quantitatively similar to those produced by DNNs with learned weights. Our
conjecture is that this phenomenon occurs because these explanations are
dominated by the lower level features of a DNN, and that a DNN's architecture
provides a strong prior which significantly affects the representations learned
at these lower layers. NOTE: This work is now subsumed by our recent
manuscript, Sanity Checks for Saliency Maps (to appear NIPS 2018), where we
expand on findings and address concerns raised in Sundararajan et al. (2018).
Comment: Workshop Track, International Conference on Learning Representations (ICLR)
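The comparison described in this abstract, explanations from a trained network versus a randomly initialized copy of the same architecture, can be sketched as below. The saliency method and the rank-correlation measure are illustrative assumptions, not the authors' exact protocol.

```python
# Compare saliency maps from trained vs. freshly initialized weights.
import torch
from scipy.stats import spearmanr

def saliency(model, x, target):
    """Absolute input gradient for the target class."""
    x = x.clone().requires_grad_(True)
    model(x)[0, target].backward()
    return x.grad.detach().abs().flatten().numpy()

def trained_vs_random_similarity(trained_model, make_model, x, target):
    """Rank correlation between the two saliency maps; a high value suggests
    the explanation is insensitive to the learned parameters."""
    random_model = make_model()          # same architecture, random weights
    rho, _ = spearmanr(saliency(trained_model, x, target),
                       saliency(random_model, x, target))
    return rho
```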
A Categorisation of Post-hoc Explanations for Predictive Models
The ubiquity of machine learning based predictive models in modern society
naturally leads people to ask how trustworthy those models are. In predictive
modeling, it is quite common to face a trade-off between accuracy and
interpretability. For instance, doctors would like to know how effective some
treatment will be for a patient, or why the model suggested a particular
medication for a patient exhibiting certain symptoms. We acknowledge that the
necessity for interpretability is a consequence of an incomplete formalisation
of the problem, or more precisely of multiple meanings attached to a particular
concept. For certain problems, it is not enough to get the answer (what); the
model also has to provide an explanation of how it came to that conclusion
(why), because a correct prediction only partially solves the original problem.
In this article we extend an existing categorisation of techniques that aid
model interpretability and test this categorisation.
Comment: 5 pages, 3 figures, AAAI 2019 Spring Symposia (#SSS19)
Towards Prediction Explainability through Sparse Communication
Explainability is a topic of growing importance in NLP. In this work, we
provide a unified perspective of explainability as a communication problem
between an explainer and a layperson about a classifier's decision. We use this
framework to compare several prior approaches for extracting explanations,
including gradient methods, representation erasure, and attention mechanisms,
in terms of their communication success. In addition, we reinterpret these
methods in the light of classical feature selection, and we use this as
inspiration to propose new embedded methods for explainability, through the use
of selective, sparse attention. Experiments in text classification, natural
language entailment, and machine translation, using different configurations of
explainers and laypeople (including both machines and humans), reveal an
advantage of attention-based explainers over gradient and erasure methods.
Furthermore, human evaluation experiments show promising results with post-hoc
explainers trained to optimize communication success and faithfulness.
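One way to make the explainer/layperson framing above concrete is sketched below: the explainer keeps the top-k attended tokens, a separate "layperson" classifier sees only those tokens, and communication success is agreement with the original prediction. The top-k selection and the function names are illustrative assumptions rather than the paper's exact setup.

```python
# Communication success of an attention-based explainer (illustrative sketch).
import torch

def extract_explanation(tokens, attention_weights, k=5):
    """Keep the k tokens with the largest attention weights, in original order."""
    k = min(k, len(tokens))
    kept = torch.topk(attention_weights, k).indices.sort().values
    return [tokens[i] for i in kept.tolist()]

def communication_success(classifier, layperson, encode, tokens, attention_weights, k=5):
    """1.0 if the layperson recovers the classifier's prediction from the
    extracted tokens alone, else 0.0."""
    full_pred = classifier(encode(tokens)).argmax(dim=-1)
    message = extract_explanation(tokens, attention_weights, k)
    lay_pred = layperson(encode(message)).argmax(dim=-1)
    return float((full_pred == lay_pred).item())
```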
Explaining a black-box using Deep Variational Information Bottleneck Approach
Interpretable machine learning has gained much attention recently. Briefness
and comprehensiveness are necessary in order to provide a large amount of
information concisely when explaining a black-box decision system. However,
existing interpretable machine learning methods fail to consider briefness and
comprehensiveness simultaneously, leading to redundant explanations. We propose
the variational information bottleneck for interpretation, VIBI, a
system-agnostic interpretable method that provides a brief but comprehensive
explanation. VIBI adopts an information theoretic principle, information
bottleneck principle, as a criterion for finding such explanations. For each
instance, VIBI selects key features that are maximally compressed about an
input (briefness) and informative about the decision made by a black-box system
on that input (comprehensiveness). We evaluate VIBI on three datasets and
compare with state-of-the-art interpretable machine learning methods in terms
of both interpretability and fidelity, as evaluated by humans and quantitative metrics.
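A rough sketch of a VIBI-style training objective is given below: an explainer scores input features, a relaxed sparse mask keeps a few of them, and an approximator must reproduce the black-box decision from the masked input alone. The relaxed masking and the beta-weighted sparsity term are simplifying assumptions, not the authors' exact formulation.

```python
# VIBI-style objective: fidelity to the black-box (comprehensiveness) plus a
# sparsity penalty on the selection mask (briefness). Assumes x and the
# explainer's logits share the shape (batch, num_features).
import torch
import torch.nn.functional as F

def vibi_loss(explainer, approximator, blackbox, x, beta=0.1, tau=0.5):
    logits = explainer(x)
    # Relaxed Bernoulli-style sampling of a soft selection mask (illustrative).
    gumbel = -torch.log(-torch.log(torch.rand_like(logits)))
    mask = torch.sigmoid((logits + gumbel) / tau)
    teacher = blackbox(x).softmax(dim=-1).detach()       # black-box decision
    student = approximator(x * mask).log_softmax(dim=-1)
    fidelity = F.kl_div(student, teacher, reduction="batchmean")
    sparsity = mask.mean()
    return fidelity + beta * sparsity
```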
Detecting and interpreting myocardial infarction using fully convolutional neural networks
Objective: We aim to provide an algorithm for the detection of myocardial
infarction that operates directly on ECG data without any preprocessing and to
investigate its decision criteria. Approach: We train an ensemble of fully
convolutional neural networks on the PTB ECG dataset and apply state-of-the-art
attribution methods. Main results: Our classifier reaches 93.3% sensitivity and
89.7% specificity evaluated using 10-fold cross-validation with sampling based
on patients. The presented method outperforms state-of-the-art approaches and
reaches the performance level of human cardiologists for detection of
myocardial infarction. We are able to discriminate channel-specific regions
that contribute most significantly to the neural network's decision.
Interestingly, the network's decision is influenced by signs also recognized by
human cardiologists as indicative of myocardial infarction. Significance: Our
results demonstrate the high prospects of algorithmic ECG analysis for future
clinical applications considering both its quantitative performance as well as
the possibility of assessing decision criteria on a per-example basis, which
enhances the comprehensibility of the approach.
Comment: 11 pages, 4 figures
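A minimal sketch of the kind of 1D fully convolutional classifier described above is given below; the number of leads, layer sizes, and pooling are illustrative assumptions, not the authors' exact architecture.

```python
# Small 1D fully convolutional network for ECG classification (sketch).
import torch.nn as nn

class ECGFCN(nn.Module):
    def __init__(self, in_channels=12, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=7, padding=3), nn.ReLU(),
        )
        self.head = nn.Conv1d(128, n_classes, kernel_size=1)  # fully convolutional head

    def forward(self, x):                  # x: (batch, channels, time)
        logits = self.head(self.features(x))
        return logits.mean(dim=-1)         # global average pooling over time
```

Patient-based cross-validation splits of the kind mentioned above could be produced with, for example, sklearn.model_selection.GroupKFold keyed on patient identifiers.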
Evaluating Recurrent Neural Network Explanations
Recently, several methods have been proposed to explain the predictions of
recurrent neural networks (RNNs), in particular of LSTMs. The goal of these
methods is to understand the network's decisions by assigning to each input
variable, e.g., a word, a relevance indicating to what extent it contributed
to a particular prediction. In previous works, some of these methods were not
yet compared to one another, or were evaluated only qualitatively. We close
this gap by systematically and quantitatively comparing these methods in
different settings, namely (1) a toy arithmetic task which we use as a sanity
check, (2) five-class sentiment prediction of movie reviews, and (3) an
exploration of the usefulness of word relevances for building sentence-level
representations. Lastly, using the method that performed best in our
experiments, we show how specific linguistic phenomena such as negation in
sentiment analysis are reflected in the relevance patterns, and how the
relevance visualization can help to understand the misclassification of
individual samples.
Comment: 14 pages, accepted for ACL'19 Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
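One simple relevance method of the kind compared above (occlusion), together with a relevance-weighted sentence representation, can be sketched as follows; the occlusion scheme and the weighting are illustrative choices, not the paper's best-performing method.

```python
# Occlusion-based word relevance for an RNN classifier, plus a
# relevance-weighted sentence representation (illustrative sketch).
import torch

def occlusion_relevance(model, token_ids, target, pad_id=0):
    """Relevance of each word = drop in the target score when the word is
    replaced by a padding token."""
    base = model(token_ids.unsqueeze(0))[0, target]
    scores = []
    for i in range(token_ids.size(0)):
        occluded = token_ids.clone()
        occluded[i] = pad_id
        scores.append((base - model(occluded.unsqueeze(0))[0, target]).item())
    return torch.tensor(scores)

def sentence_representation(word_embeddings, relevances):
    """Relevance-weighted average of word embeddings (seq_len, dim)."""
    weights = relevances.clamp(min=0)
    weights = weights / (weights.sum() + 1e-8)
    return (weights.unsqueeze(-1) * word_embeddings).sum(dim=0)
```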
An Empirical Study towards Understanding How Deep Convolutional Nets Recognize Falls
Detecting unintended falls is essential for ambient intelligence and
healthcare of elderly people living alone. In recent years, deep convolutional
nets have been widely used in human action analysis, and a number of fall
detection methods have been proposed based on them. Despite their highly
effective performance, how the convolutional nets recognize falls is still not
clear. In this paper, instead of proposing a novel approach, we perform a
systematic empirical study, attempting to investigate the
underlying fall recognition process. We propose four tasks to investigate,
which involve five types of input modalities, seven net instances and different
training samples. The obtained quantitative and qualitative results reveal the
patterns that the nets tend to learn, and several factors that can heavily
influence performance on fall recognition. We expect that our conclusions will
be helpful for developing better deep learning solutions for fall detection
systems.
Comment: published at the sixth International Workshop on Assistive Computer Vision and Robotics (ACVR), in conjunction with the European Conference on Computer Vision (ECCV), Munich, 2018
HDLTex: Hierarchical Deep Learning for Text Classification
The continually increasing number of documents produced each year
necessitates ever improving information processing methods for searching,
retrieving, and organizing text. Central to these information processing
methods is document classification, which has become an important application
for supervised learning. Recently the performance of these traditional
classifiers has degraded as the number of documents has increased. This is
because along with this growth in the number of documents has come an increase
in the number of categories. This paper approaches this problem differently
from current document classification methods that view the problem as
multi-class classification. Instead we perform hierarchical classification
using an approach we call Hierarchical Deep Learning for Text classification
(HDLTex). HDLTex employs stacks of deep learning architectures to provide
specialized understanding at each level of the document hierarchy.
Comment: ICMLA 2017
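The idea of specialized classifiers at each level of the hierarchy can be sketched as a two-level router: a parent head predicts the top-level category, and a per-category child head predicts the fine-grained label. The routing scheme below is an illustrative assumption, not the exact HDLTex implementation.

```python
# Two-level hierarchical classifier (sketch): the parent head routes each
# document to a child head specialized for the predicted top-level category.
import torch.nn as nn

class HierarchicalClassifier(nn.Module):
    def __init__(self, encoder_dim, n_parents, n_children_per_parent):
        super().__init__()
        self.parent_head = nn.Linear(encoder_dim, n_parents)
        self.child_heads = nn.ModuleList(
            nn.Linear(encoder_dim, n_children) for n_children in n_children_per_parent
        )

    def forward(self, doc_encoding):           # doc_encoding: (batch, encoder_dim)
        parent_logits = self.parent_head(doc_encoding)
        parents = parent_logits.argmax(dim=-1)
        child_logits = [self.child_heads[int(p)](doc_encoding[i])
                        for i, p in enumerate(parents)]
        return parent_logits, child_logits
```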