Influence-Directed Explanations for Deep Convolutional Networks
We study the problem of explaining a rich class of behavioral properties of
deep neural networks. Distinctively, our influence-directed explanations
approach this problem by peering inside the network to identify neurons with
high influence on a quantity and distribution of interest, using an
axiomatically-justified influence measure, and then providing an interpretation
for the concepts these neurons represent. We evaluate our approach by
demonstrating a number of its unique capabilities on convolutional neural
networks trained on ImageNet. Our evaluation demonstrates that
influence-directed explanations (1) identify influential concepts that
generalize across instances, (2) can be used to extract the "essence" of what
the network learned about a class, and (3) isolate individual features the
network uses to make decisions and distinguish related classes.
Comment: To appear in International Test Conference 201
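As a rough illustration of the idea, the sketch below approximates an influence measure as the average gradient of a class score (the quantity of interest) with respect to an internal layer's activations over a set of inputs (the distribution of interest), and ranks channels by that score. The tiny model, the layer choice, and the gradient-averaging scheme are illustrative stand-ins, not the paper's axiomatically-justified measure or implementation.

    import torch
    import torch.nn as nn

    # Tiny stand-in CNN; the paper works with ImageNet-scale networks.
    model = nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
    )

    def channel_influence(model, images, class_idx, layer_idx=1):
        # Average gradient of the class score (quantity of interest) w.r.t. an
        # internal layer's activations, over `images` (distribution of interest).
        acts = {}
        def keep(module, inputs, output):
            output.retain_grad()
            acts["z"] = output
        handle = model[layer_idx].register_forward_hook(keep)
        model(images)[:, class_idx].sum().backward()
        handle.remove()
        grad = acts["z"].grad                    # (N, C, H, W)
        return grad.mean(dim=(0, 2, 3))          # one influence score per channel

    images = torch.randn(16, 3, 32, 32)          # stand-in inputs
    influence = channel_influence(model, images, class_idx=3)
    top = torch.argsort(influence.abs(), descending=True)[:3]
    print("most influential channels:", top.tolist())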
Towards Better Interpretability in Deep Q-Networks
Deep reinforcement learning techniques have demonstrated superior performance
in a wide variety of environments. While improvements in training algorithms
continue at a brisk pace, theoretical and empirical studies of what these
networks actually learn lag far behind. In this paper we propose an
interpretable neural network architecture for Q-learning which provides a
global explanation of the model's behavior using key-value memories, attention
and reconstructible embeddings. With a directed exploration strategy, our model
can reach training rewards comparable to the state-of-the-art deep Q-learning
models. However, results suggest that the features extracted by the neural
network are extremely shallow and subsequent testing using out-of-sample
examples shows that the agent can easily overfit to trajectories seen during
training.
Comment: Accepted at AAAI-19 (16 pages, 18 figures)
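To make the "key-value memories with attention" idea concrete, here is a minimal sketch of a Q-value head in which a state embedding attends over a learned key-value memory and the attended read-out is decoded into Q-values; the attention weights are what one would inspect for a global explanation. The module name, the sizes, and the omission of the paper's reconstructible embeddings are all simplifying assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class KeyValueQHead(nn.Module):
        # Illustrative Q-value head: a state embedding attends over a learned
        # key-value memory; the attended read-out is decoded into Q-values.
        def __init__(self, embed_dim=64, n_slots=32, n_actions=4):
            super().__init__()
            self.keys = nn.Parameter(torch.randn(n_slots, embed_dim))
            self.values = nn.Parameter(torch.randn(n_slots, embed_dim))
            self.q_out = nn.Linear(embed_dim, n_actions)

        def forward(self, state_embed):                          # (B, embed_dim)
            attn = F.softmax(state_embed @ self.keys.T, dim=-1)  # (B, n_slots)
            read = attn @ self.values                            # (B, embed_dim)
            return self.q_out(read), attn   # Q-values plus inspectable attention

    head = KeyValueQHead()
    q_values, attention = head(torch.randn(2, 64))
    print(q_values.shape, attention.shape)   # (2, 4) and (2, 32)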
Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention
Deep neural perception and control networks are likely to be a key component
of self-driving vehicles. These models need to be explainable - they should
provide easy-to-interpret rationales for their behavior - so that passengers,
insurance companies, law enforcement, developers etc., can understand what
triggered a particular behavior. Here we explore the use of visual
explanations. These explanations take the form of real-time highlighted regions
of an image that causally influence the network's output (steering control).
Our approach is two-stage. In the first stage, we use a visual attention model
to train a convolution network end-to-end from images to steering angle. The
attention model highlights image regions that potentially influence the
network's output. Some of these are true influences, but some are spurious. We
then apply a causal filtering step to determine which input regions actually
influence the output. This produces more succinct visual explanations and more
accurately exposes the network's behavior. We demonstrate the effectiveness of
our model on three datasets totaling 16 hours of driving. We first show that
training with attention does not degrade the performance of the end-to-end
network. Then we show that the network causally cues on a variety of features
that are used by humans while driving.
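A minimal sketch of the causal filtering step described above: each attended region is masked out in turn, the controller is re-run, and only regions whose removal noticeably changes the predicted steering angle are kept. The masking strategy, the threshold, and the toy controller are assumptions for illustration, not the paper's implementation.

    import torch

    def causal_filter(controller, image, attention_regions, threshold=0.01):
        # Keep only attended regions whose removal actually changes the
        # predicted steering angle.
        base = controller(image.unsqueeze(0)).item()
        causal = []
        for (y0, y1, x0, x1) in attention_regions:
            masked = image.clone()
            masked[:, y0:y1, x0:x1] = 0.0    # blank out one candidate region
            if abs(controller(masked.unsqueeze(0)).item() - base) > threshold:
                causal.append((y0, y1, x0, x1))
        return causal

    # Toy controller and image so the sketch runs end to end.
    controller = torch.nn.Sequential(torch.nn.Flatten(),
                                     torch.nn.Linear(3 * 64 * 64, 1))
    image = torch.rand(3, 64, 64)
    candidate_regions = [(0, 16, 0, 16), (32, 48, 32, 48)]
    print(causal_filter(controller, image, candidate_regions))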
Explaining Deep Learning Models using Causal Inference
Although deep learning models have been successfully applied to a variety of
tasks, their millions of parameters make them increasingly opaque and complex.
In order to establish trust for their widespread commercial use,
it is important to formalize a principled framework to reason over these
models. In this work, we use ideas from causal inference to describe a general
framework to reason over CNN models. Specifically, we build a Structural Causal
Model (SCM) as an abstraction over a specific aspect of the CNN. We also
formulate a method to quantitatively rank the filters of a convolution layer
according to their counterfactual importance. We illustrate our approach with
popular CNN architectures such as LeNet5, VGG19, and ResNet32.
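The following sketch illustrates counterfactual filter ranking in a simple way: each filter of a convolution layer is ablated (set to zero) in turn, and filters are ranked by the resulting drop in the correct-class score. This is an illustrative proxy for the paper's SCM-based ranking, not the authors' code; the tiny model and the ablation-by-zeroing choice are assumptions.

    import torch
    import torch.nn as nn

    def rank_filters(model, conv_layer, images, labels):
        # Ablate one filter at a time and rank filters by the drop in the
        # correct-class score that the intervention causes.
        def ablate(filter_idx):
            def hook(module, inputs, output):
                output[:, filter_idx] = 0.0   # counterfactual: filter switched off
                return output
            return hook

        with torch.no_grad():
            base = model(images).gather(1, labels[:, None]).mean()
            drops = []
            for f in range(conv_layer.out_channels):
                handle = conv_layer.register_forward_hook(ablate(f))
                score = model(images).gather(1, labels[:, None]).mean()
                handle.remove()
                drops.append((base - score).item())
        return sorted(range(len(drops)), key=lambda f: -drops[f])

    model = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(4, 10))
    images = torch.randn(8, 3, 32, 32)
    labels = torch.randint(0, 10, (8,))
    print(rank_filters(model, model[0], images, labels))   # most important first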
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles.

In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior.

In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of the controller and the explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment: strong and weak alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match the training data.

In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed).

Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
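One plausible way to implement the "strong alignment" of controller and explanation attention maps mentioned above is to penalise the divergence between the two maps during training. The sketch below uses a KL-divergence loss over flattened attention maps; the loss form, map sizes, and function name are assumptions rather than the thesis' exact formulation.

    import torch
    import torch.nn.functional as F

    def attention_alignment_loss(controller_attn, explainer_attn, eps=1e-8):
        # Flatten both spatial attention maps into distributions and penalise
        # the KL divergence between them during training.
        p = controller_attn.flatten(1) + eps
        q = explainer_attn.flatten(1) + eps
        p = p / p.sum(dim=1, keepdim=True)
        q = q / q.sum(dim=1, keepdim=True)
        return F.kl_div(q.log(), p, reduction="batchmean")

    controller_attn = torch.rand(4, 7, 7)   # controller attention over a 7x7 grid
    explainer_attn = torch.rand(4, 7, 7)    # explanation model attention
    print(attention_alignment_loss(controller_attn, explainer_attn).item())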
Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features
Learning on point clouds is in high demand because point clouds are a common
type of geometric data and can help robots understand their environments
robustly. However, point clouds are sparse, unstructured, and unordered, and
therefore cannot be recognized accurately by a traditional convolutional neural
network (CNN) or a recurrent neural network (RNN). Fortunately, a graph
convolutional neural network (Graph CNN) can process sparse and unordered data.
In this paper, we therefore propose a linked dynamic graph CNN (LDGCNN) to
classify and segment point clouds directly. We remove the transformation
network, link hierarchical features from dynamic graphs, freeze the feature
extractor, and retrain the classifier to increase the performance of LDGCNN. We
explain our network using theoretical analysis and visualization. Through
experiments, we show that the proposed LDGCNN achieves state-of-the-art
performance on two standard datasets: ModelNet40 and ShapeNet.
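A minimal sketch of the dynamic-graph building block such networks rely on: for each point, gather its k nearest neighbours in feature space and form edge features [x_i, x_j - x_i]. LDGCNN additionally links these hierarchical features across layers, which is not shown here; the neighbourhood size and feature layout are assumptions.

    import torch

    def knn_edge_features(points, k=4):
        # For each point, gather its k nearest neighbours and form the edge
        # features [x_i, x_j - x_i] used by dynamic graph convolutions.
        dists = torch.cdist(points, points)                     # (N, N) distances
        idx = dists.topk(k + 1, largest=False).indices[:, 1:]   # k neighbours, no self
        neighbours = points[idx]                                 # (N, k, D)
        centers = points.unsqueeze(1).expand(-1, k, -1)          # (N, k, D)
        return torch.cat([centers, neighbours - centers], dim=-1)  # (N, k, 2D)

    cloud = torch.rand(128, 3)          # a toy point cloud
    edge_features = knn_edge_features(cloud)
    print(edge_features.shape)          # torch.Size([128, 4, 6])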
Counterfactuals uncover the modular structure of deep generative models
Deep generative models can emulate the perceptual properties of complex image
datasets, providing a latent representation of the data. However, manipulating
such representation to perform meaningful and controllable transformations in
the data space remains challenging without some form of supervision. While
previous work has focused on exploiting statistical independence to disentangle
latent factors, we argue that such requirement is too restrictive and propose
instead a non-statistical framework that relies on counterfactual manipulations
to uncover a modular structure of the network composed of disentangled groups
of internal variables. Experiments with a variety of generative models trained
on complex image datasets show the obtained modules can be used to design
targeted interventions. This opens the way to applications such as
computationally efficient style transfer and the automated assessment of
robustness to contextual changes in pattern recognition systems.
Comment: 26 pages, 17 figures
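The core manipulation can be sketched as a hybrid, counterfactual generation: run two latent codes through the early stages of a generator, swap a chosen group of internal units, and decode the result to see which aspects of the output that group controls. The toy two-stage generator and the hard-coded unit group below are assumptions; the paper's contribution is how such modules are discovered, which is not shown.

    import torch
    import torch.nn as nn

    # Toy two-stage generator standing in for a deep generative model.
    stage1 = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
    stage2 = nn.Sequential(nn.Linear(16, 32), nn.Sigmoid())

    def counterfactual_intervention(z, z_ref, module_units):
        # Run both latents through the first stage, swap one candidate group of
        # internal units, and decode the hybrid to see what that group controls.
        h, h_ref = stage1(z), stage1(z_ref)
        h_cf = h.clone()
        h_cf[:, module_units] = h_ref[:, module_units]
        return stage2(h_cf)

    z, z_ref = torch.randn(1, 8), torch.randn(1, 8)
    output = counterfactual_intervention(z, z_ref, module_units=[0, 3, 5])
    print(output.shape)                  # torch.Size([1, 32])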
Transfer entropy-based feedback improves performance in artificial neural networks
The structure of the majority of modern deep neural networks is characterized
by unidirectional feed-forward connectivity across a very large number of
layers. By contrast, the architecture of the cortex of vertebrates contains
fewer hierarchical levels but many recurrent and feedback connections. Here we
show that a small, few-layer artificial neural network that employs feedback
will reach top level performance on a standard benchmark task, otherwise only
obtained by large feed-forward structures. To achieve this we use feed-forward
transfer entropy between neurons to structure feedback connectivity. Transfer
entropy can here intuitively be understood as a measure for the relevance of
certain pathways in the network, which are then amplified by feedback. Feedback
may therefore be key for high network performance in small brain-like
architectures.
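For readers unfamiliar with the quantity, the sketch below estimates transfer entropy (with history length 1) between two binarized activation time series using plug-in probability estimates; pathways with high transfer entropy would be the ones amplified by feedback. The binarization, history length, and estimator are simplifying assumptions, not the paper's exact procedure.

    import numpy as np

    def transfer_entropy(source, target):
        # Plug-in estimate of transfer entropy (history length 1) from a binary
        # source series to a binary target series, in bits.
        x_next, x, y = target[1:], target[:-1], source[:-1]
        te = 0.0
        for xn in (0, 1):
            for xp in (0, 1):
                for yp in (0, 1):
                    p_joint = np.mean((x_next == xn) & (x == xp) & (y == yp))
                    if p_joint == 0:
                        continue
                    p_cond_xy = p_joint / np.mean((x == xp) & (y == yp))
                    p_cond_x = np.mean((x_next == xn) & (x == xp)) / np.mean(x == xp)
                    te += p_joint * np.log2(p_cond_xy / p_cond_x)
        return te

    rng = np.random.default_rng(0)
    source = rng.integers(0, 2, 1000)
    target = np.roll(source, 1)                 # target copies source one step later
    print(transfer_entropy(source, target))     # close to 1 bit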
An Adversarial Approach for Explaining the Predictions of Deep Neural Networks
Machine learning models have been successfully applied to a wide range of
applications including computer vision, natural language processing, and speech
recognition. A successful implementation of these models, however, usually
relies on deep neural networks (DNNs) which are treated as opaque black-box
systems due to their incomprehensible complexity and intricate internal
mechanism. In this work, we present a novel algorithm for explaining the
predictions of a DNN using adversarial machine learning. Our approach
identifies the relative importance of input features in relation to the
predictions based on the behavior of an adversarial attack on the DNN. Our
algorithm has the advantage of being fast, consistent, and easy to implement
and interpret. We present our detailed analysis that demonstrates how the
behavior of an adversarial attack, given a DNN and a task, stays consistent for
any input test data point, proving the generality of our approach. Our analysis
enables us to produce consistent and efficient explanations. We illustrate the
effectiveness of our approach by conducting experiments using a variety of
DNNs, tasks, and datasets. Finally, we compare our work with other well-known
techniques in the current literature.
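As an illustrative proxy (not the paper's exact algorithm), the sketch below derives feature importance from the quantity that drives an FGSM-style attack: the gradient of the loss with respect to the input indicates how strongly the attack pushes each feature, and its magnitude is normalized into a relative importance score. The toy model and function name are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def adversarial_feature_importance(model, x, label):
        # The input gradient of the loss is what drives an FGSM-style attack;
        # its magnitude is used here as a relative importance score.
        x = x.clone().requires_grad_(True)
        loss = F.cross_entropy(model(x), label)
        loss.backward()
        importance = x.grad.abs()
        return importance / importance.sum()

    model = nn.Sequential(nn.Linear(10, 3))      # toy classifier
    x = torch.randn(1, 10)
    label = torch.tensor([1])
    print(adversarial_feature_importance(model, x, label))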
Knowledge Graph Embeddings and Explainable AI
Knowledge graph embeddings are now a widely adopted approach to knowledge
representation in which entities and relationships are embedded in vector
spaces. In this chapter, we introduce the reader to the concept of knowledge
graph embeddings by explaining what they are, how they can be generated and how
they can be evaluated. We summarize the state-of-the-art in this field by
describing the approaches that have been introduced to represent knowledge in
the vector space. In relation to knowledge representation, we consider the
problem of explainability, and discuss models and methods for explaining
predictions obtained via knowledge graph embeddings.
Comment: Federico Bianchi, Gaetano Rossiello, Luca Costabello, Matteo
Palmonari, Pasquale Minervini, Knowledge Graph Embeddings and Explainable AI.
In: Ilaria Tiddi, Freddy Lecue, Pascal Hitzler (eds.), Knowledge Graphs for
eXplainable AI -- Foundations, Applications and Challenges. Studies on the
Semantic Web, IOS Press, Amsterdam, 202
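As a concrete example of embedding-based knowledge representation, the sketch below scores triples in the style of TransE, one of the standard knowledge graph embedding models, where a relation acts as a translation so that plausible triples satisfy h + r ≈ t. The tiny hand-built embeddings are purely illustrative; in practice the vectors are learned from a training graph.

    import numpy as np

    # TransE-style scoring: a relation acts as a translation in the embedding
    # space, so a plausible triple (h, r, t) should satisfy h + r ≈ t.
    rng = np.random.default_rng(0)
    entities = {"Paris": rng.normal(size=8), "France": rng.normal(size=8)}
    relations = {"capital_of": entities["France"] - entities["Paris"]}  # hand-built

    def score(head, relation, tail):
        # Higher (less negative) scores mean the triple is judged more plausible.
        return -np.linalg.norm(entities[head] + relations[relation] - entities[tail])

    print(score("Paris", "capital_of", "France"))    # near 0: plausible
    print(score("France", "capital_of", "Paris"))    # much more negative: implausible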