Sanity Checks for Saliency Metrics
Saliency maps are a popular approach to creating post-hoc explanations of
image classifier outputs. These methods produce estimates of the relevance of
each pixel to the classification output score, which can be displayed as a
saliency map that highlights important pixels. Despite a proliferation of such
methods, little effort has been made to quantify how good these saliency maps
are at capturing the true relevance of the pixels to the classifier output
(i.e. their "fidelity"). We therefore investigate existing metrics for
evaluating the fidelity of saliency methods (i.e. saliency metrics). We find
that there is little consistency in the literature in how such metrics are
calculated, and show that such inconsistencies can have a significant effect on
the measured fidelity. Further, we apply measures of reliability developed in
the psychometric testing literature to assess the consistency of saliency
metrics when applied to individual saliency maps. Our results show that
saliency metrics can be statistically unreliable and inconsistent, indicating
that comparative rankings between saliency methods generated using such metrics
can be untrustworthy.
Comment: Accepted for publication at the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20).
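One common family of fidelity metrics discussed in this line of work is the deletion-style metric: pixels are removed in order of decreasing saliency and the drop in the classifier's score is tracked. The sketch below is a minimal, framework-agnostic illustration of that idea, assuming a hypothetical `model` callable that maps an image to a scalar class score; the step size and baseline value are exactly the kind of implementation choices whose inconsistency the paper argues can change the measured fidelity.

```python
import numpy as np

def deletion_fidelity(model, image, saliency, step=100, baseline=0.0):
    """Deletion-style fidelity sketch: remove the most salient pixels
    first and average the model's score over the deletion curve.
    A lower average suggests the saliency map was more faithful.
    `model`, `step`, and `baseline` are assumed interfaces/choices,
    not taken from the paper itself."""
    h, w = saliency.shape
    order = np.argsort(saliency.ravel())[::-1]        # most salient first
    perturbed = image.copy()
    scores = [model(perturbed)]
    for start in range(0, h * w, step):
        idx = order[start:start + step]
        rows, cols = np.unravel_index(idx, (h, w))
        perturbed[rows, cols] = baseline               # "delete" these pixels
        scores.append(model(perturbed))
    return float(np.mean(scores))
```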
Benchmarking Perturbation-based Saliency Maps for Explaining Deep Reinforcement Learning Agents
Recent years have seen a plethora of work on explaining complex intelligent agents.
One example is the development of several algorithms that generate saliency
maps, which show how much each pixel contributed to the agent's decision.
However, most evaluations of such saliency maps focus on image classification
tasks. As far as we know, there is no work which thoroughly compares different
saliency maps for Deep Reinforcement Learning agents. This paper compares four
perturbation-based approaches to create saliency maps for Deep Reinforcement
Learning agents trained on four different Atari 2600 games. All four approaches
work by perturbing parts of the input and measuring how much this affects the
agent's output. The approaches are compared using three computational metrics:
dependence on the learned parameters of the agent (sanity checks), faithfulness
to the agent's reasoning (input degradation), and run-time.
Comment: Presented at the Explainable Agency in Artificial Intelligence Workshop during the 35th AAAI Conference on Artificial Intelligence.
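The abstract describes the shared mechanism of all four approaches: perturb part of the input and measure the change in the agent's output. A minimal occlusion-style sketch of that mechanism is shown below, assuming a hypothetical `agent` callable that returns a vector of action values (e.g. Q-values) for a single observation; it is not the specific algorithm of any of the four compared methods.

```python
import numpy as np

def occlusion_saliency(agent, frame, patch=5, baseline=0.0):
    """Perturbation-based saliency sketch: occlude a patch of the input,
    measure how much the agent's output changes, and use that change as
    the patch's saliency. `agent`, `patch`, and `baseline` are assumed
    interfaces/choices for illustration only."""
    base_out = agent(frame)
    h, w = frame.shape[:2]
    saliency = np.zeros((h, w))
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            perturbed = frame.copy()
            perturbed[y:y + patch, x:x + patch] = baseline
            # larger output change -> more salient region
            saliency[y:y + patch, x:x + patch] = np.linalg.norm(
                agent(perturbed) - base_out)
    return saliency
```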
Local and Global Explanations of Agent Behavior: Integrating Strategy Summaries with Saliency Maps
With advances in reinforcement learning (RL), agents are now being developed
in high-stakes application domains such as healthcare and transportation.
Explaining the behavior of these agents is challenging, as the environments in
which they act have large state spaces, and their decision-making can be
affected by delayed rewards, making it difficult to analyze their behavior. To
address this problem, several approaches have been developed. Some approaches
attempt to convey the global behavior of the agent, describing the
actions it takes in different states. Other approaches devise local
explanations, which provide information regarding the agent's decision-making in
a particular state. In this paper, we combine global and local explanation
methods, and evaluate their joint and separate contributions, providing (to the
best of our knowledge) the first user study of combined local and global
explanations for RL agents. Specifically, we augment strategy summaries that
extract important trajectories of states from simulations of the agent with
saliency maps which show what information the agent attends to. Our results
show that the choice of what states to include in the summary (global
information) strongly affects people's understanding of agents: participants
shown summaries that included important states significantly outperformed
participants who were presented with agent behavior in a randomly chosen set of
world-states. We find mixed results with respect to augmenting demonstrations
with saliency maps (local information), as the addition of saliency maps did
not significantly improve performance in most cases. However, we do find some
evidence that saliency maps can help users better understand what information
the agent relies on in its decision making, suggesting avenues for future work
that can further improve explanations of RL agents.
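The global component of the study builds on strategy summaries that extract important states from simulated trajectories. The sketch below illustrates one common importance heuristic for such summaries (the spread between the best and worst action values in a state); `trajectory` and `q_values` are assumed inputs, and this is an illustrative stand-in rather than the exact summarization method used in the study.

```python
import numpy as np

def summarize_strategy(trajectory, q_values, k=5):
    """Strategy-summary sketch: pick the k most "important" states from
    a simulated trajectory as a global explanation of the agent.
    Importance = max Q - min Q in each state (a common heuristic,
    assumed here for illustration)."""
    importance = [q.max() - q.min() for q in q_values]
    top = np.argsort(importance)[::-1][:k]
    return [trajectory[i] for i in sorted(top)]   # keep temporal order
```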
Sanity Checks for Saliency Methods Explaining Object Detectors
Saliency methods are frequently used to explain Deep Neural Network-based models. Adebayo et al.'s work on evaluating saliency methods for classification models illustrates that certain explanation methods fail the model and data randomization tests. However, on extending the tests to various state-of-the-art object detectors, we illustrate that the ability to explain a model depends more on the model itself than on the explanation method. We perform sanity checks for object detection and define new qualitative criteria to evaluate the saliency explanations, both for object classification and bounding box decisions, using Guided Backpropagation, Integrated Gradients, and their SmoothGrad versions, together with Faster R-CNN, SSD, and EfficientDet-D0, trained on COCO. In addition, the sensitivity of the explanation method to model parameters and data labels varies class-wise, motivating us to perform the sanity checks for each class. We find that EfficientDet-D0 is the most interpretable model independent of the saliency method, passing the sanity checks with few problems.
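The model randomization test referenced here (from Adebayo et al.) compares the explanation produced for the trained model with the one produced after the model's weights are randomized; if the two maps are highly similar, the explanation does not actually depend on what the model learned. Below is a minimal sketch of that comparison, assuming a hypothetical `saliency_fn(model, image, target)` wrapper around whichever explanation method is being tested and a pre-built `randomized_model`; it is not the paper's own evaluation pipeline.

```python
import numpy as np
from scipy.stats import spearmanr

def model_randomization_check(saliency_fn, model, randomized_model, image, target):
    """Model-randomization sanity check sketch: a high rank correlation
    between the saliency map of the trained detector and that of a
    weight-randomized detector means the explanation barely depends on
    the learned parameters, i.e. the method fails the check."""
    trained_map = saliency_fn(model, image, target).ravel()
    random_map = saliency_fn(randomized_model, image, target).ravel()
    corr, _ = spearmanr(np.abs(trained_map), np.abs(random_map))
    return corr
```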