TraCE: Trajectory Counterfactual Explanation Scores
Counterfactual explanations, and their associated algorithmic recourse, are
typically leveraged to understand, explain, and potentially alter a prediction
coming from a black-box classifier. In this paper, we propose to extend the use
of counterfactuals to evaluate progress in sequential decision making tasks. To
this end, we introduce a model-agnostic modular framework, TraCE (Trajectory
Counterfactual Explanation) scores, which is able to distill and condense
progress in highly complex scenarios into a single value. We demonstrate
TraCE's utility across domains by showcasing its main properties in two case
studies spanning healthcare and climate change. Comment: 7 pages, 4 figures, appendix
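As a rough illustration of the idea of condensing trajectory progress into a single value, the sketch below scores each step of a trajectory by how well its direction agrees with the direction toward a counterfactual target. This is a minimal, assumed formulation for illustration only, not the TraCE score defined in the paper; the function names and the cosine-based scoring are my own.

    import numpy as np

    def step_progress_score(x_prev, x_curr, x_cf):
        """Score one step of a trajectory against a counterfactual target.

        Compares the step actually taken (x_prev -> x_curr) with the direction
        toward the counterfactual x_cf via cosine similarity. Returns a value in
        [-1, 1]: +1 means moving straight toward the counterfactual, -1 means
        moving directly away from it.
        """
        step = np.asarray(x_curr) - np.asarray(x_prev)
        target = np.asarray(x_cf) - np.asarray(x_prev)
        denom = np.linalg.norm(step) * np.linalg.norm(target)
        if denom == 0.0:                      # no movement, or already at the target
            return 0.0
        return float(np.dot(step, target) / denom)

    def trajectory_score(trajectory, x_cf):
        """Average the per-step scores over a whole trajectory."""
        scores = [step_progress_score(a, b, x_cf)
                  for a, b in zip(trajectory[:-1], trajectory[1:])]
        return float(np.mean(scores)) if scores else 0.0

    # Toy usage: a patient-state trajectory drifting toward a "healthy" counterfactual.
    traj = [np.array([0.0, 0.0]), np.array([0.4, 0.1]), np.array([0.8, 0.5])]
    print(trajectory_score(traj, x_cf=np.array([1.0, 1.0])))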
Counterfactual Explanation Policies in RL
As Reinforcement Learning (RL) agents are increasingly employed in diverse
decision-making problems using reward preferences, it becomes important to
ensure that the policies these frameworks learn, which map observations to a
probability distribution over possible actions, are explainable. However,
there has been little work on systematically understanding these complex
policies in a contrastive manner, i.e., on identifying the minimal changes to a
policy that would improve or worsen its performance to a desired level. In this work, we
present COUNTERPOL, the first framework to analyze RL policies using
counterfactual explanations in the form of minimal changes to the policy that
lead to the desired outcome. We do so by casting counterfactual generation as
supervised learning in RL, with the target outcome regulated by the desired
return. We establish a theoretical connection between COUNTERPOL and widely
used trust region-based policy optimization methods in RL. Extensive empirical
analysis shows the efficacy of COUNTERPOL in generating explanations for
(un)learning skills while keeping close to the original policy. Our results on
five different RL environments with diverse state and action spaces demonstrate
the utility of counterfactual explanations, paving the way for new frontiers in
designing and developing counterfactual policies. Comment: ICML Workshop on Counterfactuals in Minds and Machines, 2023
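To make the underlying idea concrete, here is a hedged sketch, not the COUNTERPOL algorithm itself: minimally perturb a policy's parameters so its expected return reaches a desired level, while a proximity term keeps the counterfactual policy close to the original. The toy bandit environment, the squared-distance proximity term, and the finite-difference optimization are assumptions made for brevity.

    import numpy as np

    rewards = np.array([1.0, 0.2, 0.5])        # toy 3-armed bandit
    theta   = np.array([0.0, 2.0, 0.0])        # original policy logits (prefers arm 1)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def expected_return(logits):
        return float(softmax(logits) @ rewards)

    def counterfactual_policy(logits, target_return, lam=0.1, lr=0.5, steps=2000):
        """Minimally perturb logits so the expected return approaches target_return.

        Objective: (return - target)^2 + lam * ||delta||^2, i.e. match the desired
        return while staying close to the original policy (proximity term).
        Gradients are estimated with finite differences to keep the sketch short.
        """
        cf = logits.copy()
        for _ in range(steps):
            base = (expected_return(cf) - target_return) ** 2 + lam * np.sum((cf - logits) ** 2)
            grad = np.zeros_like(cf)
            for i in range(len(cf)):
                probe = cf.copy()
                probe[i] += 1e-4
                loss = (expected_return(probe) - target_return) ** 2 + lam * np.sum((probe - logits) ** 2)
                grad[i] = (loss - base) / 1e-4
            cf -= lr * grad
        return cf

    cf_logits = counterfactual_policy(theta, target_return=0.9)
    print("original return:", expected_return(theta))
    print("counterfactual return:", expected_return(cf_logits))
    print("logit change:", cf_logits - theta)

The proximity weight lam plays the same conceptual role as the "minimal change" requirement in the abstract: larger values keep the counterfactual policy closer to the original at the cost of matching the desired return less precisely.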
counterfactuals: An R Package for Counterfactual Explanation Methods
Counterfactual explanation methods provide information on how feature values
of individual observations must be changed to obtain a desired prediction.
Despite the increasing number of methods proposed in research, only a few
implementations exist, and their interfaces and requirements vary widely. In this
work, we introduce the counterfactuals R package, which provides a modular and
unified R6-based interface for counterfactual explanation methods. We
implement three existing counterfactual explanation methods and propose some
optional methodological extensions to generalize these methods to different
scenarios and to make them more comparable. We explain the structure and
workflow of the package using real use cases and show how to integrate
additional counterfactual explanation methods into the package. In addition, we
compare the implemented methods for a variety of models and datasets with
regard to the quality of their counterfactual explanations and their runtime
behavior.
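The package itself is written in R; as a language-agnostic illustration of what such methods compute (and explicitly not the package's R6 interface), the Python sketch below does a brute-force search for the closest feature change that flips a classifier's prediction. The toy data, candidate grids, and L1 distance are illustrative assumptions.

    import numpy as np
    from itertools import product
    from sklearn.linear_model import LogisticRegression

    # Toy credit data: [income, debt]; label 1 = approved.
    X = np.array([[50, 30], [80, 10], [30, 40], [90, 5], [40, 35], [70, 20]], dtype=float)
    y = np.array([0, 1, 0, 1, 0, 1])
    clf = LogisticRegression().fit(X, y)

    def counterfactual(x, desired_class, candidate_grids):
        """Brute-force search for the closest feature change flipping the prediction.

        candidate_grids: per-feature lists of values to try. Returns the candidate
        with the desired prediction that is nearest to x in L1 distance.
        """
        best, best_dist = None, np.inf
        for values in product(*candidate_grids):
            cand = np.array(values, dtype=float)
            if clf.predict(cand.reshape(1, -1))[0] == desired_class:
                dist = np.abs(cand - x).sum()
                if dist < best_dist:
                    best, best_dist = cand, dist
        return best

    x = np.array([45.0, 32.0])                     # applicant with an undesired prediction
    grids = [np.arange(30, 95, 5), np.arange(0, 45, 5)]
    print("factual prediction:", clf.predict(x.reshape(1, -1))[0])
    print("counterfactual:", counterfactual(x, desired_class=1, candidate_grids=grids))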
Counterfactual Explanation Generation with s(CASP)
Machine learning models that automate decision-making are increasingly being
used in consequential areas such as loan approvals, pretrial bail, hiring, and
many more. Unfortunately, most of these models are black-boxes, i.e., they are
unable to reveal how they reach these prediction decisions. A need for
transparency demands justification for such predictions. An affected individual
might desire explanations to understand why a decision was made. Ethical and
legal considerations may further require informing the individual of changes in
the input attributes that could be made to produce a desirable outcome. This
paper focuses on the latter problem of automatically generating counterfactual
explanations. Our approach utilizes answer set programming and the s(CASP)
goal-directed ASP system. Answer Set Programming (ASP) is a well-known
knowledge representation and reasoning paradigm. s(CASP) is a goal-directed ASP
system that executes answer-set programs top-down without grounding them. The
query-driven nature of s(CASP) allows us to provide justifications as proof
trees, which makes it possible to analyze the generated counterfactual
explanations. We show how counterfactual explanations are computed and
justified by imagining multiple possible worlds where some or all factual
assumptions are untrue and, more importantly, how we can navigate between these
worlds. We also show how our algorithm can be used to find the Craig
Interpolant for a class of answer set programs for a failing query. Comment: 18 pages
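The flavor of this possible-worlds reasoning can be shown outside ASP as well. Below is a hedged Python sketch (not s(CASP) code and not the paper's algorithm): a rule-based decision is re-evaluated in every "world" obtained by negating a subset of factual assumptions, and the minimal sets of flipped assumptions that yield the desired outcome are reported. The loan-approval rule and the fact names are invented for illustration.

    from itertools import combinations

    # Factual assumptions about a loan applicant (a toy rule base).
    facts = {"employed": False, "credit_ok": True, "has_collateral": False}

    def approved(world):
        """Toy decision rule: approve if credit is ok and (employed or collateral)."""
        return world["credit_ok"] and (world["employed"] or world["has_collateral"])

    def counterfactual_worlds(facts, goal=True):
        """Enumerate minimal sets of assumptions whose negation achieves the goal."""
        names = list(facts)
        for k in range(1, len(names) + 1):
            hits = []
            for subset in combinations(names, k):
                world = dict(facts)
                for name in subset:
                    world[name] = not world[name]     # imagine this assumption is untrue
                if approved(world) == goal:
                    hits.append(subset)
            if hits:                                  # stop at the smallest flip sets
                return hits
        return []

    print(counterfactual_worlds(facts))   # -> [('employed',), ('has_collateral',)]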
Counterfactual explanation of Bayesian model uncertainty
Artificial intelligence systems are becoming ubiquitous in everyday life as well as in high-risk environments such as autonomous driving and medicine. The opaque nature of deep neural networks raises concerns about their adoption in such high-risk settings, and it is important for researchers to explain how these models reach their decisions. Most existing methods rely on the softmax output to explain model decisions. However, softmax is often misleading, in particular assigning unjustifiably high confidence even to samples far from the training data. To overcome this shortcoming, we propose using Bayesian model uncertainty to produce counterfactual explanations. In this paper, we compare counterfactual explanations of models based on Bayesian uncertainty and on the softmax score. Our method produces the minimal set of important features whose change maximally alters the classifier output, thereby explaining the decision-making process of the Bayesian model. We use the MNIST and Caltech Bird 2011 datasets in our experiments. The results show that the Bayesian model outperforms the softmax model and produces more concise and human-understandable counterfactuals.
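A minimal PyTorch sketch of the contrast drawn here (an illustration, not the paper's implementation): a single deterministic pass gives one softmax confidence, while Monte-Carlo dropout keeps dropout active at test time and averages several stochastic passes, yielding a predictive distribution whose entropy reflects model uncertainty. The small untrained network and the choice of MC dropout as the Bayesian approximation are assumptions for brevity.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    # Small classifier with dropout; in practice this would be a trained model.
    model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Dropout(0.5), nn.Linear(128, 10))
    x = torch.randn(1, 784)          # one input sample

    # Softmax confidence from a single deterministic pass.
    model.eval()
    softmax_probs = F.softmax(model(x), dim=-1)
    print("softmax confidence:", softmax_probs.max().item())

    # MC dropout: keep dropout active and average T stochastic passes.
    model.train()                    # keeps dropout enabled at "test" time
    T = 50
    with torch.no_grad():
        mc_probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(T)]).mean(dim=0)

    entropy = -(mc_probs * mc_probs.clamp_min(1e-12).log()).sum().item()
    print("MC-dropout confidence:", mc_probs.max().item())
    print("predictive entropy (uncertainty):", entropy)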
Counterfactual Explanation for Fairness in Recommendation
Fairness-aware recommendation eliminates discrimination issues to build
trustworthy recommendation systems. Explaining the causes of unfair
recommendations is critical, as it promotes fairness diagnostics, and thus
secures users' trust in recommendation models. Existing fairness explanation
methods suffer from high computational burdens due to the large-scale search space and
the greedy nature of the explanation search process. Moreover, they perform
score-based optimizations over continuous values, which are not applicable to
discrete attributes such as gender and race. In this work, we adopt the
counterfactual explanation paradigm from causal inference to explore how
minimal alterations in explanations change model fairness, thereby avoiding
greedy search over explanations. We use real-world attributes from Heterogeneous
Information Networks (HINs) to empower counterfactual reasoning on discrete
attributes. We propose a novel Counterfactual Explanation for Fairness
(CFairER) that generates attribute-level counterfactual explanations from HINs
for recommendation fairness. Our CFairER conducts off-policy reinforcement
learning to seek high-quality counterfactual explanations, with an attentive
action pruning reducing the search space of candidate counterfactuals. The
counterfactual explanations help to provide rational and proximate explanations
for model fairness, while the attentive action pruning narrows the search space
of attributes. Extensive experiments demonstrate that our proposed model can
generate faithful explanations while maintaining favorable recommendation
performance.
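To make the notion of "minimal alterations that change model fairness" concrete, here is a hedged toy sketch (not CFairER's off-policy RL procedure over HINs): every single-attribute alteration of the item data a recommender conditions on is scored by how much it reduces a simple demographic-parity-style gap in recommendation scores. The linear scorer, the attributes, and the fairness metric are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy setting: 200 users in two groups, items described by 4 attributes.
    group = rng.integers(0, 2, size=200)                  # sensitive attribute
    user_pref = rng.normal(size=(200, 4))
    user_pref[group == 1, 2] += 1.5                       # group 1 responds strongly to attribute 2
    item_attrs = rng.normal(size=(50, 4))

    def mean_score_gap(pref, attrs):
        """Demographic-parity-style gap: |mean top score of group 0 - group 1|."""
        scores = pref @ attrs.T                           # user x item scores
        top = scores.max(axis=1)
        return abs(top[group == 0].mean() - top[group == 1].mean())

    base_gap = mean_score_gap(user_pref, item_attrs)

    # Counterfactual probe: which single attribute, when removed, most reduces the gap?
    effects = {}
    for a in range(item_attrs.shape[1]):
        cf_attrs = item_attrs.copy()
        cf_attrs[:, a] = 0.0                              # "world" without attribute a
        effects[a] = base_gap - mean_score_gap(user_pref, cf_attrs)

    print("baseline gap:", round(base_gap, 3))
    print("gap reduction per removed attribute:", {k: round(v, 3) for k, v in effects.items()})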
Counterfactual Explanation of Brain Activity Classifiers Using Image-To-Image Transfer by Generative Adversarial Network
Deep neural networks (DNNs) can accurately decode task-related information from brain activations. However, because of the non-linearity of DNNs, it is generally difficult to explain how and why they assign certain behavioral tasks to given brain activations, either correctly or incorrectly. One promising approach for explaining such a black-box system is counterfactual explanation. In this framework, the behavior of a black-box system is explained by comparing real data with realistic synthetic data that are specifically generated such that the black-box system outputs an unreal outcome. The system's decision can then be explained by directly comparing the real and synthetic data. Recently, by taking advantage of advances in DNN-based image-to-image translation, several studies have successfully applied counterfactual explanation to image domains. In principle, the same approach could be applied to functional magnetic resonance imaging (fMRI) data. Because fMRI datasets often contain multiple classes (e.g., multiple behavioral tasks), the image-to-image transformation applicable to counterfactual explanation needs to learn mappings among multiple classes simultaneously. Recently, a generative neural network (StarGAN) that enables image-to-image transformation among multiple classes has been developed. By adapting StarGAN with some modifications, we introduce a novel generative DNN (counterfactual activation generator, CAG) that can provide counterfactual explanations for DNN-based classifiers of brain activations. Importantly, CAG can simultaneously handle image transformation among all seven classes in a publicly available fMRI dataset, and can thus provide counterfactual explanations for DNN-based multiclass classifiers of brain activations. Furthermore, iterative applications of CAG were able to enhance and extract subtle spatial brain activity patterns that affected the classifier's decisions. Together, these results demonstrate that counterfactual explanation based on image-to-image transformation is a promising approach for understanding and extending the current application of DNNs in fMRI analyses.
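The described workflow can be summarized in a few lines. This is a hedged sketch with untrained stand-in models, not the CAG architecture itself: a class-conditional generator maps a real activation map and a target class to a synthetic counterfactual, the classifier is checked on the counterfactual, and the difference image highlights the features driving the decision; feeding the output back into the generator mimics the iterative application mentioned above.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    N_CLASSES = 7                                  # e.g., seven task classes in the fMRI dataset

    # Stand-ins for trained networks: a class-conditional generator and a classifier.
    class ToyGenerator(nn.Module):
        def __init__(self, dim=64, n_classes=N_CLASSES):
            super().__init__()
            self.net = nn.Linear(dim + n_classes, dim)
        def forward(self, x, target_class):
            onehot = nn.functional.one_hot(target_class, N_CLASSES).float()
            return self.net(torch.cat([x, onehot], dim=-1))

    generator = ToyGenerator()
    classifier = nn.Linear(64, N_CLASSES)          # stand-in for the brain-activity classifier

    def counterfactual_explanation(x, target_class, n_iter=3):
        """Generate a counterfactual map for target_class and its difference image.

        Iterative application (feeding the output back in) is meant to enhance the
        spatial pattern that pushes the classifier toward the target class.
        """
        cf = x
        for _ in range(n_iter):
            cf = generator(cf, target_class)
        diff = cf - x                              # pattern driving the decision change
        predicted = classifier(cf).argmax(dim=-1)
        return cf, diff, predicted

    x = torch.randn(1, 64)                         # stand-in for a flattened activation map
    cf, diff, pred = counterfactual_explanation(x, target_class=torch.tensor([3]))
    print("classifier prediction on counterfactual:", pred.item())
    print("largest changes at features:", diff.abs().topk(5).indices.tolist())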