10 research outputs found
Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications
There has been a growing interest in model-agnostic methods that can make
deep learning models more transparent and explainable to a user. Some
researchers recently argued that for a machine to achieve a certain degree of
human-level explainability, this machine needs to provide human causally
understandable explanations, also known as causability. A specific class of
algorithms that have the potential to provide causability are counterfactuals.
This paper presents an in-depth systematic review of the diverse existing body
of literature on counterfactuals and causability for explainable artificial
intelligence. We performed an LDA topic modelling analysis under a PRISMA
framework to find the most relevant literature articles. This analysis resulted
in a novel taxonomy that considers the grounding theories of the surveyed
algorithms, together with their underlying properties and applications in
real-world data. This research suggests that current model-agnostic
counterfactual algorithms for explainable AI are not grounded on a causal
theoretical formalism and, consequently, cannot promote causability to a human
decision-maker. Our findings suggest that the explanations derived from major
algorithms in the literature provide spurious correlations rather than
cause/effects relationships, leading to sub-optimal, erroneous or even biased
explanations. This paper also advances the literature with new directions and
challenges on promoting causability in model-agnostic approaches for
explainable artificial intelligence
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations
Counterfactual explanations (CFEs) exemplify how to minimally modify a
feature vector to achieve a different prediction for an instance. CFEs can
enhance informational fairness and trustworthiness, and provide suggestions for
users who receive adverse predictions. However, recent research has shown that
multiple CFEs can be offered for the same instance or instances with slight
differences. Multiple CFEs provide flexible choices and cover diverse
desiderata for user selection. However, individual fairness and model
reliability will be damaged if unstable CFEs with different costs are returned.
Existing methods fail to exploit flexibility and address the concerns of
non-robustness simultaneously. To address these issues, we propose a
conceptually simple yet effective solution named Counterfactual Explanations
with Minimal Satisfiable Perturbations (CEMSP). Specifically, CEMSP constrains
changing values of abnormal features with the help of their semantically
meaningful normal ranges. For efficiency, we model the problem as a Boolean
satisfiability problem to modify as few features as possible. Additionally,
CEMSP is a general framework and can easily accommodate more practical
requirements, e.g., casualty and actionability. Compared to existing methods,
we conduct comprehensive experiments on both synthetic and real-world datasets
to demonstrate that our method provides more robust explanations while
preserving flexibility.Comment: Accepted by CIKM 202
An empirical study on how humans appreciate automated counterfactual explanations which embrace imprecise information
The explanatory capacity of interpretable fuzzy rule-based classifiers is usually limited to offering explanations for the predicted class only. A lack of potentially useful explanations for non-predicted alternatives can be overcome by designing methods for the so-called counterfactual reasoning. Nevertheless, state-of-the-art methods for counterfactual explanation generation require special attention to human evaluation aspects, as the final decision upon the classification under consideration is left for the end user. In this paper, we first introduce novel methods for qualitative and quantitative counterfactual explanation generation. Then, we carry out a comparative analysis of qualitative explanation generation methods operating on (combinations of) linguistic terms as well as a quantitative method suggesting precise changes in feature values. Then, we propose a new metric for assessing the perceived complexity of the generated explanations. Further, we design and carry out two human evaluation experiments to assess the explanatory power of the aforementioned methods. As a major result, we show that the estimated explanation complexity correlates well with the informativeness, relevance, and readability of explanations perceived by the targeted study participants. This fact opens the door to using the new automatic complexity metric for guiding multi-objective evolutionary explainable fuzzy modeling in the near futureIlia Stepin is an FPI researcher (grant PRE2019-090153). Jose M. Alonso-Moral is a Ramon y Cajal researcher (grant RYC-2016–19802). This work was supported by the Spanish Ministry of Science and Innovation (grants RTI2018-099646-B-I00, PID2021-123152OB-C21, and TED2021-130295B-C33) and the Galician Ministry of Culture, Education, Professional Training and University (grants ED431F2018/02, ED431G2019/04, and ED431C2022/19). All the grants were co-funded by the European Regional Development Fund (ERDF/FEDER program).S
On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse
This recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users likely invalidate the suggested recourse after being nosily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code-base for counterfactual explanation and algorithmic recourse algorithms and the vast array of evaluation measures in literature make it difficult to compare the per formance of different algorithms. To solve this problem, we provide an open source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new
experiments. In summary, our work contributes to a more reliable interaction of end users and machine learned models by covering fundamental aspects of the recourse process and suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse