LIMEtree: Interactively Customisable Explanations Based on Local Surrogate Multi-output Regression Trees
Systems based on artificial intelligence and machine learning models should
be transparent, in the sense of being capable of explaining their decisions to
gain humans' approval and trust. While there are a number of explainability
techniques that can be used to this end, many of them are only capable of
outputting a single one-size-fits-all explanation that simply cannot address
all of the explainees' diverse needs. In this work we introduce a
model-agnostic and post-hoc local explainability technique for black-box
predictions called LIMEtree, which employs surrogate multi-output regression
trees. We validate our algorithm on a deep neural network trained for object
detection in images and compare it against Local Interpretable Model-agnostic
Explanations (LIME). Our method comes with local fidelity guarantees and can
produce a range of diverse explanation types, including contrastive and
counterfactual explanations praised in the literature. Some of these
explanations can be interactively personalised to create bespoke, meaningful
and actionable insights into the model's behaviour. While other methods may
give an illusion of customisability by wrapping, otherwise static, explanations
in an interactive interface, our explanations are truly interactive, in the
sense of allowing the user to "interrogate" a black-box model. LIMEtree can
therefore produce consistent explanations on which an interactive exploratory
process can be built
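To illustrate the general construction described above, the sketch below fits a single multi-output regression tree as a local surrogate of a black-box model around one instance. It is a minimal illustration of the idea rather than the LIMEtree algorithm itself; the `black_box_proba` callable, the Gaussian perturbation scheme and all parameter defaults are assumptions made for the example.

```python
# Minimal sketch: a local surrogate multi-output regression tree (illustrative,
# not the authors' LIMEtree algorithm).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def local_surrogate_tree(black_box_proba, instance, n_samples=1000,
                         scale=0.3, kernel_width=0.75, max_depth=3,
                         random_state=0):
    # Perturb the explained instance with Gaussian noise (tabular setting).
    rng = np.random.default_rng(random_state)
    samples = instance + rng.normal(scale=scale,
                                    size=(n_samples, instance.size))
    # Multi-output targets: one probability per class from the black box.
    targets = black_box_proba(samples)
    # Weight samples by an exponential kernel over distance to the instance.
    distances = np.linalg.norm(samples - instance, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # One shallow tree approximates all class probabilities at once, which is
    # what makes contrastive ("why A rather than B?") readings possible.
    surrogate = DecisionTreeRegressor(max_depth=max_depth,
                                      random_state=random_state)
    surrogate.fit(samples, targets, sample_weight=weights)
    return surrogate

# Example usage (assuming a scikit-learn-style classifier `model`):
# tree = local_surrogate_tree(model.predict_proba, x_explained)
```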
One Explanation Does Not Fit All: The Promise of Interactive Explanations for Machine Learning Transparency
The need for transparency of predictive systems based on Machine Learning
algorithms arises as a consequence of their ever-increasing proliferation in
the industry. Whenever black-box algorithmic predictions influence human
affairs, the inner workings of these algorithms should be scrutinised and their
decisions explained to the relevant stakeholders, including the system
engineers, the system's operators and the individuals whose case is being
decided. While a variety of interpretability and explainability methods is
available, none of them is a panacea that can satisfy all diverse expectations
and competing objectives that might be required by the parties involved. We
address this challenge in this paper by discussing the promises of Interactive
Machine Learning for improved transparency of black-box systems using the
example of contrastive explanations -- a state-of-the-art approach to
Interpretable Machine Learning.
Specifically, we show how to personalise counterfactual explanations by
interactively adjusting their conditional statements and extract additional
explanations by asking follow-up "What if?" questions. Our experience in
building, deploying and presenting this type of system allowed us to list
desired properties as well as potential limitations, which can be used to guide
the development of interactive explainers. While customising the medium of
interaction, i.e., the user interface comprising various communication
channels, may give an impression of personalisation, we argue that adjusting
the explanation itself and its content is more important. To this end,
properties such as breadth, scope, context, purpose and target of the
explanation have to be considered, in addition to explicitly informing the
explainee about its limitations and caveats...
Comment: Published in the Künstliche Intelligenz journal, special issue on Challenges in Interactive Machine Learning.
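The sketch below illustrates, under assumed names and a deliberately simple brute-force search, how a counterfactual explanation can be personalised by letting the user decide which features may change and what values they may take; a follow-up "What if?" question then amounts to re-running the search with updated constraints. It is an illustrative toy, not the system described in the paper.

```python
# Toy sketch of interactively personalised counterfactuals (illustrative only).
import itertools
import numpy as np

def personalised_counterfactual(predict, instance, candidate_values,
                                mutable_features, target_class):
    # `predict` maps a (d,) array to a class label; `candidate_values` maps a
    # feature index to the values the user allows that feature to take.
    best, best_cost = None, np.inf
    allowed = [candidate_values[f] for f in mutable_features]
    # Enumerate value combinations over the user-approved features only.
    for combo in itertools.product(*allowed):
        candidate = instance.astype(float)
        candidate[list(mutable_features)] = combo
        if predict(candidate) == target_class:
            cost = np.abs(candidate - instance).sum()  # effort/sparsity proxy
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best  # None if no counterfactual satisfies the user's conditions

# A follow-up "What if?" question is a re-run with updated conditions, e.g.:
# cf = personalised_counterfactual(model_predict, x,
#                                  {0: [1, 2, 3], 2: [0.0, 0.5]},
#                                  mutable_features=(0, 2), target_class=1)
```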
Towards Faithful and Meaningful Interpretable Representations
Interpretable representations are the backbone of many black-box explainers. They translate the low-level data representation necessary for good predictive performance into high-level human-intelligible concepts used to convey the explanation. Notably, the explanation type and its cognitive complexity are directly controlled by the interpretable representation, making it possible to target a particular audience and use case. However, many explainers that rely on interpretable representations overlook their merit and fall back on default solutions, which may introduce implicit assumptions, thereby degrading the explanatory power of such techniques. To address this problem, we study properties of interpretable representations that encode the presence and absence of human-comprehensible concepts. We show how they are operationalised for tabular, image and text data, discussing their strengths and weaknesses. Finally, we analyse their explanatory properties in the context of tabular data, where a linear model is used to quantify the importance of interpretable concepts.
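As a concrete flavour of the kind of interpretable representation discussed above, the sketch below discretises tabular features into binary, human-readable concepts and fits a linear surrogate whose coefficients quantify concept importance. The binning scheme, function names and use of ridge regression are illustrative assumptions, not details taken from the paper.

```python
# Sketch: binary interpretable representation for tabular data plus a linear
# surrogate quantifying concept importance (illustrative assumptions).
import numpy as np
from sklearn.linear_model import Ridge

def to_concepts(X, bin_edges):
    # Encode each row of X as binary indicators of which interval (concept)
    # every feature falls into, e.g. "age > 40" or "income in [20k, 50k)".
    concepts = []
    for j, edges in enumerate(bin_edges):
        bins = np.digitize(X[:, j], edges)             # interval index per row
        concepts.append(np.eye(len(edges) + 1)[bins])  # one-hot per feature
    return np.hstack(concepts)

def concept_importance(black_box_proba, X_local, bin_edges, class_index=1):
    # Fit a linear surrogate on the binary interpretable representation; its
    # coefficients quantify how much each on/off concept matters locally.
    Z = to_concepts(X_local, bin_edges)
    y = black_box_proba(X_local)[:, class_index]
    return Ridge(alpha=1.0).fit(Z, y).coef_
```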
(Un)reasonable Allure of Ante-hoc Interpretability for High-stakes Domains: Transparency Is Necessary but Insufficient for Explainability
Ante-hoc interpretability has become the holy grail of explainable machine
learning for high-stakes domains such as healthcare; however, this notion is
elusive, lacks a widely-accepted definition and depends on the deployment
context. It can refer to predictive models whose structure adheres to
domain-specific constraints, or ones that are inherently transparent. The
latter notion assumes observers who judge this quality, whereas the former
presupposes them to have technical and domain expertise, in certain cases
rendering such models unintelligible. Additionally, its distinction from the
less desirable post-hoc explainability, which refers to methods that construct
a separate explanatory model, is vague given that transparent predictors may
still require (post-)processing to yield satisfactory explanatory insights.
Ante-hoc interpretability is thus an overloaded concept that comprises a range
of implicit properties, which we unpack in this paper to better understand what
is needed for its safe deployment across high-stakes domains. To this end, we
outline model- and explainer-specific desiderata that allow us to navigate its
distinct realisations in view of the envisaged application and audience.
Navigating Explanatory Multiverse Through Counterfactual Path Geometry
Counterfactual explanations are the de facto standard when tasked with
interpreting decisions of (opaque) predictive models. Their generation is often
subject to algorithmic and domain-specific constraints -- such as density-based
feasibility, and attribute (im)mutability or directionality of change -- that
aim to maximise their real-life utility. In addition to desiderata with respect
to the counterfactual instance itself, existence of a viable path connecting it
with the factual data point, known as algorithmic recourse, has become an
important technical consideration. While both of these requirements ensure that
the steps of the journey as well as its destination are admissible, current
literature neglects the multiplicity of such counterfactual paths. To address
this shortcoming we introduce the novel concept of explanatory multiverse that
encompasses all the possible counterfactual journeys. We then show how to
navigate, reason about and compare the geometry of these trajectories with two
methods: vector spaces and graphs. To this end, we overview their spatial
properties -- such as affinity, branching, divergence and possible future
convergence -- and propose an all-in-one metric, called opportunity potential,
to quantify them. Implementing this (possibly interactive) explanatory process
grants explainees agency by allowing them to select counterfactuals based on
the properties of the journey leading to them in addition to their absolute
differences. We show the flexibility, benefit and efficacy of such an approach
through examples and quantitative evaluation on the German Credit and MNIST
data sets.
Comment: Workshop on Counterfactuals in Minds and Machines at the 2023 International Conference on Machine Learning (ICML).
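A toy rendering of the graph-based view can help fix ideas: data points become nodes, feasible steps become directed edges, and each counterfactual journey is a path from the factual node to any node with the desired prediction. The branching score used below is a crude stand-in for path quality and is not the opportunity potential metric proposed in the paper; all names and thresholds are assumptions.

```python
# Toy sketch of an explanatory multiverse as a directed graph (illustrative).
import itertools
import networkx as nx
import numpy as np

def explanatory_multiverse(points, predict, factual_idx, target_class,
                           step=1.0):
    # Connect points whose distance is at most `step` as a crude feasibility
    # proxy; each edge is one admissible move in feature space.
    graph = nx.DiGraph()
    graph.add_nodes_from(range(len(points)))
    for i, j in itertools.permutations(range(len(points)), 2):
        if np.linalg.norm(points[i] - points[j]) <= step:
            graph.add_edge(i, j)
    goals = [i for i in graph if predict(points[i]) == target_class]
    # Enumerate simple counterfactual journeys and score each by how many
    # alternative continuations (branches) remain available along the way.
    journeys = []
    for goal in goals:
        for path in nx.all_simple_paths(graph, factual_idx, goal):
            branching = sum(graph.out_degree(node) - 1 for node in path[:-1])
            journeys.append((path, branching))
    return sorted(journeys, key=lambda item: -item[1])
```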