Towards A Rigorous Science of Interpretable Machine Learning
As machine learning systems become ubiquitous, there has been a surge of
interest in interpretable machine learning: systems that provide explanations
for their outputs. These explanations are often used to qualitatively assess
other criteria such as safety or non-discrimination. However, despite the
interest in interpretability, there is very little consensus on what
interpretable machine learning is and how it should be measured. In this
position paper, we first define interpretability and describe when
interpretability is needed (and when it is not). Next, we suggest a taxonomy
for rigorous evaluation and expose open questions towards a more rigorous
science of interpretable machine learning.
Interpretable Reinforcement Learning with Ensemble Methods
We propose to use boosted regression trees as a way to compute
human-interpretable solutions to reinforcement learning problems. Boosting
combines several regression trees to improve their accuracy without
significantly reducing their inherent interpretability. Prior work has focused
independently on reinforcement learning and on interpretable machine learning,
but there has been little progress in interpretable reinforcement learning. Our
experimental results show that boosted regression trees compute solutions that
are both interpretable and match the quality of leading reinforcement learning
methods.
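The abstract does not specify an implementation, but the core idea, fitting a boosted ensemble of shallow regression trees as the value function so that the individual trees remain readable, can be sketched roughly as follows. This is a minimal illustration assuming a fitted Q-iteration setup and scikit-learn's GradientBoostingRegressor; the environment interface, hyperparameters, and function names are assumptions for illustration, not the authors' code.

# Hypothetical sketch: fitted Q-iteration with boosted regression trees
# as a human-readable Q-function approximator. Library and setup are
# assumptions, not the paper's implementation.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fitted_q_iteration(transitions, n_actions, gamma=0.99, n_iters=20):
    # transitions: list of (state, action, reward, next_state) tuples
    states = np.array([s for s, a, r, s2 in transitions], dtype=float)
    actions = np.array([a for s, a, r, s2 in transitions], dtype=float)
    rewards = np.array([r for s, a, r, s2 in transitions], dtype=float)
    next_states = np.array([s2 for s, a, r, s2 in transitions], dtype=float)

    X = np.column_stack([states, actions])    # Q-function input: (state, action)
    q = GradientBoostingRegressor(n_estimators=50, max_depth=3)
    q.fit(X, rewards)                          # initialise with immediate rewards

    for _ in range(n_iters):
        # Bellman backup: target = r + gamma * max_a' Q(s', a')
        next_q = np.column_stack([
            q.predict(np.column_stack([next_states,
                                       np.full(len(next_states), a, dtype=float)]))
            for a in range(n_actions)
        ])
        targets = rewards + gamma * next_q.max(axis=1)
        q = GradientBoostingRegressor(n_estimators=50, max_depth=3)
        q.fit(X, targets)
    return q   # each shallow tree in the ensemble can be inspected on its own

Keeping max_depth small is what preserves readability here: every tree in the ensemble is a short decision rule that can be examined individually.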
Techniques for Interpretable Machine Learning
Interpretable machine learning tackles the important problem that humans
cannot understand the behaviors of complex machine learning models and how
these models arrive at a particular decision. Although many approaches have
been proposed, a comprehensive understanding of the achievements and challenges
is still lacking. We provide a survey covering existing techniques to increase
the interpretability of machine learning models. We also discuss crucial issues
that the community should consider in future work such as designing
user-friendly explanations and developing comprehensive evaluation metrics to
further push forward the area of interpretable machine learning.Comment: Accepted by Communications of the ACM (CACM), Review Articl
Model-Agnostic Interpretability of Machine Learning
Understanding why machine learning models behave the way they do empowers
both system designers and end-users in many ways: in selecting models, in
engineering features, in deciding whether to trust and act upon predictions,
and in building more intuitive user interfaces. Thus, interpretability has become a vital concern in
machine learning, and work in the area of interpretable models has found
renewed interest. In some applications, such models are as accurate as
non-interpretable ones, and thus are preferred for their transparency. Even
when they are not as accurate, they may still be preferred when interpretability
is of paramount importance. However, restricting machine learning to
interpretable models is often a severe limitation. In this paper we argue for
explaining machine learning predictions using model-agnostic approaches. By
treating the machine learning models as black-box functions, these approaches
provide crucial flexibility in the choice of models, explanations, and
representations, improving debugging, comparison, and interfaces for a variety
of users and models. We also outline the main challenges for such methods, and
review a recently-introduced model-agnostic explanation approach (LIME) that
addresses these challenges.
Comment: presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY.
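As a rough illustration of the model-agnostic recipe that LIME instantiates (perturb the instance, query the black box, weight samples by proximity, fit a sparse linear surrogate), a minimal sketch follows. The function name, Gaussian perturbation, kernel, and parameters are assumptions for illustration; the authors' `lime` package is the reference implementation.

# Illustrative sketch of model-agnostic local explanation (LIME-style):
# perturb the instance, query the black box, weight samples by proximity,
# and fit a sparse linear surrogate. Not the authors' code.
import numpy as np
from sklearn.linear_model import Lasso

def explain_locally(predict_fn, x, n_samples=5000, scale=1.0, alpha=0.01):
    # Returns per-feature weights of a local linear surrogate around x.
    rng = np.random.default_rng(0)
    # 1. Sample perturbations in the neighbourhood of x.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    # 2. Query the black-box model on the perturbed points.
    y = predict_fn(Z)
    # 3. Weight samples by an exponential kernel on distance to x.
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / (2 * scale ** 2))
    # 4. Fit a sparse, interpretable linear model to the weighted samples.
    surrogate = Lasso(alpha=alpha)
    surrogate.fit(Z, y, sample_weight=w)
    return surrogate.coef_   # sign and magnitude indicate local feature influence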
Meaningful Models: Utilizing Conceptual Structure to Improve Machine Learning Interpretability
The last decade has seen huge progress in the development of advanced machine
learning models; however, those models are powerless unless human users can
interpret them. Here we show how the mind's construction of concepts and
meaning can be used to create more interpretable machine learning models. By
proposing a novel method of classifying concepts, in terms of 'form' and
'function', we elucidate the nature of meaning and offer proposals to improve
model understandability. As machine learning begins to permeate daily life,
interpretable models may serve as a bridge between domain-expert authors and
non-expert users.
Comment: 5 pages, 3 figures, presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY.
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead
Black box machine learning models are currently being used for high stakes
decision-making throughout society, causing problems in healthcare,
criminal justice, and other domains. People have hoped that creating methods
for explaining these black box models will alleviate some of these problems,
but trying to \textit{explain} black box models, rather than creating models
that are \textit{interpretable} in the first place, is likely to perpetuate bad
practices and can potentially cause catastrophic harm to society. There is a
way forward -- it is to design models that are inherently interpretable. This
manuscript clarifies the chasm between explaining black boxes and using
inherently interpretable models, outlines several key reasons why explainable
black boxes should be avoided in high-stakes decisions, identifies challenges
to interpretable machine learning, and provides several example applications
where interpretable models could potentially replace black box models in
criminal justice, healthcare, and computer vision.
Comment: Author's pre-publication version of a 2019 Nature Machine Intelligence article. A shorter version was published in the NIPS 2018 Workshop on Critiquing and Correcting Trends in Machine Learning. It also expands on the NSF Statistics at a Crossroads Webinar.
Interpretability via Model Extraction
The ability to interpret machine learning models has become increasingly
important now that machine learning is used to inform consequential decisions.
We propose an approach called model extraction for interpreting complex,
black-box models. Our approach approximates the complex model with a much more
interpretable model; as long as the approximation quality is good, the
statistical properties of the complex model are reflected in the interpretable
model. We show how model extraction can be used to understand and debug random
forests and neural nets trained on several datasets from the UCI Machine
Learning Repository, as well as control policies learned for several classical
reinforcement learning problems.
Comment: Presented as a poster at the 2017 Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017).
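A minimal sketch of the model-extraction idea, training a small decision tree to mimic a black box's predictions and then reading the tree off directly, might look as follows. The dataset, black-box choice, and hyperparameters here are illustrative assumptions rather than the authors' setup.

# Minimal sketch of model extraction: approximate a black-box model with a
# small decision tree trained on the black box's own predictions.
# Dataset and hyperparameters are illustrative assumptions.
from sklearn.datasets import load_breast_cancer            # stand-in UCI-style dataset
from sklearn.ensemble import RandomForestClassifier         # plays the role of the black box
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(n_estimators=200, random_state=0)
black_box.fit(X_train, y_train)

# Train the interpretable surrogate on the black box's labels, not the ground truth.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

print("fidelity to black box:",
      accuracy_score(black_box.predict(X_test), surrogate.predict(X_test)))
print(export_text(surrogate))   # the extracted model is directly readable

The fidelity score measures how faithfully the surrogate reproduces the black box, which is the "approximation quality" the abstract refers to.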
The Doctor Just Won't Accept That!
Calls to arms to build interpretable models express a well-founded discomfort
with machine learning. Should a software agent that does not even know what a
loan is decide who qualifies for one? Indeed, we ought to be cautious about
injecting machine learning (or anything else, for that matter) into
applications where there may be a significant risk of causing social harm.
However, claims that stakeholders "just won't accept that!" do not provide a
sufficient foundation for a proposed field of study. For the field of
interpretable machine learning to advance, we must ask the following questions:
What precisely won't various stakeholders accept? What do they want? Are these
desiderata reasonable? Are they feasible? In order to answer these questions,
we'll have to give real-world problems and their respective stakeholders
greater consideration.
Comment: Presented at NIPS 2017 Interpretable ML Symposium.
Explaining a black-box using Deep Variational Information Bottleneck Approach
Interpretable machine learning has gained much attention recently. Briefness
and comprehensiveness are necessary in order to provide a large amount of
information concisely when explaining a black-box decision system. However,
existing interpretable machine learning methods fail to consider briefness and
comprehensiveness simultaneously, leading to redundant explanations. We propose
the variational information bottleneck for interpretation, VIBI, a
system-agnostic interpretable method that provides a brief but comprehensive
explanation. VIBI adopts an information-theoretic principle, the information
bottleneck principle, as the criterion for finding such explanations. For each
instance, VIBI selects key features that are maximally compressed about the
input (briefness) and maximally informative about the decision made by the
black-box system on that input (comprehensiveness). We evaluate VIBI on three
datasets and compare it with state-of-the-art interpretable machine learning
methods in terms of both interpretability and fidelity, as assessed by human
and quantitative metrics.
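In symbols (notation assumed here, following the generic information bottleneck formulation rather than the paper's exact loss), VIBI's criterion for choosing an explanation $z$ of an input $x$ with black-box output $y$ can be sketched as

\[ \max_{p(z \mid x)} \; I(z; y) \;-\; \beta \, I(z; x), \]

where $I(\cdot\,;\cdot)$ denotes mutual information and $\beta > 0$ trades off comprehensiveness (the first term, being informative about the black-box decision) against briefness (the second term, staying compressed about the input).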
Explaining Transition Systems through Program Induction
Explaining and reasoning about processes which underlie observed black-box
phenomena enables the discovery of causal mechanisms, derivation of suitable
abstract representations and the formulation of more robust predictions. We
propose to learn high level functional programs in order to represent abstract
models which capture the invariant structure in the observed data. We introduce
the π-machine (program-induction machine) -- an architecture able to induce
interpretable LISP-like programs from observed data traces. We propose an
optimisation procedure for program learning based on backpropagation, gradient
descent and A* search. We apply the proposed method to three problems: system
identification of dynamical systems, explaining the behaviour of a DQN agent
and learning by demonstration in a human-robot interaction scenario. Our
experimental results show that the π-machine can efficiently induce
interpretable programs from individual data traces.
Comment: submitted to Neural Information Processing Systems 2017.
