A Categorisation of Post-hoc Explanations for Predictive Models
The ubiquity of machine learning based predictive models in modern society
naturally leads people to ask how trustworthy those models are. In predictive
modeling, it is quite common to face a trade-off between accuracy and
interpretability. For instance, doctors would like to know how effective some
treatment will be for a patient, or why the model suggested a particular
medication for a patient exhibiting certain symptoms. We acknowledge that the
necessity for interpretability is a consequence of an incomplete formalisation
of the problem, or more precisely of multiple meanings attached to a particular
concept. For certain problems, it is not enough to get the answer (what); the
model also has to provide an explanation of how it came to that conclusion
(why), because a correct prediction only partially solves the original
problem. In this article we extend an existing categorisation of techniques to aid
model interpretability and test this categorisation.
Comment: 5 pages, 3 figures, AAAI 2019 Spring Symposia (#SSS19)
VINE: Visualizing Statistical Interactions in Black Box Models
As machine learning becomes more pervasive, there is an urgent need for
interpretable explanations of predictive models. Prior work has developed
effective methods for visualizing global model behavior, as well as generating
local (instance-specific) explanations. However, relatively little work has
addressed regional explanations - how groups of similar instances behave in a
complex model, and the related issue of visualizing statistical feature
interactions. The lack of utilities available for these analytical needs
hinders the development of models that are mission-critical, transparent, and
aligned with social goals. We present VINE (Visual INteraction Effects), a novel
algorithm to extract and visualize statistical interaction effects in black box
models. We also present a novel evaluation metric for visualizations in the
interpretable ML space.
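A minimal Python sketch of the general idea behind regional explanations (not the VINE algorithm itself): compute ICE (individual conditional expectation) curves for one feature and cluster them, so that groups of instances reacting differently to the same feature change, a hint of statistical interactions, become visible. The model, dataset, feature index, and cluster count below are illustrative assumptions.

# Sketch: cluster ICE curves to expose "regional" behaviour, i.e. groups of
# instances whose response to a single feature differs.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.cluster import KMeans

X, y = make_friedman1(n_samples=500, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)   # stand-in black box

feature = 0                                   # feature whose effect we inspect (assumption)
grid = np.linspace(X[:, feature].min(), X[:, feature].max(), 20)

# ICE curve for each instance: prediction as the chosen feature sweeps the grid.
ice = np.empty((X.shape[0], grid.size))
for j, v in enumerate(grid):
    X_mod = X.copy()
    X_mod[:, feature] = v
    ice[:, j] = model.predict(X_mod)

# Centre each curve, then cluster: distinct clusters hint at interactions with
# other features, since the same feature change affects instance groups differently.
ice_centred = ice - ice.mean(axis=1, keepdims=True)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(ice_centred)
for k in range(3):
    print(f"cluster {k}: {np.sum(labels == k)} instances, "
          f"mean effect range {np.ptp(ice_centred[labels == k], axis=1).mean():.3f}")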
Interpretable Deep Convolutional Neural Networks via Meta-learning
Model interpretability is a requirement in many applications in which crucial
decisions are made by users relying on a model's outputs. The recent movement
for "algorithmic fairness" also stipulates explainability, and therefore
interpretability of learning models. And yet the most successful contemporary
Machine Learning approaches, the Deep Neural Networks, produce models that are
highly non-interpretable. We attempt to address this challenge by proposing a
technique called CNN-INTE to interpret deep Convolutional Neural Networks (CNN)
via meta-learning. In this work, we interpret a specific hidden layer of the
deep CNN model on the MNIST image dataset. We use a clustering algorithm in a
two-level structure to find the meta-level training data and Random Forest as
the base learning algorithm to generate the meta-level test data. The
interpretation results are displayed visually via diagrams, which clearly
indicate how a specific test instance is classified. Our method achieves
global interpretation for all the test instances without sacrificing the
accuracy obtained by the original deep CNN model. This means our model is
faithful to the deep CNN model, which leads to reliable interpretations.
Comment: 9 pages, 9 figures, 2018 International Joint Conference on Neural Networks, in press
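The pipeline is only outlined in the abstract, so the following is a heavily simplified Python sketch of the idea rather than CNN-INTE itself: given hidden-layer activations of a trained CNN (here a random stand-in), cluster them into regions and fit a Random Forest surrogate so that a test instance's classification can be traced back to an activation region. All data and shapes are assumptions.

# Sketch: cluster hidden-layer activations and explain a test instance via the
# region it falls in plus a Random Forest surrogate trained on those activations.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
H = rng.normal(size=(1000, 64))                 # stand-in hidden-layer activations (assumption)
y = (H[:, :4].sum(axis=1) > 0).astype(int)      # stand-in labels (assumption)

H_tr, H_te, y_tr, y_te = train_test_split(H, y, test_size=0.2, random_state=0)

# Level 1: cluster activations into interpretable regions of the hidden space.
clusters = KMeans(n_clusters=10, n_init=10, random_state=0).fit(H_tr)

# Level 2: a Random Forest surrogate links activation regions to class labels.
surrogate = RandomForestClassifier(n_estimators=100, random_state=0).fit(H_tr, y_tr)

# Interpret one test instance: which region it lands in, and what the surrogate says.
i = 0
region = clusters.predict(H_te[i:i + 1])[0]
pred = surrogate.predict(H_te[i:i + 1])[0]
print(f"test instance {i}: activation cluster {region}, surrogate prediction {pred}, "
      f"true label {y_te[i]}")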
Interpreting Adversarial Examples with Attributes
The vulnerability of deep computer vision systems to imperceptible, carefully
crafted noise has raised questions about the robustness of their decisions. We
take a step back and approach this problem from an orthogonal
direction. We propose to enable black-box neural networks to justify their
reasoning both for clean and for adversarial examples by leveraging attributes,
i.e. visually discriminative properties of objects. We rank attributes based on
their class relevance, i.e. how the classification decision changes when the
input is visually slightly perturbed, as well as image relevance, i.e. how well
the attributes can be localized on both clean and perturbed images. We present
comprehensive experiments for attribute prediction, adversarial example
generation, adversarially robust learning, and their qualitative and
quantitative analysis using predicted attributes on three benchmark datasets.
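As an illustration of ranking attributes by class relevance (assuming a linear attribute-to-class head, which the abstract does not specify), one can score each attribute by its contribution to the predicted class score; a rough Python sketch:

# Sketch: rank attributes by their contribution to the predicted class score
# for a single image, under an assumed linear attribute-to-class mapping.
import numpy as np

rng = np.random.default_rng(0)
n_attrs, n_classes = 12, 5
attr_names = [f"attr_{k}" for k in range(n_attrs)]      # hypothetical attribute names
W = rng.normal(size=(n_classes, n_attrs))               # attribute-to-class weights (assumption)
a = rng.uniform(size=n_attrs)                           # predicted attribute scores for one image

scores = W @ a
pred_class = int(np.argmax(scores))

# Contribution of each attribute to the predicted class score.
contrib = W[pred_class] * a
ranking = np.argsort(-np.abs(contrib))
for k in ranking[:5]:
    print(f"{attr_names[k]}: contribution {contrib[k]:+.3f}")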
Explainable Matrix -- Visualization for Global and Local Interpretability of Random Forest Classification Ensembles
Over the past decades, classification models have proven to be essential
machine learning tools given their potential and applicability in various
domains. In these years, the goal of most researchers has been to improve
quantitative metrics, despite how little information about models' decisions
such metrics convey. This paradigm has recently shifted, and
strategies beyond tables and numbers to assist in interpreting models'
decisions are increasing in importance. As part of this trend, visualization
techniques have been extensively used to support classification models'
interpretability, with a significant focus on rule-based models. Despite the
advances, the existing approaches present limitations in terms of visual
scalability, and the visualization of large and complex models, such as the
ones produced by the Random Forest (RF) technique, remains a challenge. In this
paper, we propose Explainable Matrix (ExMatrix), a novel visualization method
for RF interpretability that can handle models with massive quantities of
rules. It employs a simple yet powerful matrix-like visual metaphor, where rows
are rules, columns are features, and cells are rule predicates, enabling the
analysis of entire models and auditing classification results. ExMatrix
applicability is confirmed via different examples, showing how it can be used
in practice to promote the interpretability of RF models.
Comment: IEEE VIS VAST 2020
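A minimal Python sketch of the data structure ExMatrix is built on (not the visualization itself): extract root-to-leaf rules from a fitted Random Forest and lay them out as a rule-by-feature matrix whose cells are the predicates (feature intervals). Dataset and model sizes are illustrative.

# Sketch: turn each root-to-leaf path of a Random Forest into a rule, then print
# a rule x feature matrix where each cell is that rule's predicate interval.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=3, max_depth=3, random_state=0).fit(X, y)

def tree_rules(tree):
    """Yield one {feature: (low, high)} interval dict per leaf of a fitted tree."""
    t = tree.tree_
    def walk(node, bounds):
        if t.children_left[node] == -1:               # leaf node
            yield dict(bounds)
            return
        f, thr = t.feature[node], t.threshold[node]
        lo, hi = bounds.get(f, (-np.inf, np.inf))
        yield from walk(t.children_left[node], {**bounds, f: (lo, min(hi, thr))})
        yield from walk(t.children_right[node], {**bounds, f: (max(lo, thr), hi)})
    yield from walk(0, {})

rules = [r for est in rf.estimators_ for r in tree_rules(est)]

# Rule x feature matrix: each cell holds the predicate interval used by that rule.
n_features = X.shape[1]
for i, rule in enumerate(rules[:5]):
    row = [f"({rule[f][0]:.2f}, {rule[f][1]:.2f}]" if f in rule else "   -   "
           for f in range(n_features)]
    print(f"rule {i:2d}: " + " | ".join(row))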
A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations
Human-in-the-loop data analysis applications necessitate greater transparency
in machine learning models for experts to understand and trust their decisions.
To this end, we propose a visual analytics workflow to help data scientists and
domain experts explore, diagnose, and understand the decisions made by a binary
classifier. The approach leverages "instance-level explanations", measures of
local feature relevance that explain single instances, and uses them to build a
set of visual representations that guide the users in their investigation. The
workflow is based on three main visual representations and steps: one based on
aggregate statistics to see how data distributes across correct / incorrect
decisions; one based on explanations to understand which features are used to
make these decisions; and one based on raw data, to derive insights on
potential root causes for the observed patterns. The workflow is derived from a
long-term collaboration with a group of machine learning and healthcare
professionals who used our method to make sense of machine learning models they
developed. The case study from this collaboration demonstrates that the
proposed workflow helps experts derive useful knowledge about the model and the
phenomena it describes, and thus generate useful hypotheses on how a model can
be improved.
Comment: Published at IEEE Conference on Visual Analytics Science and Technology (IEEE VAST 2017)
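A rough Python sketch of the workflow's first two steps, under the assumptions that the binary classifier is a logistic regression and that coefficient-times-value serves as the instance-level feature relevance: split the test set into correct and incorrect decisions, then aggregate the explanations within each group.

# Sketch: aggregate instance-level explanations separately for correct and
# incorrect predictions of a binary classifier.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X = StandardScaler().fit_transform(data.data)
X_tr, X_te, y_tr, y_te = train_test_split(X, data.target, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)
correct = pred == y_te

# Instance-level explanation: per-feature contribution to the decision function.
contrib = X_te * clf.coef_[0]

# Step 1: how the data splits across correct / incorrect decisions.
print(f"correct: {correct.sum()}  incorrect: {(~correct).sum()}")

# Step 2: which features drive the decisions, aggregated within each group.
for name, mask in [("correct", correct), ("incorrect", ~correct)]:
    if not mask.any():
        print(name, "no instances")
        continue
    mean_abs = np.abs(contrib[mask]).mean(axis=0)
    top = np.argsort(-mean_abs)[:3]
    print(name, [(data.feature_names[i], round(float(mean_abs[i]), 3)) for i in top])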
Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance
Individual neurons in convolutional neural networks supervised for
image-level classification tasks have been shown to implicitly learn
semantically meaningful concepts ranging from simple textures and shapes to
whole or partial objects - forming a "dictionary" of concepts acquired through
the learning process. In this work we introduce a simple, efficient zero-shot
learning approach based on this observation. Our approach, which we call Neuron
Importance-Aware Weight Transfer (NIWT), learns to map domain knowledge about
novel "unseen" classes onto this dictionary of learned concepts and then
optimizes for network parameters that can effectively combine these concepts -
essentially learning classifiers by discovering and composing learned semantic
concepts in deep networks. Our approach shows improvements over previous
approaches on the CUBirds and AWA2 generalized zero-shot learning benchmarks.
We demonstrate our approach on a diverse set of semantic inputs as external
domain knowledge, including attributes and natural language captions. Moreover,
by learning inverse mappings, NIWT can provide visual and textual explanations
for the predictions made by the newly learned classifiers and provide neuron
names. Our code is available at
https://github.com/ramprs/neuron-importance-zsl.
Comment: In Proceedings of ECCV 2018
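A greatly simplified Python sketch of the transfer idea (not the NIWT optimization): learn a linear map from class attribute vectors to per-class neuron-importance vectors on seen classes, then predict importances for unseen classes from their attributes alone. All shapes and data below are illustrative stand-ins.

# Sketch: map semantic class descriptions to neuron-importance vectors, then
# transfer to unseen classes.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_seen, n_unseen, n_attrs, n_neurons = 40, 10, 85, 512

attrs_seen = rng.uniform(size=(n_seen, n_attrs))        # domain knowledge for seen classes
importance_seen = rng.normal(size=(n_seen, n_neurons))  # stand-in neuron importances (assumption)
attrs_unseen = rng.uniform(size=(n_unseen, n_attrs))    # domain knowledge for unseen classes

# Learn the attribute-to-neuron-importance mapping on seen classes.
mapping = Ridge(alpha=1.0).fit(attrs_seen, importance_seen)

# Transfer: estimate neuron importances for unseen classes, to serve as the
# starting point for their classifier weights.
importance_unseen = mapping.predict(attrs_unseen)
print(importance_unseen.shape)   # (10, 512): one importance vector per unseen class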
Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models
Interpretation and diagnosis of machine learning models have gained renewed
interest in recent years with breakthroughs in new approaches. We present
Manifold, a framework that utilizes visual analysis techniques to support
interpretation, debugging, and comparison of machine learning models in a more
transparent and interactive manner. Conventional techniques usually focus on
visualizing the internal logic of a specific model type (e.g., deep neural
networks), lacking the ability to extend to a more complex scenario where
different model types are integrated. To this end, Manifold is designed as a
generic framework that does not rely on or access the internal logic of the
model and solely observes the input (i.e., instances or features) and the
output (i.e., the predicted result and probability distribution). We describe
the workflow of Manifold as an iterative process consisting of three major
phases that are commonly involved in the model development and diagnosis
process: inspection (hypothesis), explanation (reasoning), and refinement
(verification). The visual components supporting these tasks include a
scatterplot-based visual summary that overviews the models' outcome and a
customizable tabular view that reveals feature discrimination. We demonstrate
current applications of the framework on the classification and regression
tasks and discuss other potential machine learning use scenarios where Manifold
can be applied.
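A minimal Python sketch of the model-agnostic comparison that Manifold's scatterplot summary builds on, assuming two scikit-learn classifiers as the black boxes: only inputs and output probabilities are observed, and instances are partitioned by pairwise (dis)agreement.

# Sketch: compare two black-box models using only their output probabilities,
# and count instances in each agreement quadrant of the scatterplot view.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model_a = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)
model_b = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Observe only the outputs: probability assigned to the true class by each model.
p_a = model_a.predict_proba(X_te)[np.arange(len(y_te)), y_te]
p_b = model_b.predict_proba(X_te)[np.arange(len(y_te)), y_te]

# The scatterplot view plots (p_a, p_b); here we only count the four quadrants
# (both right, only A right, only B right, both wrong) at a 0.5 threshold.
a_ok, b_ok = p_a >= 0.5, p_b >= 0.5
for label, mask in [("both correct", a_ok & b_ok), ("only A", a_ok & ~b_ok),
                    ("only B", ~a_ok & b_ok), ("both wrong", ~a_ok & ~b_ok)]:
    print(f"{label}: {mask.sum()} instances")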
MAGIX: Model Agnostic Globally Interpretable Explanations
Explaining the behavior of a black box machine learning model at the instance
level is useful for building trust. However, it is also important to understand
how the model behaves globally. Such an understanding provides insight into
both the data on which the model was trained and the patterns that it learned.
We present here an approach that learns if-then rules to globally explain the
behavior of black box machine learning models that have been used to solve
classification problems. The approach works by first extracting conditions that
were important at the instance level and then evolving rules through a genetic
algorithm with an appropriate fitness function. Collectively, these rules
represent the patterns the model follows when making decisions and are useful for
understanding its behavior. We demonstrate the validity and usefulness of the
approach by interpreting black box models created using publicly available data
sets as well as a private digital marketing data set.
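A toy Python sketch of the general recipe (not the MAGIX implementation): evolve if-then rules, conjunctions of per-feature threshold conditions, whose fitness measures how well they describe one predicted class of a black-box model. The rule encoding, evolutionary operators, and fitness below are simplified assumptions.

# Sketch: evolve threshold rules that describe the class-0 region of a black box.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)
target = black_box.predict(X) == 0            # explain the model's class-0 predictions

def random_rule():
    """A rule is a list of (feature, threshold, direction) conditions, AND-ed together."""
    k = rng.integers(1, 3)
    feats = rng.choice(X.shape[1], size=k, replace=False)
    return [(int(f), float(rng.uniform(X[:, f].min(), X[:, f].max())),
             int(rng.choice([-1, 1]))) for f in feats]

def covers(rule):
    mask = np.ones(len(X), dtype=bool)
    for f, thr, d in rule:
        mask &= (X[:, f] <= thr) if d < 0 else (X[:, f] > thr)
    return mask

def fitness(rule):
    mask = covers(rule)
    if mask.sum() == 0:
        return 0.0
    precision = target[mask].mean()            # covered instances the model calls class 0
    coverage = mask[target].mean()             # class-0 instances the rule covers
    return precision * coverage                # simplistic fitness (assumption)

# Evolve: keep the fittest rules each generation and refill the population with
# fresh random candidates (a real GA would also apply crossover and mutation).
population = [random_rule() for _ in range(50)]
for _ in range(30):
    population.sort(key=fitness, reverse=True)
    population = population[:10] + [random_rule() for _ in range(40)]

best = max(population, key=fitness)
print("best rule:", best, "fitness:", round(fitness(best), 3))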
Explainability in Human-Agent Systems
This paper presents a taxonomy of explainability in Human-Agent Systems. We
consider fundamental questions about the Why, Who, What, When and How of
explainability. First, we define explainability, and its relationship to the
related terms of interpretability, transparency, explicitness, and
faithfulness. These definitions allow us to answer why explainability is needed
in the system, whom it is geared to and what explanations can be generated to
meet this need. We then consider when the user should be presented with this
information. Last, we consider how objective and subjective measures can be
used to evaluate the entire system. This last question is the most encompassing
as it will need to evaluate all other issues regarding explainability.