Sampling, Intervention, Prediction, Aggregation: A Generalized Framework for Model-Agnostic Interpretations
Model-agnostic interpretation techniques allow us to explain the behavior of
any predictive model. Due to differing notations and terminology, it is
difficult to see how these techniques are related. A unified view of these
methods has been missing. We present the generalized SIPA (sampling, intervention, prediction,
aggregation) framework of work stages for model-agnostic interpretations and
demonstrate how several prominent methods for feature effects can be embedded
into the proposed framework. Furthermore, we extend the framework to feature
importance computations by pointing out how variance-based and
performance-based importance measures are based on the same work stages. The
SIPA framework reduces the diverse set of model-agnostic techniques to a single
methodology and establishes a common terminology to discuss them in future
work.
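The four SIPA stages can be illustrated with a partial dependence computation, one of the feature effect methods the framework covers. The model and data below are hypothetical placeholders; this is a minimal sketch of the stages, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box model: any callable mapping X -> predictions.
def model(X):
    return X[:, 0] ** 2 + X[:, 1]

# Sampling stage: a sample of observations from the data distribution.
X = rng.normal(size=(200, 2))

def partial_dependence(model, X, feature, grid):
    """Partial dependence curve via the remaining SIPA stages:
    intervene on one feature, predict, aggregate over the sample."""
    pd_values = []
    for value in grid:
        X_int = X.copy()
        X_int[:, feature] = value          # intervention
        preds = model(X_int)               # prediction
        pd_values.append(preds.mean())     # aggregation
    return np.array(pd_values)

grid = np.linspace(-2, 2, 21)
pd_curve = partial_dependence(model, X, feature=0, grid=grid)
```

Swapping the aggregation stage (e.g. aggregating per observation instead of averaging) yields other methods in the same family, which is the point of the shared framework.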
Visualizing the Feature Importance for Black Box Models
In recent years, a large number of model-agnostic methods for improving the
transparency, trustworthiness and interpretability of machine learning models
have been developed. We introduce local feature importance as a local version of a
recent model-agnostic global feature importance method. Based on local feature
importance, we propose two visual tools: partial importance (PI) and individual
conditional importance (ICI) plots which visualize how changes in a feature
affect the model performance on average, as well as for individual
observations. Our proposed methods are related to partial dependence (PD) and
individual conditional expectation (ICE) plots, but visualize the expected
(conditional) feature importance instead of the expected (conditional)
prediction. Furthermore, we show that averaging ICI curves across observations
yields a PI curve, and integrating the PI curve with respect to the
distribution of the considered feature results in the global feature
importance. Another contribution of our paper is the Shapley feature
importance, which fairly distributes the overall performance of a model among
the features according to the marginal contributions and which can be used to
compare the feature importance across different models.
Comment: To appear in Machine Learning and Knowledge Discovery in Databases:
European Conference, ECML PKDD 2018, Dublin, Ireland, September 10 to 14,
2018, Proceedings, Part
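The relationship between ICI and PI curves described above can be sketched as follows. The model, loss, and function names are illustrative assumptions; the idea is to track, per observation, how the loss changes when the feature of interest is set to each grid value, and then average the ICI curves to obtain a PI curve.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical model and data; squared error as the per-observation loss.
def model(X):
    return 3.0 * X[:, 0] + X[:, 1]

X = rng.normal(size=(100, 2))
y = model(X) + rng.normal(scale=0.1, size=100)

def ici_curves(model, X, y, feature, grid):
    """One ICI curve per observation: change in that observation's loss
    when the feature is replaced by each grid value."""
    base_loss = (model(X) - y) ** 2                  # loss before intervention
    curves = np.empty((len(X), len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value
        curves[:, j] = (model(X_mod) - y) ** 2 - base_loss
    return curves

grid = np.linspace(-2, 2, 11)
ici = ici_curves(model, X, y, feature=0, grid=grid)
pi = ici.mean(axis=0)   # averaging ICI curves yields the PI curve
```

Integrating the PI curve over the feature's distribution would then recover a single global importance number, mirroring the aggregation chain the abstract describes.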
Inverse Classification for Comparison-based Interpretability in Machine Learning
In the context of post-hoc interpretability, this paper addresses the task of
explaining the prediction of a classifier, considering the case where no
information is available, neither on the classifier itself, nor on the
processed data (neither the training nor the test data). It proposes an
instance-based approach whose principle consists in determining the minimal
changes needed to alter a prediction: given a data point whose classification
must be explained, the proposed method consists in identifying a close
neighbour classified differently, where the closeness definition integrates a
sparsity constraint. This principle is implemented using observation generation
in the Growing Spheres algorithm. Experimental results on two datasets
illustrate the relevance of the proposed approach, which can be used to gain
knowledge about the classifier.
Comment: preprint
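A minimal sketch of the sphere-growing idea (without the sparsity step the paper adds on top): sample candidate observations in spherical layers of increasing radius around the instance and return the closest one the classifier labels differently. The classifier and all parameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical black-box classifier: label 1 iff the feature sum exceeds 1.
def classify(X):
    return (X.sum(axis=1) > 1.0).astype(int)

def growing_spheres(x, classify, n=1000, step=0.1, max_radius=5.0):
    """Search spherical layers of increasing radius around x for the
    closest generated observation classified differently from x."""
    target = classify(x[None, :])[0]
    radius = step
    while radius <= max_radius:
        # Uniform random directions, scaled into the layer (radius-step, radius].
        directions = rng.normal(size=(n, x.size))
        directions /= np.linalg.norm(directions, axis=1, keepdims=True)
        radii = rng.uniform(radius - step, radius, size=(n, 1))
        candidates = x + directions * radii
        enemies = candidates[classify(candidates) != target]
        if len(enemies):
            dists = np.linalg.norm(enemies - x, axis=1)
            return enemies[np.argmin(dists)]   # closest differently-classified point
        radius += step
    return None

x = np.zeros(2)                  # classified 0 here (sum = 0 <= 1)
cf = growing_spheres(x, classify)
```

The minimal changes between `x` and the returned neighbour are then the explanation; the paper additionally enforces sparsity in the closeness definition, which this sketch omits.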
Local Interpretation Methods to Machine Learning Using the Domain of the Feature Space
As machine learning becomes an important part of many real world applications
affecting human lives, new requirements, besides high predictive accuracy,
become important. One important requirement is transparency, which has been
associated with model interpretability. Many machine learning algorithms induce
models that are difficult to interpret, known as black boxes. Moreover, people
have difficulty trusting models that cannot be explained. In machine learning
in particular, many groups are investigating new methods that can explain black
box models. These methods usually look inside black box models to explain their
inner workings, allowing the decision-making process of these models to be
interpreted. Among the recently proposed model interpretation methods, there is
a group, known as local estimators, designed to explain how the label of a
particular instance is predicted. To do so, they induce interpretable models in
the neighborhood of the instance to be
explained. Local estimators have been successfully used to explain specific
predictions. Although they provide some degree of model interpretability, it is
still not clear how best to implement and apply them. Open questions include:
how should the neighborhood of an instance be defined? How can the trade-off
between the accuracy of the interpretation method and its interpretability be
controlled? How can the obtained solution be made robust to small variations
of the instance to be explained? To answer these questions, we propose and
investigate two strategies: (i) using data instance properties to provide
improved explanations, and (ii) making sure that the neighborhood of an
instance is properly defined by taking the geometry of the domain of the
feature space into account. We evaluate these strategies in a regression task
and present experimental results showing that they can improve local
explanations.
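Strategy (ii), defining the neighborhood with the geometry of the feature-space domain in mind, can be sketched as a weighted local surrogate that discards sampled neighbors falling outside the domain. The black-box function, domain bounds, and proximity kernel are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical black-box regressor on a bounded feature domain [0, 1]^2.
def black_box(X):
    return np.sin(3 * X[:, 0]) + X[:, 1] ** 2

def local_surrogate(black_box, x, n=500, scale=0.1):
    """Fit a proximity-weighted linear model on a neighborhood of x,
    keeping only neighbors inside the feature domain (here the unit
    square) so the neighborhood respects the domain's geometry."""
    Z = x + rng.normal(scale=scale, size=(n, x.size))
    inside = np.all((Z >= 0.0) & (Z <= 1.0), axis=1)   # domain constraint
    Z = Z[inside]
    # Gaussian proximity weights centered on the instance to explain.
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * scale ** 2))
    A = np.hstack([np.ones((len(Z), 1)), Z])           # intercept + features
    y = black_box(Z)
    # Weighted least squares: solve (A^T W A) beta = (A^T W y).
    beta = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (w * y))
    return beta   # beta[1:] are the local effect estimates

x = np.array([0.5, 0.5])
beta = local_surrogate(black_box, x)
```

Filtering to the domain before fitting is one concrete way to keep the surrogate from being trained on impossible feature combinations, which is the kind of neighborhood-definition question the abstract raises.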
Interpretation of microbiota-based diagnostics by explaining individual classifier decisions
Transparency: Motivations and Challenges
Transparency is often deemed critical to enable effective real-world deployment of intelligent systems. Yet the motivations for and benefits of different types of transparency can vary significantly depending on context, and objective measurement criteria are difficult to identify. We provide a brief survey, suggesting challenges and related concerns, particularly when agents have misaligned interests. We highlight and review settings where transparency may cause harm, discussing connections across privacy, multi-agent game theory, economics, fairness and trust.