Problems with Shapley-value-based explanations as feature importance measures
Game-theoretic formulations of feature importance have become popular as a
way to "explain" machine learning models. These methods define a cooperative
game between the features of a model and distribute influence among these input
elements using some form of the game's unique Shapley values. Justification for
these methods rests on two pillars: their desirable mathematical properties,
and their applicability to specific motivations for explanations. We show that
mathematical problems arise when Shapley values are used for feature importance
and that the solutions to mitigate these necessarily induce further complexity,
such as the need for causal reasoning. We also draw on additional literature to
argue that Shapley values do not provide explanations which suit human-centric
goals of explainability.
Comment: Accepted to ICML 2020
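For reference, the attribution scheme this abstract refers to is the classical Shapley value applied to a per-instance value function over feature coalitions. The form below is the textbook definition rather than the paper's own notation; the conditional-expectation value function is shown as one common choice, and the choice of this value function is itself among the sources of the problems the paper discusses.

```latex
% Shapley attribution for feature i of model f at input x (textbook form).
% N is the set of features; v_x(S) is the coalition value, here taken as the
% conditional expectation of the model output given the features in S.
\phi_i(v_x) = \sum_{S \subseteq N \setminus \{i\}}
    \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}
    \Bigl( v_x(S \cup \{i\}) - v_x(S) \Bigr),
\qquad
v_x(S) = \mathbb{E}\bigl[ f(X) \mid X_S = x_S \bigr].
```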
Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach
We examine counterfactual explanations for explaining the decisions made by
model-based AI systems. The counterfactual approach we consider defines an
explanation as a set of the system's data inputs that causally drives the
decision (i.e., changing the inputs in the set changes the decision) and is
irreducible (i.e., changing any subset of the inputs does not change the
decision). We (1) demonstrate how this framework may be used to provide
explanations for decisions made by general, data-driven AI systems that may
incorporate features with arbitrary data types and multiple predictive models,
and (2) propose a heuristic procedure to find the most useful explanations
depending on the context. We then contrast counterfactual explanations with
methods that explain model predictions by weighting features according to their
importance (e.g., SHAP, LIME) and present two fundamental reasons why we should
carefully consider whether importance-weight explanations are well-suited to
explain system decisions. Specifically, we show that (i) features that have a
large importance weight for a model prediction may not affect the corresponding
decision, and (ii) importance weights are insufficient to communicate whether
and how features influence decisions. We demonstrate this with several concise
examples and three detailed case studies that compare the counterfactual
approach with SHAP to illustrate various conditions under which counterfactual
explanations explain data-driven decisions better than importance weights.
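The two defining conditions above (changing the set flips the decision; changing any proper subset does not) translate directly into a brute-force search, sketched below. This is an illustration of the definition only, not the paper's heuristic procedure; the function name, the dict-based feature interface, and the single baseline replacement value per feature are simplifying assumptions made here.

```python
from itertools import combinations

def counterfactual_explanations(predict, x, baseline):
    """Enumerate counterfactual explanations in the sense above: subsets of
    inputs whose replacement flips the decision (causal) and such that no
    proper subset of them does (irreducible). `predict` maps a feature dict
    to a decision label, `x` is the observed instance, and `baseline` gives a
    hypothetical replacement value per feature. Exponential in the number of
    features, so this is a specification rather than a practical search."""
    original = predict(x)
    flipping = []
    for size in range(1, len(x) + 1):
        for subset in map(frozenset, combinations(x, size)):
            altered = {f: (baseline[f] if f in subset else v) for f, v in x.items()}
            if predict(altered) != original:
                flipping.append(subset)
    # irreducible: no strictly smaller flipping set is contained in it
    return [s for s in flipping if not any(t < s for t in flipping)]

# Toy usage: a loan decision driven by income and debt; age never matters.
decide = lambda a: "approve" if a["income"] - a["debt"] > 50 else "deny"
applicant = {"income": 120, "debt": 40, "age": 35}
zeros = {"income": 0, "debt": 0, "age": 0}
print(counterfactual_explanations(decide, applicant, zeros))
# -> [frozenset({'income'})]: lowering income alone flips the decision
```

This framing makes the abstract's point (i) concrete: a feature with a large importance weight for the model's score may still not flip the decision when changed, and so may be absent from every counterfactual explanation.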
The Grammar of Interactive Explanatory Model Analysis
The growing need for in-depth analysis of predictive models leads to a series
of new methods for explaining their local and global properties. Which of these
methods is the best? It turns out that this is an ill-posed question. One
cannot sufficiently explain a black-box machine learning model using a single
method that gives only one perspective. Isolated explanations are prone to
misunderstanding, which inevitably leads to wrong or simplistic reasoning. This
problem is known as the Rashomon effect and refers to diverse, even
contradictory interpretations of the same phenomenon. Surprisingly, the
majority of methods developed for explainable machine learning focus on a
single aspect of the model behavior. In contrast, we showcase the problem of
explainability as an interactive and sequential analysis of a model. This paper
presents how different Explanatory Model Analysis (EMA) methods complement each
other and why it is essential to juxtapose them together. The introduced
process of Interactive EMA (IEMA) derives from the algorithmic side of
explainable machine learning and aims to embrace ideas developed in cognitive
sciences. We formalize the grammar of IEMA to describe potential human-model
dialogues. IEMA is implemented in a human-centered framework that adopts
interactivity, customizability and automation as its main traits. Combined,
these methods enhance a responsible approach to predictive modeling.
Comment: 17 pages, 10 figures, 3 tables
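To make the idea of juxtaposing complementary explanations concrete, the sketch below pairs one global view (permutation feature importance) with one local what-if view (a ceteris-paribus profile for a single observation) on a scikit-learn model. The dataset, the model, and the use of plain scikit-learn are illustrative assumptions made here; the framework referenced in the abstract orchestrates a longer, interactive sequence of such steps.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Placeholder data and model; any fitted classifier would do.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Global perspective: which features matter on average for this model?
imp = permutation_importance(model, X, y, n_repeats=5, random_state=0)
top = X.columns[np.argsort(imp.importances_mean)[::-1][:3]]
print("globally important features:", list(top))

# Local perspective: how does the prediction for one observation respond to
# changing a single important feature while everything else is held fixed?
obs = X.iloc[[0]].copy()
feature = top[0]
grid = np.linspace(X[feature].min(), X[feature].max(), 20)
profile = []
for value in grid:
    what_if = obs.copy()
    what_if[feature] = value
    profile.append(model.predict_proba(what_if)[0, 1])
print(f"ceteris-paribus profile for '{feature}':", np.round(profile, 3))
```

Read together, the two views answer different questions (what drives the model overall versus what would change this particular prediction), which is the kind of complementarity the IEMA process sequences interactively.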