Measurable counterfactual local explanations for any classifier
We propose a novel method for explaining the predictions of any classifier. In our approach, local explanations are expected to explain both the outcome of a prediction and how that prediction would change if 'things had been different'. Furthermore, we argue that satisfactory explanations cannot be dissociated from a notion and measure of fidelity, as advocated in the early days of neural networks' knowledge extraction. We introduce a definition of fidelity to the underlying classifier for local explanation models which is based on distances to a target decision boundary. A system called CLEAR (Counterfactual Local Explanations via Regression) is introduced and evaluated. CLEAR generates b-counterfactual explanations that state minimum changes necessary to flip a prediction's classification. CLEAR then builds local regression models, using the b-counterfactuals to measure and improve the fidelity of its regressions. By contrast, the popular LIME method [17], which also uses regression to generate local explanations, neither measures its own fidelity nor generates counterfactuals. CLEAR's regressions are found to have significantly higher fidelity than LIME's, averaging over 40% higher in this paper's five case studies.
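A minimal sketch of the idea described above, not the authors' implementation: find a b-counterfactual by perturbing one feature of a scikit-learn style binary classifier until its prediction flips, fit a local regression around the instance, and measure fidelity as the gap between the regression's implied flip point and the actual one. The names `clf`, `x`, the step size, and the sampling scale are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge

def b_counterfactual(clf, x, feature, step=0.01, max_steps=1000):
    """Increase one feature until the predicted class flips; return the flip value."""
    original = clf.predict(x.reshape(1, -1))[0]
    xc = x.copy()
    for _ in range(max_steps):
        xc[feature] += step
        if clf.predict(xc.reshape(1, -1))[0] != original:
            return xc[feature]
    return None  # no flip found within the search range

def local_fidelity(clf, x, feature, n_samples=500, scale=0.5):
    """Fit a local regression around x and compare its implied flip point
    with the classifier's actual b-counterfactual (a CLEAR-style fidelity error)."""
    rng = np.random.default_rng(0)
    X_local = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y_local = clf.predict_proba(X_local)[:, 1]
    reg = Ridge().fit(X_local, y_local)

    actual = b_counterfactual(clf, x, feature)
    if actual is None:
        return None
    # Feature value at which the *regression* crosses p = 0.5, holding the
    # other features at x (solve w.x' + b = 0.5 for one coordinate).
    w, b = reg.coef_, reg.intercept_
    others = np.dot(w, x) - w[feature] * x[feature]
    estimated = (0.5 - b - others) / w[feature]
    return abs(estimated - actual)  # smaller = higher fidelity
```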
Actionable Recourse in Linear Classification
Machine learning models are increasingly used to automate decisions that affect humans - deciding who should receive a loan, a job interview, or a social service. In such applications, a person should have the ability to change the decision of a model. When a person is denied a loan by a credit score, for example, they should be able to alter its input variables in a way that guarantees approval. Otherwise, they will be denied the loan as long as the model is deployed. More importantly, they will lack the ability to influence a decision that affects their livelihood.
In this paper, we frame these issues in terms of recourse, which we define as the ability of a person to change the decision of a model by altering actionable input variables (e.g., income vs. age or marital status). We present integer programming tools to ensure recourse in linear classification problems without interfering in model development. We demonstrate how our tools can inform stakeholders through experiments on credit scoring problems. Our results show that recourse can be significantly affected by standard practices in model development, and motivate the need to evaluate recourse in practice.
Comment: Extended version. ACM Conference on Fairness, Accountability and Transparency [FAT2019]
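A rough sketch of the recourse problem for a linear classifier, using a continuous (LP) relaxation rather than the paper's actual integer programming formulation: find the least-cost non-negative change to actionable features that flips a denial into an approval. The classifier weights `w`, `b`, the applicant `x`, and the per-feature costs and limits are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def minimal_recourse(w, b, x, cost, actionable, upper):
    """Least-cost change `a` to actionable features such that w.(x + a) + b >= 0."""
    n = len(x)
    # a_j is fixed to 0 for immutable features (e.g., age, marital status).
    bounds = [(0.0, upper[j]) if actionable[j] else (0.0, 0.0) for j in range(n)]
    # Constraint w.a >= -(w.x + b), rewritten for linprog as -w.a <= w.x + b.
    res = linprog(c=cost, A_ub=[-np.asarray(w)], b_ub=[np.dot(w, x) + b],
                  bounds=bounds, method="highs")
    return res.x if res.success else None  # None: no recourse within limits

# Example: income is actionable, age is not.
w, b = np.array([0.8, 0.1]), -5.0            # score = 0.8*income + 0.1*age - 5
x = np.array([2.0, 10.0])                    # currently denied (score < 0)
action = minimal_recourse(w, b, x, cost=[1.0, 1.0],
                          actionable=[True, False], upper=[10.0, 0.0])
print(action)                                # required increase per feature
```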
Right for the Wrong Reason: Can Interpretable ML Techniques Detect Spurious Correlations?
While deep neural network models offer unmatched classification performance, they are prone to learning spurious correlations in the data. Such dependencies on confounding information can be difficult to detect using performance metrics if the test data comes from the same distribution as the training data. Interpretable ML methods such as post-hoc explanations or inherently interpretable classifiers promise to identify faulty model reasoning. However, there is mixed evidence whether many of these techniques are actually able to do so. In this paper, we propose a rigorous evaluation strategy to assess an explanation technique's ability to correctly identify spurious correlations. Using this strategy, we evaluate five post-hoc explanation techniques and one inherently interpretable method for their ability to detect three types of artificially added confounders in a chest x-ray diagnosis task. We find that the post-hoc technique SHAP, as well as the inherently interpretable Attri-Net, provide the best performance and can be used to reliably identify faulty model behavior.
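An illustrative sketch of this kind of evaluation, not the paper's exact protocol: inject a synthetic confounder (here a small bright tag) into positive-class images, retrain, and score how much of an explanation's attribution mass lands on the confounder region. The `explain` call is a hypothetical stand-in for any saliency method returning a per-pixel map.

```python
import numpy as np

def add_tag_confounder(img, size=8, value=1.0):
    """Paste a small bright square (the artificial confounder) into the image corner."""
    out = img.copy()
    out[:size, :size] = value
    return out

def confounder_attribution_ratio(attribution, size=8):
    """Fraction of absolute attribution mass on the confounder region.
    A method that reliably detects the shortcut should score high on models
    trained with the confounder and low otherwise."""
    a = np.abs(attribution)
    return a[:size, :size].sum() / (a.sum() + 1e-12)

# Usage outline (pseudo-data): confound the positives, retrain, then evaluate.
# x_pos = np.stack([add_tag_confounder(im) for im in x_pos])
# attribution = explain(model, x_pos[0])   # e.g. a SHAP or Grad-CAM map
# print(confounder_attribution_ratio(attribution))
```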
Inherently Interpretable Multi-Label Classification Using Class-Specific Counterfactuals
Interpretability is essential for machine learning algorithms in high-stakes application fields such as medical image analysis. However, high-performing black-box neural networks do not provide explanations for their predictions, which can lead to mistrust and suboptimal human-ML collaboration. Post-hoc explanation techniques, which are widely used in practice, have been shown to suffer from severe conceptual problems. Furthermore, as we show in this paper, current explanation techniques do not perform adequately in the multi-label scenario, in which multiple medical findings may co-occur in a single image. We propose Attri-Net, an inherently interpretable model for multi-label classification. Attri-Net is a powerful classifier that provides transparent, trustworthy, and human-understandable explanations. The model first generates class-specific attribution maps based on counterfactuals to identify which image regions correspond to certain medical findings. Then a simple logistic regression classifier is used to make predictions based solely on these attribution maps. We compare Attri-Net to five post-hoc explanation techniques and one inherently interpretable classifier on three chest X-ray datasets. We find that Attri-Net produces high-quality multi-label explanations consistent with clinical knowledge and has comparable classification performance to state-of-the-art classification models.
Comment: Accepted to MIDL 202
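A rough sketch of the transparent second stage described in the abstract: a per-class logistic regression that predicts each label only from its class-specific attribution map. The counterfactual-based map generator itself (the core of Attri-Net) is abstracted away here as a hypothetical input `attribution_maps`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_transparent_head(attribution_maps, labels):
    """attribution_maps: (N, H, W) maps for ONE finding; labels: (N,) binary.
    Fits a logistic regression on the flattened maps, so the final decision
    is a readable weighted sum over map regions."""
    X = attribution_maps.reshape(len(attribution_maps), -1)  # flatten each map
    return LogisticRegression(max_iter=1000).fit(X, labels)

# In the multi-label setting one such head is trained per finding, so every
# prediction can be traced back to the regions of its own attribution map.
```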
On the Rationality of Explanations in Classification Algorithms
This paper is a first step towards studying the rationality of explanations produced by up-to-date AI systems. Based on the thesis that designing rational explanations for accomplishing trustworthy AI is fundamental for ethics in AI, we study the rationality criteria that explanations in classification algorithms have to meet. In this way, we identify, define, and exemplify characteristic criteria of rational explanations in classification algorithms.
To trust or not to trust an explanation: using LEAF to evaluate local linear XAI methods
The main objective of eXplainable Artificial Intelligence (XAI) is to provide effective explanations for black-box classifiers. The existing literature lists many desirable properties for explanations to be useful, but there is no consensus on how to quantitatively evaluate explanations in practice. Moreover, explanations are typically used only to inspect black-box models, and the proactive use of explanations as a decision support is generally overlooked. Among the many approaches to XAI, a widely adopted paradigm is Local Linear Explanations - with LIME and SHAP emerging as state-of-the-art methods. We show that these methods are plagued by many defects including unstable explanations, divergence of actual implementations from the promised theoretical properties, and explanations for the wrong label. This highlights the need to have standard and unbiased evaluation procedures for Local Linear Explanations in the XAI field. In this paper we address the problem of identifying a clear and unambiguous set of metrics for the evaluation of Local Linear Explanations. This set includes both existing and novel metrics defined specifically for this class of explanations. All metrics have been included in an open Python framework, named LEAF. The purpose of LEAF is to provide a reference for end users to evaluate explanations in a standardised and unbiased way, and to guide researchers towards developing improved explainable techniques.
Comment: 16 pages, 8 figures
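A minimal sketch of one metric of the kind the abstract motivates, a stability check, under the assumption that `explain_fn(x, k)` is any wrapper around a local linear explainer (LIME, SHAP, ...) that returns the indices of the k most important features for instance x. Unstable explainers return different top-k sets across repeated runs on the same instance.

```python
import numpy as np

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def stability(explain_fn, x, k=5, runs=10):
    """Mean pairwise Jaccard similarity of the top-k feature sets obtained by
    re-running the explainer on the same instance; 1.0 = perfectly stable."""
    tops = [explain_fn(x, k) for _ in range(runs)]
    scores = [jaccard(tops[i], tops[j])
              for i in range(runs) for j in range(i + 1, runs)]
    return float(np.mean(scores))
```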