3 research outputs found
Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables
In recent years, the community of 'explainable artificial intelligence' (XAI)
has created a vast body of methods to bridge a perceived gap between model
'complexity' and 'interpretability'. However, a concrete problem to be solved
by XAI methods has not yet been formally stated. As a result, XAI methods are
lacking theoretical and empirical evidence for the 'correctness' of their
explanations, limiting their potential use for quality-control and transparency
purposes. At the same time, Haufe et al. (2014) showed, using simple toy
examples, that even standard interpretations of linear models can be highly
misleading. Specifically, high importance may be attributed to so-called
suppressor variables lacking any statistical relation to the prediction target.
This behavior has been confirmed empirically for a large array of XAI methods
in Wilming et al. (2022). Here, we go one step further by deriving analytical
expressions for the behavior of a variety of popular XAI methods on a simple
two-dimensional binary classification problem involving Gaussian
class-conditional distributions. We show that the majority of the studied
approaches will attribute non-zero importance to a non-class-related suppressor
feature in the presence of correlated noise. This poses important limitations
on the interpretations and conclusions that the outputs of these XAI methods
can afford.Comment: Accepted at ICML 202