How good Neural Networks interpretation methods really are? A quantitative benchmark
Saliency Maps (SMs) have been extensively used to interpret deep learning
models' decisions by highlighting the features deemed relevant by the model. They
are used on highly nonlinear problems, where linear feature selection (FS)
methods fail at highlighting relevant explanatory variables. However, the
reliability of gradient-based feature attribution methods such as SMs has mostly
been assessed only qualitatively (visually), and quantitative benchmarks are
currently missing, partially due to the lack of a definite ground truth on
image data. Concerned about the apophenic biases introduced by visual
assessment of these methods, in this paper we propose a synthetic quantitative
benchmark for Neural Network (NN) interpretation methods. For this purpose,
we built synthetic datasets with nonlinearly separable classes and an increasing
number of decoy (random) features, illustrating the challenge of FS in
high-dimensional settings. We also compare these methods to conventional
approaches such as mRMR or Random Forests. Our results show that our simple
synthetic datasets are sufficient to challenge most of the benchmarked methods.
TreeShap, mRMR and LassoNet are the best-performing FS methods. We also show
that, when quantifying the relevance of a few nonlinearly entangled predictive
features diluted in a large number of irrelevant noisy variables, neural
network-based FS and interpretation methods are still far from being reliable.
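
To make the benchmark idea concrete, here is a minimal sketch of a synthetic dataset with nonlinearly separable classes and decoy features, scored against a known ground truth. It is an illustration under stated assumptions, not the authors' construction: the XOR-like labeling rule, the uniform sampling, the function names (`make_xor_dataset`, `abs_correlation`, `top_k_precision`) and the top-k precision metric are all hypothetical choices made here, since the abstract does not specify the datasets or metrics used.

```python
import random

def make_xor_dataset(n_samples=500, n_decoys=20, seed=0):
    """Build an XOR-like dataset (assumed construction, not the paper's).

    Features 0 and 1 are relevant: the class is 1 when they share a sign.
    The remaining n_decoys features are pure noise (decoys).
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.uniform(-1.0, 1.0) for _ in range(2 + n_decoys)]
        X.append(row)
        y.append(1 if row[0] * row[1] > 0 else 0)
    return X, y

def abs_correlation(xs, ys):
    """Absolute Pearson correlation, used here as a simple linear relevance score."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5

def top_k_precision(scores, relevant, k):
    """Fraction of the k highest-scored features that are truly relevant."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sum(1 for j in ranked[:k] if j in relevant) / k

X, y = make_xor_dataset()
relevant = {0, 1}  # ground truth is known because the data is synthetic

# Score every feature with the linear criterion and compare to ground truth.
linear_scores = [
    abs_correlation([row[j] for row in X], y) for j in range(len(X[0]))
]
linear_precision = top_k_precision(linear_scores, relevant, k=2)
print("top-2 precision of the linear score:", linear_precision)
```

In this construction each relevant feature is marginally uncorrelated with the label, so a univariate linear score cannot reliably separate it from the decoys. Any attribution method that returns one score per feature can be plugged into `top_k_precision` in the same way, and raising `n_decoys` reproduces the "increasing number of decoy features" setting.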