In this paper, two new learning-based eXplainable AI (XAI) methods for deep
convolutional neural network (DCNN) image classifiers, called L-CAM-Fm and
L-CAM-Img, are proposed. Both methods use an attention mechanism that is
inserted in the original (frozen) DCNN and is trained to derive class
activation maps (CAMs) from the last convolutional layer's feature maps. During
training, CAMs are applied to the feature maps (L-CAM-Fm) or the input image
(L-CAM-Img), forcing the attention mechanism to learn the image regions
explaining the DCNN's outcome. Experimental evaluation on ImageNet shows that
the proposed methods achieve competitive results while requiring a single
forward pass at the inference stage. Moreover, based on the derived
explanations, a comprehensive qualitative analysis is performed, providing
valuable insight for understanding the reasons behind classification errors,
including possible dataset biases affecting the trained classifier.

Comment: Accepted for publication; to be included in Proc. ECCV Workshops
2022. The version posted here is the "submitted manuscript" version.
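
As a rough illustration of the mechanism summarized in the abstract, the following minimal NumPy sketch derives class-specific maps from last-layer feature maps and applies one of them either to the feature maps (as in L-CAM-Fm) or to the input image (as in L-CAM-Img). The abstract does not specify the form of the attention mechanism, so the per-class linear projection followed by a sigmoid, the nearest-neighbour upsampling, and all function names and array shapes here are assumptions made for illustration only. Training of the attention weights against the frozen classifier is not shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_cams(features, weights, bias):
    """Map feature maps (K, H, W) to one map per class, shape (C, H, W).

    Assumption: the attention mechanism is a per-pixel linear projection
    (equivalent to a 1x1 convolution) followed by a sigmoid, so every
    value lies in (0, 1). `weights` has shape (C, K), `bias` has shape (C,).
    """
    logits = np.einsum("ck,khw->chw", weights, features)
    logits = logits + bias[:, None, None]
    return sigmoid(logits)


def apply_to_features(features, cam):
    """L-CAM-Fm style: scale every feature channel by the (H, W) map."""
    return features * cam[None, :, :]


def apply_to_image(image, cam):
    """L-CAM-Img style: upsample the (H, W) map and mask the image.

    Assumption: nearest-neighbour upsampling, chosen here for brevity.
    `image` has shape (channels, H_img, W_img).
    """
    _, h_img, w_img = image.shape
    h, w = cam.shape
    rows = np.arange(h_img) * h // h_img  # source row for each image row
    cols = np.arange(w_img) * w // w_img  # source column for each image column
    upsampled = cam[rows][:, cols]
    return image * upsampled[None, :, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((8, 7, 7))    # hypothetical K=8 feature maps
    weights = rng.standard_normal((5, 8)) * 0.1  # hypothetical C=5 classes
    bias = np.zeros(5)
    image = rng.random((3, 28, 28))

    cams = attention_cams(features, weights, bias)
    cam = cams[2]  # map for one chosen class index
    print(apply_to_features(features, cam).shape)
    print(apply_to_image(image, cam).shape)
```

At inference time, only `attention_cams` would be needed to obtain an explanation, which is consistent with the single forward pass mentioned in the abstract.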