Robustness of Visual Explanations to Common Data Augmentation
As the use of deep neural networks continues to grow, understanding their
behaviour has become more crucial than ever. Post-hoc explainability methods
are a potential solution, but their reliability is being called into question.
Our research investigates the response of post-hoc visual explanations to
naturally occurring transformations, often referred to as augmentations. We
expect explanations to be invariant under certain transformations, such as
changes to the colour map, while responding equivariantly to transformations
like translation, object scaling, and rotation. We have found
remarkable differences in robustness depending on the type of transformation,
with some explainability methods (such as LRP composites and Guided Backprop)
being more stable than others. We also explore the role of training with data
augmentation. We provide evidence that explanations are typically less robust
to augmentation than classification performance, regardless of whether data
augmentation is used in training.
Comment: Accepted to The 2nd Explainable AI for Computer Vision (XAI4CV) Workshop at CVPR 2023
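
As a concrete illustration of the invariance/equivariance criterion described in this abstract, the sketch below compares the explanation of a clean input with the back-transformed explanation of an augmented input. The `explain` function, the translation amount, and the dummy attribution map are illustrative assumptions, not the paper's actual attribution methods or evaluation protocol.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened saliency maps."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def equivariance_score(x, explain, transform, inverse_transform):
    """How well an explanation 'follows' a spatial transformation of the input.

    explain           : input -> saliency map (stands in for e.g. LRP or Guided Backprop)
    transform         : spatial augmentation applied to the input (e.g. translation)
    inverse_transform : the matching inverse, applied to the resulting saliency map
    """
    e_clean = explain(x)                   # explanation of the original input
    e_aug = explain(transform(x))          # explanation of the augmented input
    e_aligned = inverse_transform(e_aug)   # map it back into the original frame
    return cosine_similarity(e_clean, e_aligned)

# Toy usage with a circular horizontal translation on an (H, W) image.
shift = 8
translate = lambda img: np.roll(img, shift, axis=1)
untranslate = lambda sal: np.roll(sal, -shift, axis=1)

# Placeholder attribution: any real method (LRP, Guided Backprop, ...) goes here.
dummy_explain = lambda img: np.abs(img - img.mean())

x = np.random.rand(64, 64)
print(equivariance_score(x, dummy_explain, translate, untranslate))
```

For transformations where invariance rather than equivariance is expected (e.g., a colour-map change), `inverse_transform` is simply the identity.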
On convex conceptual regions in deep network representations
The current study of human-machine alignment aims to understand the
geometry of latent spaces and their correspondence to human representations.
G\"ardenfors' conceptual spaces is a prominent framework for understanding
human representations. Convexity of object regions in conceptual spaces is
argued to promote generalizability, few-shot learning, and intersubject
alignment. Based on these insights, we investigate the notion of convexity of
concept regions in machine-learned latent spaces. We develop a set of tools for
measuring convexity in sampled data and evaluate emergent convexity in layered
representations of state-of-the-art deep networks. We show that convexity is
robust to basic re-parametrization and hence meaningful as a quality of
machine-learned latent spaces. We find that approximate convexity is pervasive
in neural representations in multiple application domains, including models of
images, audio, human activity, text, and brain data. We measure convexity
separately for labels (i.e., targets for fine-tuning) and other concepts.
Generally, we observe that fine-tuning increases the convexity of label
regions, while for more general concepts, it depends on the alignment of the
concept with the fine-tuning objective. We find evidence that pre-training
convexity of class label regions predicts subsequent fine-tuning performance.
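
To make the idea of measuring convexity in sampled latent vectors concrete, here is a minimal sketch of a Euclidean convexity proxy: sample pairs of same-label points, walk along the straight segment between them, and check how often the nearest labelled point of each intermediate sample carries the same label. The function and toy data are illustrative assumptions; the paper's own (e.g. graph-based) measures may differ.

```python
import numpy as np

def euclidean_convexity_score(Z, labels, target, n_pairs=200, n_steps=5, seed=None):
    """Rough Euclidean convexity proxy for one concept region in latent space.

    A score near 1 suggests that points between same-label samples still fall
    inside that label's region, i.e. the region is approximately convex.
    """
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(labels == target)       # indices of the target concept
    hits, total = 0, 0
    for _ in range(n_pairs):
        i, j = rng.choice(idx, size=2, replace=False)
        for t in np.linspace(0.0, 1.0, n_steps + 2)[1:-1]:   # interior points only
            z = (1 - t) * Z[i] + t * Z[j]                    # point on the segment
            nearest = np.argmin(np.linalg.norm(Z - z, axis=1))
            hits += labels[nearest] == target
            total += 1
    return hits / total

# Toy usage: random 2-D "latent" vectors with a half-plane label, which is a
# convex region, so the score should be close to 1.
Z = np.random.randn(500, 2)
labels = (Z[:, 0] > 0).astype(int)
print(euclidean_convexity_score(Z, labels, target=1, n_pairs=100, seed=0))
```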
Image classification with symbolic hints using limited resources
Typical machine learning classification benchmarks often ignore the full input data structures present in real-world classification problems. Here we aim to represent such additional information as "hints" for classification. We show that, under a specific and realistic conditional independence assumption, the hint information can be incorporated by late fusion. In two experiments involving image classification with hints taking the form of text metadata, we demonstrate the feasibility and performance of the fusion scheme. We fuse the output of pre-trained image classifiers with the output of pre-trained text models. We show that calibration of the pre-trained models is crucial for the performance of the fused model. We compare the late fusion scheme with a mid-level fusion scheme based on support vector machines and find that the two methods tend to perform quite similarly, although the late fusion scheme incurs only negligible computational cost.
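
Under the conditional independence assumption mentioned in the abstract, Bayes' rule reduces late fusion to a product of the two calibrated posteriors divided by the class prior. The sketch below shows this fusion rule on made-up probability vectors; the arrays, the uniform prior, and the assumption that both models are already calibrated are illustrative, not the paper's experimental setup.

```python
import numpy as np

def late_fusion(p_image, p_text, prior):
    """Fuse two calibrated posteriors under conditional independence.

    With p(image, text | y) = p(image | y) * p(text | y), Bayes' rule gives
        p(y | image, text)  proportional to  p(y | image) * p(y | text) / p(y).
    All inputs are arrays over the same label set; the output is renormalised.
    """
    fused = p_image * p_text / prior
    return fused / fused.sum(axis=-1, keepdims=True)

# Toy example with three classes: the image model is unsure between
# classes 0 and 1, while the text "hint" favours class 1.
p_image = np.array([0.45, 0.45, 0.10])
p_text  = np.array([0.20, 0.60, 0.20])
prior   = np.array([1/3, 1/3, 1/3])
print(late_fusion(p_image, p_text, prior))   # puts most of the mass on class 1
```

The fusion itself is just an element-wise product and a normalisation, which is why its computational cost is negligible compared with running the pre-trained image and text models.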