Towards Visually Explaining Variational Autoencoders
Recent advances in Convolutional Neural Network (CNN) model interpretability
have led to impressive progress in visualizing and understanding model
predictions. In particular, gradient-based visual attention methods have driven
much recent effort in using visual attention maps as a means for visual
explanations. A key problem, however, is that these methods are designed for
classification and categorization tasks, and their extension to explaining
generative models, e.g., variational autoencoders (VAEs), is not trivial. In this
work, we take a step towards bridging this crucial gap, proposing the first
technique to visually explain VAEs by means of gradient-based attention. We
present methods to generate visual attention from the learned latent space, and
also demonstrate that such attention explanations serve more than just explaining
VAE predictions. We show how these attention maps can be used to localize
anomalies in images, demonstrating state-of-the-art performance on the MVTec-AD
dataset. We also show how they can be infused into model training, helping
bootstrap the VAE into learning improved latent space disentanglement,
demonstrated on the dSprites dataset.
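
The abstract does not spell out how the attention maps are computed. As a minimal sketch, assuming a Grad-CAM-style weighting (an assumption, not a confirmed detail of the method), the gradient of one latent coordinate with respect to each convolutional feature map can be averaged into a channel weight, and the weighted feature maps summed and passed through a ReLU. The function name `attention_map` and the toy values are hypothetical.

```python
def attention_map(feature_maps, gradients):
    """Combine K feature maps (each H x W) into one H x W attention map.

    feature_maps[k][i][j]: activation of channel k at pixel (i, j).
    gradients[k][i][j]: gradient of one latent coordinate with respect
    to that activation (assumed to be supplied by an autodiff framework).
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    # Channel weight: average of the gradient over all spatial positions.
    weights = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    return [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, feature_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]


# Toy input: two 2 x 2 feature maps and their (made-up) gradients.
features = [
    [[1.0, 0.0], [0.0, 2.0]],  # channel 0
    [[0.0, 3.0], [1.0, 0.0]],  # channel 1
]
grads = [
    [[0.5, 0.5], [0.5, 0.5]],      # averages to 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # averages to -1.0
]
heatmap = attention_map(features, grads)
print(heatmap)
```

In this toy case channel 1 receives a negative weight, so the pixels where it dominates are zeroed by the ReLU and only the pixels driven by channel 0 remain highlighted.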
Attribute disentanglement with gradient reversal for interactive fashion retrieval
Interactive fashion search is attracting growing interest thanks to the rapid spread of online retailers. It allows users to browse fashion items and perform attribute manipulations, modifying parts or details of given garments. To model and analyze garments at such a fine-grained level, it is necessary to obtain attribute-wise representations that separate information related to different characteristics. In this work we propose an attribute disentanglement method based on attribute classifiers and gradient reversal layers. This combination allows us to learn attribute-specific features, removing unwanted details from each representation. We test the effectiveness of our learned features in a fashion attribute manipulation task, obtaining state-of-the-art results. Furthermore, to improve training stability, we present a novel loss balancing approach that prevents reversed losses from diverging during the optimization process.
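
A gradient reversal layer, in its usual formulation, is the identity in the forward pass and multiplies incoming gradients by a negative factor in the backward pass, so that the layers below it are pushed to make the classifier above it worse. The sketch below illustrates only that mechanism in plain Python; the class name `GradientReversal`, the scale `lam`, and the toy values are hypothetical, and how the layer is wired to the attribute classifiers is not specified in the abstract.

```python
class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # reversal strength (hypothetical hyperparameter)

    def forward(self, x):
        # Features pass through unchanged.
        return list(x)

    def backward(self, grad_output):
        # Flip the sign of the gradient so the layers below move
        # in the direction that increases the downstream loss.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = [1.0, -2.0, 3.0]
out = grl.forward(features)             # same values as features
upstream = [0.2, 0.4, -0.6]             # made-up gradient from a classifier loss
reversed_grad = grl.backward(upstream)  # each entry multiplied by -0.5
print(out, reversed_grad)
```

In an autodiff framework the same behavior would normally be written as a custom function with an overridden backward pass rather than as an explicit `backward` call.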