Towards Visually Explaining Variational Autoencoders
Recent advances in Convolutional Neural Network (CNN) model interpretability
have led to impressive progress in visualizing and understanding model
predictions. In particular, gradient-based visual attention methods have driven
much recent effort in using visual attention maps as a means for visual
explanations. A key problem, however, is that these methods are designed for
classification and categorization tasks, and their extension to explaining
generative models, e.g., variational autoencoders (VAEs), is not trivial. In this
work, we take a step towards bridging this crucial gap, proposing the first
technique to visually explain VAEs by means of gradient-based attention. We
present methods to generate visual attention from the learned latent space, and
also demonstrate that such attention explanations serve more than just
explaining VAE predictions. We show how these attention maps can be used to localize
anomalies in images, demonstrating state-of-the-art performance on the MVTec-AD
dataset. We also show how they can be infused into model training, helping
bootstrap the VAE into learning improved latent space disentanglement,
demonstrated on the dSprites dataset.
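The latent-space attention idea can be illustrated with a minimal Grad-CAM-style sketch. A toy linear encoder head stands in for the real network so that the gradient of each latent mean with respect to the feature maps is available in closed form; the sizes, the head `Wz`, and the pooled-gradient weighting are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": feature maps A (C, H, W) -> latent means mu via a linear head,
# so the gradient d mu_k / d A is available in closed form.
C, H, W, K = 4, 8, 8, 2          # channels, spatial dims, latent dims (toy sizes)
A = rng.random((C, H, W))        # pretend conv feature maps for one image
Wz = rng.normal(size=(K, C))     # linear head: mu_k = sum_c Wz[k, c] * mean(A[c])

mu = Wz @ A.reshape(C, -1).mean(axis=1)   # latent means, shape (K,)

# Grad-CAM-style attention: weight each channel by the spatially pooled
# gradient of a latent unit w.r.t. that channel, then ReLU the weighted sum.
# For this linear head, d mu_k / d A[c, i, j] = Wz[k, c] / (H * W).
def latent_attention(k):
    alpha = Wz[k] / (H * W)                  # pooled gradients, one per channel
    cam = np.tensordot(alpha, A, axes=1)     # weighted channel sum, shape (H, W)
    return np.maximum(cam, 0.0)              # keep positively contributing regions

att = latent_attention(0)
print(att.shape)  # (8, 8)
```

In the anomaly-localization use case described above, such a map would be thresholded to highlight image regions that drive a latent unit.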
Bayesian Prompt Learning for Image-Language Model Generalization
Foundational image-language models have generated considerable interest due to their efficient adaptation to downstream tasks by prompt learning. Prompt learning treats part of the language model input as trainable while freezing the rest, and optimizes an Empirical Risk Minimization objective. However, Empirical Risk Minimization is known to suffer from distributional shifts which hurt generalizability to prompts unseen during training. By leveraging the regularization ability of Bayesian methods, we frame prompt learning from the Bayesian perspective and formulate it as a variational inference problem. Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts. Our framework is implemented by modeling the input prompt space in a probabilistic manner, as an a priori distribution which makes our proposal compatible with prompt learning approaches that are unconditional or conditional on the image. We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space, prevents learning spurious features, and exploits transferable invariant features. This results in better generalization of unseen prompts, even across different datasets and domains. Code available at: https://github.com/saic-fi/Bayesian-Prompt-Learnin
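The variational treatment of the prompt space can be sketched with a minimal reparameterization example: the prompt embedding is a diagonal-Gaussian random variable regularized toward a standard-normal prior by a KL term. The dimension, the prior, and the diagonal posterior are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Variational prompt: instead of a single learned prompt vector, keep a
# Gaussian q(p) = N(mu, diag(sigma^2)) over the prompt embedding and
# regularize it toward a standard-normal prior with a KL term.
D = 16                                  # prompt embedding dim (toy size)
mu = rng.normal(scale=0.1, size=D)      # variational mean (trainable in practice)
log_sigma = np.full(D, -1.0)            # variational log-std (trainable in practice)

def sample_prompt():
    """Reparameterized sample p = mu + sigma * eps, so gradients reach mu, sigma."""
    eps = rng.standard_normal(D)
    return mu + np.exp(log_sigma) * eps

def kl_to_standard_normal():
    """KL( N(mu, sigma^2) || N(0, I) ): the prompt-space regularizer."""
    var = np.exp(2.0 * log_sigma)
    return 0.5 * np.sum(var + mu**2 - 1.0 - 2.0 * log_sigma)

p = sample_prompt()           # prompt fed to the frozen language encoder
kl = kl_to_standard_normal()  # added to the task loss to form the variational objective
print(p.shape, kl >= 0.0)
```

Sampling a fresh prompt per step is what spreads coverage over the prompt space instead of collapsing onto one seen prompt.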
Non-Autoregressive Diffusion-based Temporal Point Processes for Continuous-Time Long-Term Event Prediction
Continuous-time long-term event prediction plays an important role in many
application scenarios. Most existing works rely on autoregressive frameworks to
predict event sequences, which suffer from error accumulation, thus
compromising prediction quality. Inspired by the success of denoising diffusion
probabilistic models, we propose a diffusion-based non-autoregressive temporal
point process model for long-term event prediction in continuous time. Instead
of generating events one at a time in an autoregressive way, our model predicts
the future event sequence entirely as a whole. In order to perform diffusion
processes on event sequences, we develop a bidirectional map between target
event sequences and the Euclidean vector space. Furthermore, we design a novel
denoising network to capture both sequential and contextual features for better
sample quality. Extensive experiments are conducted to prove the superiority of
our proposed model over state-of-the-art methods on long-term event prediction
in continuous time. To the best of our knowledge, this is the first work to
apply diffusion methods to long-term event prediction problems.
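One simple instance of such a bidirectional map, sketched under the assumption that events are represented only by their strictly increasing arrival times: taking the log of inter-arrival gaps gives an unconstrained Euclidean vector (where diffusion noising and denoising can operate) with an exact inverse back to a valid time sequence. This is an illustrative construction, not necessarily the paper's map:

```python
import numpy as np

# Bidirectional map between a strictly increasing event-time sequence and an
# unconstrained Euclidean vector: log inter-arrival gaps, invertible exactly.
def times_to_vec(t, t0=0.0, eps=1e-8):
    gaps = np.diff(np.concatenate([[t0], t]))   # positive inter-arrival times
    return np.log(gaps + eps)                   # map (0, inf) -> R per coordinate

def vec_to_times(v, t0=0.0, eps=1e-8):
    gaps = np.exp(v) - eps                      # invert the log map
    return t0 + np.cumsum(gaps)                 # rebuild increasing times

t = np.array([0.5, 1.2, 3.0, 3.1])
v = times_to_vec(t)          # diffusion would add/remove noise in this space
t_back = vec_to_times(v)
print(np.allclose(t, t_back))  # round trip is (numerically) exact
```

Because the inverse always yields positive gaps, any denoised Euclidean vector decodes to a valid event sequence, which is what lets the whole future sequence be generated at once.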
Long-Term Anticipation of Activities with Cycle Consistency
With the success of deep learning methods in analyzing activities in videos,
more attention has recently been focused towards anticipating future
activities. However, most of the work on anticipation either analyzes a
partially observed activity or predicts the next action class. Recently, new
approaches have been proposed that extend the prediction horizon up to several
minutes into the future and anticipate a sequence of future activities,
including their durations. While these works decouple the semantic
interpretation of the observed sequence from the anticipation task, we propose
a framework for anticipating future activities directly from the features of
the observed frames and train it in an end-to-end fashion. Furthermore, we
introduce a cycle consistency loss over time by predicting the past activities
given the predicted future. Our framework achieves state-of-the-art results on
two datasets: the Breakfast dataset and 50Salads. Comment: GCPR 202
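The cycle-consistency idea above can be sketched as a combined loss: an anticipation term on the predicted future labels plus a reconstruction term that checks whether the observed past can be recovered from those predictions. Here toy random distributions stand in for the forward and backward models; nothing below is the paper's network:

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_entropy(probs, labels):
    """Mean negative log-probability of the true label under each prediction."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

n_past, n_future, n_cls = 5, 3, 4
past_labels = rng.integers(0, n_cls, n_past)      # observed activity classes
future_labels = rng.integers(0, n_cls, n_future)  # ground-truth future classes

# Stand-ins for model outputs: forward model predicts the future from observed
# features; backward model reconstructs the past from the predicted future.
future_probs = rng.dirichlet(np.ones(n_cls), size=n_future)
past_probs = rng.dirichlet(np.ones(n_cls), size=n_past)

loss = cross_entropy(future_probs, future_labels) \
     + cross_entropy(past_probs, past_labels)     # anticipation + cycle term
print(loss > 0.0)
```

The cycle term penalizes futures that are plausible in isolation but inconsistent with the observed past, which is the constraint the end-to-end training exploits.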
A Novel Approach for Learning Temporal Point Process
In this paper, we present a novel methodology for learning temporal point
processes based on one-dimensional numerical integration techniques. Numerical
integration is used to linearize the negative maximum likelihood (neML)
function and enable backpropagation of the neML derivative. The presented
approach is tested on a highway toll dataset. Moreover, four well-known point
process baseline models are compared: first- and second-order polynomial
inhomogeneous Poisson processes, and Hawkes processes with exponential and
Gaussian kernels. The results show that the choice of numerical integration
technique influences the quality of the obtained models.
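The integral term of the point-process negative log-likelihood is what the numerical integration targets: neML = integral of the intensity over the observation window minus the sum of log-intensities at the events. A minimal sketch using the trapezoidal rule on a constant-rate (homogeneous Poisson) example, where the integral is known exactly; the grid size and the example intensity are illustrative assumptions:

```python
import numpy as np

# Point-process negative log-likelihood on [0, T]:
#   neML = integral_0^T lam(t) dt - sum_i log(lam(t_i)),
# with the integral approximated by 1-D numerical integration
# (here the trapezoidal rule on a fixed grid).
def neg_log_likelihood(lam, events, T, n_grid=1000):
    grid = np.linspace(0.0, T, n_grid)
    vals = lam(grid)
    dt = grid[1] - grid[0]
    integral = dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))  # trapezoidal rule
    log_term = np.sum(np.log(lam(np.asarray(events, dtype=float))))
    return integral - log_term

# Homogeneous Poisson with rate 2 on [0, 10]: the integral term is exactly 20.
lam = lambda t: np.full_like(t, 2.0)
events = [1.0, 4.0, 7.5]
nll = neg_log_likelihood(lam, events, T=10.0)
print(round(nll, 4))  # 20 - 3*log(2), i.e. 17.9206
```

Swapping the trapezoidal rule for another quadrature rule (e.g. Simpson's) changes the approximation of the integral term, which is how different integration techniques end up affecting model quality.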