Jointly trained image and video generation using residual vectors
In this work, we propose a modeling technique for jointly training image and
video generation models by simultaneously learning to map latent variables with
a fixed prior onto real images and interpolate over images to generate videos.
The proposed approach models the variations in representations using residual
vectors encoding the change at each time step over a summary vector for the
entire video. We utilize the technique to jointly train an image generation
model with a fixed prior along with a video generation model lacking
constraints such as disentanglement. The joint training enables the image
generator to exploit temporal information while the video generation model
learns to flexibly share information across frames. Moreover, experimental
results verify our approach's compatibility with pre-training on videos or
images and training on datasets containing a mixture of both. A comprehensive
set of quantitative and qualitative evaluations reveals improvements in
sample quality and diversity over both video generation and image generation
baselines. We further demonstrate the technique's ability to exploit
similarity in features across frames by applying it to a model based on
decomposing the video into motion and content. The proposed model allows minor
variations in content across frames while maintaining the temporal dependence
through latent vectors encoding the pose or motion features.
Comment: Accepted in 2020 Winter Conference on Applications of Computer Vision
(WACV '20
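The residual-vector idea in the abstract above can be illustrated with a minimal sketch. It assumes the residual is simply added to the summary vector and that both are drawn from Gaussians; the abstract does not state either detail, and the names `frame_latents` and `sample_video_latents` are hypothetical, not taken from the paper.

```python
import random


def frame_latents(summary, residuals):
    """Combine one summary vector with per-frame residuals.

    Assumption: the combination is additive (summary + residual);
    the abstract only says residuals encode the change over the summary.
    """
    return [[s + r for s, r in zip(summary, res)] for res in residuals]


def sample_video_latents(dim, num_frames, residual_scale=0.1, seed=0):
    """Sample a summary vector and small per-frame residuals (illustrative priors)."""
    rng = random.Random(seed)
    # One summary vector for the whole video, drawn from a fixed prior.
    summary = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # One residual per time step; a small scale keeps frames close to the summary.
    residuals = [
        [rng.gauss(0.0, residual_scale) for _ in range(dim)]
        for _ in range(num_frames)
    ]
    return summary, residuals, frame_latents(summary, residuals)


summary, residuals, latents = sample_video_latents(dim=4, num_frames=3)
print(len(latents), len(latents[0]))
```

Under this reading, a video with all-zero residuals collapses to a single latent vector, which is one way an image generator and a video generator could share the same latent space.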
Unsupervised learning with contrastive latent variable models
In unsupervised learning, dimensionality reduction is an important tool for
data exploration and visualization. Because these aims are typically
open-ended, it can be useful to frame the problem as looking for patterns that
are enriched in one dataset relative to another. Such pairs of datasets occur
commonly, for instance, a population of interest vs. control, or signal vs.
signal-free recordings. However, there are few methods that work on sets of data
as opposed to data points or sequences. Here, we present a probabilistic model
for dimensionality reduction to discover signal that is enriched in the target
dataset relative to the background dataset. The data in these sets do not need
to be paired or grouped beyond set membership. By using a probabilistic model
where some structure is shared between the two datasets and some is unique to
the target dataset, we are able to recover interesting structure in the latent
space of the target dataset. The method also has the advantages of a
probabilistic model, namely that it allows for the incorporation of prior
information, handles missing data, and can be generalized to different
distributional assumptions. We describe several possible variations of the
model and demonstrate the application of the technique to de-noising, feature
selection, and subgroup discovery settings.
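The shared-versus-unique structure described in the abstract above can be sketched as a generative process. This sketch assumes a linear-Gaussian form, in which background points use only shared latent factors and target points add target-specific factors; the abstract describes several variations and does not fix one, so the function `sample_point` and its parameters are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sample_point(shared_loadings, target_loadings, rng, is_target, noise_std=0.1):
    """Draw one observation from an assumed linear-Gaussian contrastive model.

    Background: x = S z + noise
    Target:     x = S z + W t + noise
    where z are shared latents and t are target-only latents.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(len(shared_loadings[0]))]
    x = matvec(shared_loadings, z)
    if is_target:
        t = [rng.gauss(0.0, 1.0) for _ in range(len(target_loadings[0]))]
        x = [a + b for a, b in zip(x, matvec(target_loadings, t))]
    return [a + rng.gauss(0.0, noise_std) for a in x]


rng = random.Random(0)
S = [[1.0], [0.0]]  # shared structure lives on the first coordinate
W = [[0.0], [1.0]]  # target-only structure lives on the second coordinate
background = [sample_point(S, W, rng, is_target=False) for _ in range(5)]
target = [sample_point(S, W, rng, is_target=True) for _ in range(5)]
```

In this toy setup the second coordinate carries signal only in the target set, which is the kind of enriched structure the model is meant to isolate. Note that the two sets are sampled independently, so no pairing between target and background points is needed.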
Temporally Disentangled Representation Learning under Unknown Nonstationarity
In unsupervised causal representation learning for sequential data with
time-delayed latent causal influences, strong identifiability results for the
disentanglement of causally-related latent variables have been established in
stationary settings by leveraging temporal structure. However, in nonstationary
settings, existing work has only partially addressed the problem by either utilizing
observed auxiliary variables (e.g., class labels and/or domain indexes) as side
information or assuming simplified latent causal dynamics. Both constrain the
method to a limited range of scenarios. In this study, we further explore the
Markov assumption for time-delayed, causally related processes in nonstationary
settings and show that, under mild conditions, the independent latent
components can be recovered from their nonlinear mixture up to a permutation
and a component-wise transformation, without the observation of auxiliary
variables. We then introduce NCTRL, a principled estimation framework, to
reconstruct time-delayed latent causal variables and identify their relations
from measured sequential data only. Empirical evaluations demonstrate
reliable identification of time-delayed latent causal influences, with our
method substantially outperforming existing baselines that fail to exploit
the nonstationarity adequately and consequently cannot distinguish
distribution shifts.
Comment: NeurIPS 202
Hamiltonian Latent Operators for content and motion disentanglement in image sequences
We introduce HALO, a deep generative model utilising HAmiltonian
Latent Operators to reliably disentangle content and motion information in
image sequences. The content represents summary statistics of a
sequence, and motion is a dynamic process that determines how
information is expressed in any part of the sequence. By modelling the dynamics
as a Hamiltonian motion, important desiderata are ensured: (1) the motion is
reversible, and (2) the symplectic, volume-preserving structure in phase space
means paths are continuous and do not diverge in the latent space.
Consequently, the nearness of sequence frames is realised by the nearness of
their coordinates in the phase space, which proves valuable for disentanglement
and long-term sequence generation. The sequence space generally comprises
different types of dynamical motions. To ensure long-term separability and
allow controlled generation, we associate every motion with a unique
Hamiltonian that acts in its respective subspace. We demonstrate the utility of
HALO by swapping the motion of a pair of sequences, controlled
generation, and image rotations.
Comment: Conference paper at NeurIPS 202
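The reversibility property claimed for Hamiltonian motion in the abstract above can be checked numerically on a toy system. The sketch below uses a standard leapfrog (symplectic) integrator on a one-dimensional harmonic oscillator with H(q, p) = p²/2 + q²/2. Both the integrator and the Hamiltonian are stand-ins chosen for illustration; they are not HALO's learned latent operators.

```python
def leapfrog(q, p, grad_potential, step, num_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme.

    Assumes a separable Hamiltonian H(q, p) = p**2 / 2 + V(q),
    where grad_potential returns dV/dq.
    """
    for _ in range(num_steps):
        p = p - 0.5 * step * grad_potential(q)  # half step in momentum
        q = q + step * p                        # full step in position
        p = p - 0.5 * step * grad_potential(q)  # half step in momentum
    return q, p


def energy(q, p):
    """Energy of the toy harmonic oscillator."""
    return 0.5 * p * p + 0.5 * q * q


def grad_v(q):
    """Gradient of V(q) = q**2 / 2."""
    return q


# Run forward, flip the momentum, run forward again, then flip back:
# a reversible integrator returns to the starting state.
q1, p1 = leapfrog(1.0, 0.0, grad_v, step=0.1, num_steps=100)
q_back, p_back = leapfrog(q1, -p1, grad_v, step=0.1, num_steps=100)
print(abs(q_back - 1.0), abs(-p_back - 0.0), abs(energy(q1, p1) - 0.5))
```

The round trip recovers the initial state up to floating-point rounding, and the energy stays close to its initial value instead of drifting, which reflects the bounded, non-divergent paths the abstract attributes to the symplectic structure.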