Jointly trained image and video generation using residual vectors

Authors: Dandi Yatin; Das Aniket; Namboodiri Vinay P.; Rai Piyush; Singhal Soumye
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 17/12/2019

In this work, we propose a modeling technique for jointly training image and video generation models by simultaneously learning to map latent variables with a fixed prior onto real images and to interpolate over images to generate videos. The proposed approach models the variations in representations using residual vectors that encode the change at each time step relative to a summary vector for the entire video. We use the technique to jointly train an image generation model with a fixed prior along with a video generation model lacking constraints such as disentanglement. The joint training enables the image generator to exploit temporal information while the video generation model learns to flexibly share information across frames. Moreover, experimental results verify our approach's compatibility with pre-training on videos or images and with training on datasets containing a mixture of both. A comprehensive set of quantitative and qualitative evaluations reveals improvements in sample quality and diversity over both video generation and image generation baselines. We further demonstrate the technique's ability to exploit similarity in features across frames by applying it to a model based on decomposing the video into motion and content. The proposed model allows minor variations in content across frames while maintaining the temporal dependence through latent vectors encoding the pose or motion features.
Comment: Accepted in 2020 Winter Conference on Applications of Computer Vision (WACV '20)
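The residual-vector idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the additive combination, the standard-normal prior, the residual scale of 0.1, and the name `frame_latents` are all assumptions made here for illustration. The abstract only states that residual vectors encode the per-step change over a summary vector.

```python
import random


def frame_latents(summary, residuals):
    """Return one latent vector per frame.

    Assumption (not stated in the abstract): each frame latent is the
    video-level summary vector plus that frame's residual vector.
    """
    return [[s + r for s, r in zip(summary, residual)] for residual in residuals]


random.seed(0)
dim, num_frames = 4, 3

# Summary vector drawn from a fixed prior (assumed standard normal here).
summary = [random.gauss(0.0, 1.0) for _ in range(dim)]

# Small per-frame residuals; in a real model these would be learned.
residuals = [[0.1 * random.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(num_frames)]

latents = frame_latents(summary, residuals)
print(len(latents), len(latents[0]))
```

With all residuals set to zero, every frame shares the summary latent, which is one way to read the abstract's point that the image generator and the video generator can share a latent space.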

arXiv.org e-Print Archive

OPUS

Crossref

Unsupervised Pitch-Timbre Disentanglement of Musical Instruments Using a Jacobian Disentangled Sequential Autoencoder

Authors: Dixon S; Ewert S; Luo Y-J
Publication venue: ICASSP 2024 - 2024 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 14/04/2024

Queen Mary Research Online

Unsupervised learning with contrastive latent variable models

Authors: Ghosh Soumya; Ng Kenney; Severson Kristen
Publication date: 14/11/2018

In unsupervised learning, dimensionality reduction is an important tool for data exploration and visualization. Because these aims are typically open-ended, it can be useful to frame the problem as looking for patterns that are enriched in one dataset relative to another. Such pairs of datasets occur commonly, for instance a population of interest vs. a control, or signal vs. signal-free recordings. However, there are few methods that work on sets of data as opposed to data points or sequences. Here, we present a probabilistic model for dimensionality reduction to discover signal that is enriched in the target dataset relative to the background dataset. The data in these sets do not need to be paired or grouped beyond set membership. By using a probabilistic model where some structure is shared between the two datasets and some is unique to the target dataset, we are able to recover interesting structure in the latent space of the target dataset. The method also has the advantages of a probabilistic model, namely that it allows for the incorporation of prior information, handles missing data, and can be generalized to different distributional assumptions. We describe several possible variations of the model and demonstrate the application of the technique to de-noising, feature selection, and subgroup discovery settings.
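A minimal sketch of the shared-versus-target-specific structure described in the abstract above, assuming a linear-Gaussian form: background samples are generated from shared factors only, while target samples add a second, target-only set of factors. The linear form, the matrices `shared` and `target_only`, and the function names are assumptions for illustration; the abstract does not specify the model at this level of detail.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sample(shared, target_only, z, t=None, noise_std=0.0):
    """Draw one observation.

    z: latent factors shared by the background and target datasets.
    t: latent factors unique to the target dataset (None for background).
    """
    mean = matvec(shared, z)
    if t is not None:
        extra = matvec(target_only, t)
        mean = [m + e for m, e in zip(mean, extra)]
    return [m + random.gauss(0.0, noise_std) for m in mean]


# Hypothetical loading matrices: 2 observed dims, 2 shared and 1 target factor.
shared = [[1.0, 0.0], [0.0, 1.0]]
target_only = [[2.0], [0.0]]

background = sample(shared, target_only, z=[1.0, 1.0])
target = sample(shared, target_only, z=[1.0, 1.0], t=[3.0])
print(background, target)
```

With the noise switched off, the difference between the two samples is exactly the contribution of the target-only factors, which is the kind of enriched signal the abstract says the model is meant to recover.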

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Temporally Disentangled Representation Learning under Unknown Nonstationarity

Authors: Chen Guangyi; Dong Xinshuai; Fan Yewen; Niebles Juan Carlos; Song Xiangchen; Xing Eric; Yao Weiran; Zhang Kun
Publication date: 28/10/2023

In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for the disentanglement of causally related latent variables have been established in stationary settings by leveraging temporal structure. However, in nonstationary settings, existing work has only partially addressed the problem, either by utilizing observed auxiliary variables (e.g., class labels and/or domain indexes) as side information or by assuming simplified latent causal dynamics. Both constrain the method to a limited range of scenarios. In this study, we further explore the Markov assumption for time-delayed causally related processes in nonstationary settings and show that, under mild conditions, the independent latent components can be recovered from their nonlinear mixture up to a permutation and a component-wise transformation, without the observation of auxiliary variables. We then introduce NCTRL, a principled estimation framework, to reconstruct time-delayed latent causal variables and identify their relations from measured sequential data only. Empirical evaluations demonstrate the reliable identification of time-delayed latent causal influences, with our methodology substantially outperforming existing baselines that fail to exploit the nonstationarity adequately and consequently cannot distinguish distribution shifts.
Comment: NeurIPS 202

arXiv.org e-Print Archive

Hamiltonian Latent Operators for content and motion disentanglement in image sequences

Authors: Khan Asif; Storkey Amos J
Publication date: 12/10/2022

We introduce HALO, a deep generative model utilising HAmiltonian Latent Operators to reliably disentangle content and motion information in image sequences. The content represents summary statistics of a sequence, and motion is a dynamic process that determines how information is expressed in any part of the sequence. By modelling the dynamics as a Hamiltonian motion, important desiderata are ensured: (1) the motion is reversible, and (2) the symplectic, volume-preserving structure in phase space means paths are continuous and do not diverge in the latent space. Consequently, the nearness of sequence frames is realised by the nearness of their coordinates in the phase space, which proves valuable for disentanglement and long-term sequence generation. The sequence space generally comprises different types of dynamical motions. To ensure long-term separability and allow controlled generation, we associate every motion with a unique Hamiltonian that acts in its respective subspace. We demonstrate the utility of HALO by swapping the motion of a pair of sequences, controlled generation, and image rotations.
Comment: Conference paper at NeurIPS 202
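The reversibility property named in the abstract above can be illustrated with a generic symplectic integrator. The sketch below uses the standard leapfrog scheme on a one-dimensional harmonic oscillator, H(q, p) = p^2/2 + q^2/2. This is a textbook example chosen here as an assumption, not the HALO model or its latent operators, and the step size and step count are arbitrary.

```python
def leapfrog(q, p, dt, steps, grad_potential):
    """Integrate Hamiltonian dynamics with the leapfrog (Stormer-Verlet) scheme."""
    for _ in range(steps):
        p = p - 0.5 * dt * grad_potential(q)  # half step in momentum
        q = q + dt * p                        # full step in position
        p = p - 0.5 * dt * grad_potential(q)  # half step in momentum
    return q, p


def grad_potential(q):
    """Gradient of the potential V(q) = q^2 / 2."""
    return q


def energy(q, p):
    """Total energy H(q, p) of the harmonic oscillator."""
    return 0.5 * (p * p + q * q)


q0, p0 = 1.0, 0.0

# Run forward in time.
q1, p1 = leapfrog(q0, p0, dt=0.01, steps=1000, grad_potential=grad_potential)

# Flip the momentum and run the same number of steps to retrace the path.
q2, p2 = leapfrog(q1, -p1, dt=0.01, steps=1000, grad_potential=grad_potential)

print(abs(q2 - q0), abs(-p2 - p0), abs(energy(q1, p1) - energy(q0, p0)))
```

Flipping the momentum and integrating again brings the state back to its starting point up to floating-point error, and the energy stays close to its initial value along the way. Both behaviours are consequences of the symplectic structure that the abstract appeals to.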

arXiv.org e-Print Archive

Edinburgh Research Explorer