Learning Invariant Representations for Deep Latent Variable Models
Deep latent variable models introduce a new class of generative models which are able to handle unstructured data and encode non-linear dependencies. Despite their known flexibility, these models are frequently not invariant against target-specific transformations. As a result, they suffer from model mismatches and are challenging to interpret or control. We employ the concept of symmetry transformations from physics to formally describe these invariances. In this thesis, we investigate how we can model invariances when a symmetry transformation is either known or unknown. In doing so, we make contributions in the domains of variable compression under side information and generative modelling. In our first contribution, we investigate the problem where a symmetry transformation is known yet not inherently captured by the model. Specifically, we consider the task of estimating mutual information in the context of the deep information bottleneck, which is not invariant against monotone transformations. To address this limitation, we extend the deep information bottleneck with a copula construction. In our second contribution, we address the problem of learning target-invariant subspaces for generative models. In this case, the symmetry transformation is unknown and has to be learned from data. We achieve this by formulating a deep information bottleneck with a target and a target-invariant subspace. To ensure invariance, we provide a continuous mutual information regulariser based on adversarial training. In our last contribution, we introduce an improved method for learning unknown symmetry transformations with cycle-consistency. To do so, we employ the equivalent deep information bottleneck method with a partitioned latent space. Here, however, we ensure target-invariance by using a cycle-consistency loss in the latent space. As a result, we overcome potential convergence issues introduced by adversarial training and are able to deal with mixed data.
In summary, each of the presented models provides a way to better control and understand deep latent variable models by learning symmetry transformations. We demonstrate the effectiveness of our contributions in extensive experiments on both artificial and real-world data.
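The third contribution above replaces an adversarial independence penalty with a cycle-consistency loss on a partitioned latent space: resample the target part of the code, decode, re-encode, and require the target-invariant part to survive the round trip. A minimal toy sketch of such a loss with linear maps (all names and shapes hypothetical, not the thesis's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "encoder": maps x to a code partitioned into a target
# subspace z_t and a target-invariant subspace z_i.
W = rng.normal(size=(4, 6))
W_dec = rng.normal(size=(4, 6))  # toy linear "decoder"

def encode(x):
    z = x @ W.T
    return z[:, :2], z[:, 2:]  # (z_target, z_invariant)

def decode(z_t, z_i):
    return np.concatenate([z_t, z_i], axis=1) @ W_dec

def cycle_consistency_loss(x):
    z_t, z_i = encode(x)
    # Resample the target part, keep the invariant part fixed.
    z_t_new = rng.normal(size=z_t.shape)
    x_hat = decode(z_t_new, z_i)
    # Re-encode and penalise any change in the invariant part.
    _, z_i_cycle = encode(x_hat)
    return np.mean((z_i - z_i_cycle) ** 2)
```

Minimising this loss pushes the invariant code to be unaffected by interventions on the target code, without training a discriminator.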
VAE with a VampPrior
Many different methods to train deep generative models have been introduced
in the past. In this paper, we propose to extend the variational auto-encoder
(VAE) framework with a new type of prior which we call "Variational Mixture of
Posteriors" prior, or VampPrior for short. The VampPrior consists of a mixture
distribution (e.g., a mixture of Gaussians) with components given by
variational posteriors conditioned on learnable pseudo-inputs. We further
extend this prior to a two-layer hierarchical model and show that this
architecture, with a coupled prior and posterior, learns significantly better
models. The model also avoids the usual local-optima issues related to useless
latent dimensions that plague VAEs. We provide empirical studies on six
datasets, namely static and dynamic MNIST, OMNIGLOT, Caltech 101 Silhouettes,
Frey Faces and Histopathology patches, and show that applying the hierarchical
VampPrior delivers state-of-the-art results on all datasets in the unsupervised
permutation-invariant setting, and results that are the best or comparable to
SOTA methods for the approach with convolutional networks.
Comment: 16 pages, final version, AISTATS 201
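The VampPrior evaluates the prior as an equal-weight mixture of the variational posteriors conditioned on K learnable pseudo-inputs: p(z) = (1/K) Σ_k q(z | u_k). A minimal NumPy sketch of that density for diagonal Gaussians (function names and shapes are illustrative, not the paper's code):

```python
import numpy as np

def log_normal_diag(z, mu, log_var):
    # log N(z; mu, diag(exp(log_var))), summed over the latent dimensions
    return -0.5 * np.sum(
        log_var + np.log(2 * np.pi) + (z - mu) ** 2 / np.exp(log_var), axis=-1
    )

def vamp_prior_logp(z, pseudo_mu, pseudo_logvar):
    # z: (N, D); pseudo_mu, pseudo_logvar: (K, D) — parameters of the
    # variational posteriors q(z | u_k) at the K learnable pseudo-inputs u_k.
    K = pseudo_mu.shape[0]
    lp = log_normal_diag(z[:, None, :], pseudo_mu[None], pseudo_logvar[None])  # (N, K)
    # log( (1/K) * sum_k exp(lp_k) ) via log-sum-exp for numerical stability
    m = lp.max(axis=1, keepdims=True)
    return m.squeeze(1) + np.log(np.exp(lp - m).sum(axis=1)) - np.log(K)
```

Because the mixture components share parameters with the encoder, the prior and posterior are coupled and trained jointly.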
Learning Sparse Latent Representations with the Deep Copula Information Bottleneck
Deep latent variable models are powerful tools for representation learning.
In this paper, we adopt the deep information bottleneck model, identify its
shortcomings and propose a model that circumvents them. To this end, we apply a
copula transformation which, by restoring the invariance properties of the
information bottleneck method, leads to disentanglement of the features in the
latent space. Building on that, we show how this transformation translates to
sparsity of the latent space in the new model. We evaluate our method on
artificial and real data.
Comment: Published as a conference paper at ICLR 2018. Aleksander Wieczorek
and Mario Wieser contributed equally to this work.
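The copula transformation underlying this model maps each marginal to its normalised ranks, which makes downstream quantities invariant under strictly monotone transformations of the individual features. A minimal rank-based sketch (an empirical copula transform; the function name is illustrative):

```python
import numpy as np

def empirical_copula(x):
    # Map each column of x (shape (n, d)) to normalised ranks in (0, 1).
    # Ranks are invariant under strictly increasing transforms of each
    # marginal, which restores the invariance of the information bottleneck.
    n = x.shape[0]
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
    return ranks / (n + 1)
```

For example, applying any strictly increasing function such as exp to a feature leaves its copula-transformed values unchanged.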
Federated Variational Inference Methods for Structured Latent Variable Models
Federated learning methods enable model training across distributed data
sources without data leaving their original locations and have gained
increasing interest in various fields. However, existing approaches are
limited, excluding many structured probabilistic models. We present a general
and elegant solution based on structured variational inference, widely used in
Bayesian machine learning, adapted for the federated setting. Additionally, we
provide a communication-efficient variant analogous to the canonical FedAvg
algorithm. We demonstrate the proposed algorithms' effectiveness and compare
their performance on hierarchical Bayesian neural networks and topic
models.
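The communication-efficient variant mentioned above is analogous to FedAvg: each client optimises its local variational parameters, and a server aggregates the shared (global) parameters as a data-size-weighted average. A minimal sketch of that aggregation step (names are hypothetical, not the paper's API):

```python
import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    # Weighted average of per-client copies of the global variational
    # parameters, weighted by local dataset size as in canonical FedAvg.
    w = np.asarray(client_sizes, dtype=float)
    w /= w.sum()
    return sum(wi * p for wi, p in zip(w, client_params))
```

In a structured variational inference setting, the averaged quantities would be the natural or variational parameters of the shared global factors, while client-specific local factors never leave their sites.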
Handling incomplete heterogeneous data using VAEs.
Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate for capturing the latent structure of vast amounts of complex high-dimensional data. However, existing VAEs still cannot directly handle data that are heterogeneous (mixed continuous and discrete) or incomplete (with data missing at random), which is indeed common in real-world applications.
In this paper, we propose a general framework to design VAEs suitable for fitting incomplete heterogeneous data. The proposed HI-VAE includes likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count data, and allows accurate estimation (and potentially imputation) of missing data. Furthermore, HI-VAE presents competitive predictive performance in supervised tasks, outperforming supervised models when trained on incomplete data.
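The key ingredients of such a model are (i) a per-attribute likelihood chosen to match each data type and (ii) an objective evaluated only on observed entries, so missing values do not contribute to the loss. A minimal sketch of that masked, type-wise log-likelihood (attribute types and parameterisations are simplified stand-ins, not HI-VAE's actual decoder heads):

```python
import numpy as np
from math import lgamma

def masked_loglik(x, mask, params, types):
    # x: (n, d) data matrix; mask: (n, d), 1 = observed, 0 = missing.
    # types[j] selects a per-attribute likelihood; params[j] its parameters.
    ll = 0.0
    for j, t in enumerate(types):
        obs = mask[:, j].astype(bool)
        xj = x[obs, j]  # only observed entries enter the objective
        if t == "real":  # Gaussian likelihood
            mu, logvar = params[j]
            ll += np.sum(
                -0.5 * (logvar + np.log(2 * np.pi) + (xj - mu) ** 2 / np.exp(logvar))
            )
        elif t == "count":  # Poisson likelihood
            lam = params[j]
            ll += np.sum(xj * np.log(lam) - lam
                         - np.array([lgamma(v + 1) for v in xj]))
        elif t == "cat":  # categorical likelihood over class probabilities
            probs = params[j]
            ll += np.sum(np.log(probs[xj.astype(int)]))
    return ll
```

Because masked entries are excluded, changing a missing value leaves the objective untouched; imputation then amounts to sampling or taking the mode of the fitted likelihood at the missing positions.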