
    Balancing reconstruction error and Kullback-Leibler divergence in Variational Autoencoders

    In the loss function of Variational Autoencoders there is a well-known tension between two components: the reconstruction loss, which improves the quality of the resulting images, and the Kullback-Leibler divergence, which acts as a regularizer of the latent space. Correctly balancing these two components is a delicate issue, easily resulting in poor generative behaviour. In a recent work, Dai and Wipf obtained a noticeable improvement by allowing the network to learn the balancing factor during training, according to a suitable loss function. In this article, we show that learning can be replaced by a simple deterministic computation, which helps to understand the underlying mechanism and results in faster and more accurate behaviour. On typical datasets such as CIFAR and CelebA, our technique significantly outperforms all previous VAE architectures.
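
    A minimal sketch of the general idea of replacing a learned balancing factor with a deterministic computation, assuming (purely for illustration) that the factor is taken as a running estimate of the reconstruction error; the function name and the moving-average update are hypothetical, not the authors' exact formulation:

        import torch

        def balanced_vae_loss(x, x_hat, mu, logvar, mse_estimate, decay=0.99):
            # Current batch reconstruction error (mean squared error).
            mse = ((x - x_hat) ** 2).mean()
            # Deterministic balancing factor: a running estimate of the MSE,
            # detached from the graph so it is computed rather than learned.
            mse_estimate = decay * mse_estimate + (1 - decay) * float(mse.detach())
            # KL divergence of the diagonal Gaussian posterior from N(0, I).
            kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
            # Weighting the reconstruction term by the inverse of the estimated
            # error plays the role of the factor learned in Dai and Wipf's setup.
            loss = mse / mse_estimate + kl
            return loss, mse_estimate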

    Variance Loss in Variational Autoencoders

    In this article, we highlight what appears to be a major issue of Variational Autoencoders, evidenced by extensive experimentation with different network architectures and datasets: the variance of generated data is significantly lower than that of training data. Since generative models are usually evaluated with metrics such as the Frechet Inception Distance (FID), which compare the distributions of (features of) real versus generated images, the variance loss typically results in degraded scores. This problem is particularly relevant in a two-stage setting, where a second VAE is used to sample in the latent space of the first VAE. The reduced variance creates a mismatch between the actual distribution of latent variables and those generated by the second VAE, which hinders the beneficial effects of the second stage. By renormalizing the output of the second VAE towards the expected spherical normal distribution, we obtain a sudden burst in the quality of generated samples, as also confirmed in terms of FID.
    Comment: Article accepted at the Sixth International Conference on Machine Learning, Optimization, and Data Science. July 19-23, 2020 - Certosa di Pontignano, Siena, Italy.
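
    A sketch of the renormalization step described above, under the assumption that it amounts to re-centering and rescaling each latent dimension of the second-stage samples towards the unit spherical Gaussian expected by the first-stage decoder (the function name is hypothetical):

        import torch

        def renormalize_latents(z, eps=1e-8):
            # z: latent codes sampled from the second-stage VAE, shape (N, d).
            # Re-center and rescale each dimension so the sample matches the
            # zero-mean, unit-variance prior, compensating for the variance loss.
            return (z - z.mean(dim=0, keepdim=True)) / (z.std(dim=0, keepdim=True) + eps)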

    Sparsity in Variational Autoencoders

    Working in high-dimensional latent spaces, the internal encoding of data in Variational Autoencoders naturally becomes sparse. We discuss this known but controversial phenomenon, sometimes referred to as overpruning to emphasize the under-use of the model capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, greatly reducing the risk of overfitting. In particular, it provides a major methodological guide for the correct tuning of the model capacity: progressively augmenting it until sparsity appears, or conversely reducing the dimension of the network by removing links to zeroed-out neurons. The degree of sparsity crucially depends on the network architecture: for instance, convolutional networks typically show less sparsity, likely due to the tighter relation of features to different spatial regions of the input.
    Comment: An extended abstract of this survey will be presented at the 1st International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI' 2019), 20-22 March 2019, Barcelona, Spain.
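
    One common way to make such sparsity observable is to count the latent dimensions that the encoder has effectively zeroed out; below is a sketch based on the variance of the posterior means across the data (a standard "active units" statistic, used here as an assumption about how sparsity could be measured, not the survey's own procedure):

        import numpy as np

        def zeroed_out_dimensions(mu, var_threshold=0.01):
            # mu: posterior means produced by the encoder over a dataset, shape (N, d).
            # A dimension is considered zeroed out (sparse) when its posterior mean
            # barely varies across inputs, i.e. the generator receives no information.
            activity = np.var(mu, axis=0)
            return np.where(activity < var_threshold)[0]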

    Variational Autoencoders and the Variable Collapse Phenomenon

    In Variational Autoencoders, when working in high-dimensional latent spaces, there is a natural collapse of latent variables of minor significance, which end up altogether neglected by the generator. We discuss this known but controversial phenomenon, sometimes referred to as overpruning, to emphasize the under-use of the model capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, enhancing their disentanglement and reducing the risk of overfitting. In this article, we discuss the issue, surveying past works, and focus in particular on the exploitation of the variable collapse phenomenon as a methodological guideline for the correct tuning of the model capacity and of the loss function parameters.
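
    As an illustration of how the collapse phenomenon can guide the tuning of model capacity, one could monitor the per-dimension KL contribution and count the active variables; a sketch under that assumption (the threshold and names are hypothetical):

        import torch

        def count_active_variables(mu, logvar, kl_threshold=0.01):
            # mu, logvar: encoder outputs over a batch, shape (N, d).
            # Average KL contribution of each latent dimension.
            kl_per_dim = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).mean(dim=0)
            active = int((kl_per_dim > kl_threshold).sum())
            # Heuristic reading: if all dimensions are active, the latent space may
            # be too small; if many have collapsed, capacity (or the KL weight in
            # the loss) can be reduced accordingly.
            return active, kl_per_dim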

    Constraining Variational Inference with Geometric Jensen-Shannon Divergence.

    We examine the problem of controlling divergences for latent space regularisation in variational autoencoders, specifically when aiming to reconstruct an example x ∈ ℝ^m via a latent space z ∈ ℝ^n (n ≤ m), while balancing this against the need for generalisable latent representations. We present a regularisation mechanism based on the skew-geometric Jensen-Shannon divergence JS^{G_α}. We find a variation of JS^{G_α}, motivated by limiting cases, which leads to an intuitive interpolation between forward and reverse KL in the space of both distributions and divergences. We motivate its potential benefits for VAEs through low-dimensional examples, before presenting quantitative and qualitative results. Our experiments demonstrate that skewing our variant of JS^{G_α}, in the context of JS^{G_α}-VAEs, leads to better reconstruction and generation when compared to several baseline VAEs. Our approach is entirely unsupervised and utilises only one hyperparameter, which can be easily interpreted in latent space.
    Comment: Camera-ready version, accepted at NeurIPS 2020.
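
    For diagonal Gaussians, the skew-geometric JS divergence has a closed form, since the α-weighted geometric mean of two Gaussians is again Gaussian. The sketch below implements one such variant whose endpoints recover the two KL directions; the exact weighting convention and naming are assumptions and may differ in detail from the paper's definition:

        import torch

        def kl_diag_gauss(mu_a, var_a, mu_b, var_b):
            # KL( N(mu_a, var_a) || N(mu_b, var_b) ) for diagonal Gaussians.
            return 0.5 * (torch.log(var_b / var_a)
                          + (var_a + (mu_a - mu_b) ** 2) / var_b - 1).sum(dim=-1)

        def skew_geometric_js(mu_p, var_p, mu_q, var_q, alpha=0.5):
            # The alpha-weighted geometric mean of two Gaussians is a Gaussian
            # whose precision is the weighted average of the two precisions.
            var_g = 1.0 / ((1 - alpha) / var_p + alpha / var_q)
            mu_g = var_g * ((1 - alpha) * mu_p / var_p + alpha * mu_q / var_q)
            # Convex combination of KLs towards the geometric mean; at alpha -> 0
            # this reduces to KL(q || p) and at alpha -> 1 to KL(p || q).
            return (alpha * kl_diag_gauss(mu_p, var_p, mu_g, var_g)
                    + (1 - alpha) * kl_diag_gauss(mu_q, var_q, mu_g, var_g))

    In a VAE setting, p would be the approximate posterior and q the N(0, I) prior, with alpha as the single hyperparameter mentioned in the abstract.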

    Combining Variational Autoencoders and Physical Bias for Improved Microscopy Data Analysis

    Electron and scanning probe microscopy produce vast amounts of data in the form of images or hyperspectral data, such as EELS or 4D STEM, that contain information on a wide range of structural, physical, and chemical properties of materials. To extract valuable insights from these data, it is crucial to identify physically separate regions, such as phases, ferroic variants, and the boundaries between them. In order to derive an easily interpretable feature analysis with well-defined boundaries in a principled and unsupervised manner, here we present a physics-augmented machine learning method which combines the capability of Variational Autoencoders to disentangle factors of variability within the data with a physics-driven loss function that seeks to minimize the total length of the discontinuities in images corresponding to latent representations. Our method is applied to various materials, including NiO-LSMO, BiFeO3, and graphene. The results demonstrate the effectiveness of our approach in extracting meaningful information from large volumes of imaging data. The full notebook containing the implementation of the code and the analysis workflow is available at https://github.com/arpanbiswas52/PaperNotebooks
    Comment: 20 pages, 7 figures in main text, 4 figures in Supplementary Material.
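
    A rough sketch of what a penalty on the total length of discontinuities might look like, written here as a total-variation-style term over a spatial grid of latent values (an assumption for illustration; the authors' actual physics-driven loss may differ):

        import torch

        def boundary_length_penalty(z_map):
            # z_map: latent representations arranged on the image grid, e.g. one
            # latent vector per patch, shape (batch, channels, H, W).
            # Sum of absolute differences between neighbouring grid positions:
            # large values indicate long or numerous discontinuities.
            dh = (z_map[..., 1:, :] - z_map[..., :-1, :]).abs().sum()
            dw = (z_map[..., :, 1:] - z_map[..., :, :-1]).abs().sum()
            return dh + dw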

    Notes on the use of variational autoencoders for speech and audio spectrogram modeling

    Variational autoencoders (VAEs) are powerful (deep) generative artificial neural networks. They have recently been used in several papers for speech and audio processing, in particular for the modeling of speech/audio spectrograms. In these papers, very little theoretical support is given to justify the chosen data representation and decoder likelihood function, or the corresponding cost function used for training the VAE. Yet, a nice theoretical statistical framework exists and has been extensively presented and discussed in papers dealing with nonnegative matrix factorization (NMF) of audio spectrograms and its application to audio source separation. In the present paper, we show how this statistical framework applies to VAE-based speech/audio spectrogram modeling. This provides the latter with insights into the choice and interpretability of data representation and model parameterization.
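
    In the NMF literature referred to above, power spectrogram entries are commonly modelled with an exponential/Gamma likelihood whose negative log-likelihood is, up to terms independent of the model output, the Itakura-Saito divergence; a hedged sketch of such a reconstruction term for a VAE decoder (assuming x and x_hat are non-negative power spectrograms):

        import torch

        def itakura_saito_loss(x, x_hat, eps=1e-8):
            # Itakura-Saito divergence between an observed power spectrogram x and
            # the decoder output x_hat (both non-negative, same shape); equivalent,
            # up to terms independent of x_hat, to the negative log-likelihood of
            # an exponential observation model with mean x_hat.
            ratio = (x + eps) / (x_hat + eps)
            return (ratio - torch.log(ratio) - 1).sum()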