
    Constraining Variational Inference with Geometric Jensen-Shannon Divergence.

    We examine the problem of controlling divergences for latent space regularisation in variational autoencoders, specifically when aiming to reconstruct an example $x \in \mathbb{R}^{m}$ via a latent space $z \in \mathbb{R}^{n}$ ($n \leq m$), while balancing this against the need for generalisable latent representations. We present a regularisation mechanism based on the skew-geometric Jensen-Shannon divergence $\left(\textrm{JS}^{\textrm{G}_{\alpha}}\right)$. We find a variation of $\textrm{JS}^{\textrm{G}_{\alpha}}$, motivated by limiting cases, which leads to an intuitive interpolation between forward and reverse KL in the space of both distributions and divergences. We motivate its potential benefits for VAEs through low-dimensional examples, before presenting quantitative and qualitative results. Our experiments demonstrate that skewing our variant of $\textrm{JS}^{\textrm{G}_{\alpha}}$, in the context of $\textrm{JS}^{\textrm{G}_{\alpha}}$-VAEs, leads to better reconstruction and generation than several baseline VAEs. Our approach is entirely unsupervised and uses only one hyperparameter, which can be easily interpreted in latent space.
    Comment: Camera-ready version, accepted at NeurIPS 2020
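    As a rough illustration of the divergence involved, here is a minimal NumPy sketch of a skew-geometric JS divergence between two diagonal Gaussians. The helper names and the convention of reversing the skew of the intermediate geometric mean, so that $\alpha \to 0$ recovers $\mathrm{KL}(p\,\|\,q)$ and $\alpha \to 1$ recovers $\mathrm{KL}(q\,\|\,p)$, are illustrative assumptions and not taken from the paper's code.

```python
import numpy as np

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, diag(var_p)) || N(mu_q, diag(var_q))) for diagonal Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

def geometric_mean_gauss(mu_p, var_p, mu_q, var_q, w):
    """Normalised weighted geometric mean p^(1-w) * q^w, which is again Gaussian."""
    var_g = 1.0 / ((1.0 - w) / var_p + w / var_q)
    mu_g = var_g * ((1.0 - w) * mu_p / var_p + w * mu_q / var_q)
    return mu_g, var_g

def js_g_alpha(mu_p, var_p, mu_q, var_q, alpha=0.5):
    # Skew of the intermediate distribution reversed (an assumption here):
    # alpha = 0 reduces to KL(p || q), alpha = 1 to KL(q || p).
    mu_g, var_g = geometric_mean_gauss(mu_p, var_p, mu_q, var_q, 1.0 - alpha)
    return ((1.0 - alpha) * kl_diag_gauss(mu_p, var_p, mu_g, var_g)
            + alpha * kl_diag_gauss(mu_q, var_q, mu_g, var_g))
```

    In a VAE of the kind described above, such a term would stand in for the usual KL regulariser between the encoder's Gaussian posterior and the prior, with $\alpha$ playing the role of the single tunable hyperparameter mentioned in the abstract.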

    A Jensen-Shannon Divergence Based Loss Function for Bayesian Neural Networks

    The Kullback-Leibler (KL) divergence is widely used for variational inference of Bayesian Neural Networks (BNNs). However, the KL divergence has limitations such as unboundedness and asymmetry. We examine the Jensen-Shannon (JS) divergence, which is more general, bounded, and symmetric. We formulate a novel loss function for BNNs based on the geometric JS divergence and show that the conventional KL divergence-based loss function is its special case. We evaluate the divergence part of the proposed loss function in closed form for a Gaussian prior; for any other general prior, Monte Carlo approximations can be used. We provide algorithms for implementing both of these cases. We demonstrate that the proposed loss function offers an additional parameter that can be tuned to control the degree of regularisation. We derive the conditions under which the proposed loss function regularises better than the KL divergence-based loss function for Gaussian priors and posteriors. We demonstrate performance improvements over the state-of-the-art KL divergence-based BNN on the classification of a noisy CIFAR data set and a biased histopathology data set.
    Comment: To be submitted for peer review in IEEE
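    For the Gaussian prior-and-posterior case described above, a variational BNN objective built on this divergence might be assembled as in the following PyTorch sketch. The helper `bnn_loss`, its arguments, the standard-normal prior, and the skew convention are hypothetical illustrations rather than the paper's implementation; the Monte Carlo route for general priors is not shown.

```python
import torch
import torch.nn.functional as F

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)))."""
    return 0.5 * torch.sum(
        torch.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

def geometric_js_gauss(mu_q, var_q, mu_p, var_p, alpha):
    """Closed-form skew-geometric JS between diagonal Gaussians (posterior q, prior p).
    Under this (assumed) skew convention, alpha = 0 reduces to the usual KL(q || p)."""
    var_g = 1.0 / (alpha / var_q + (1.0 - alpha) / var_p)
    mu_g = var_g * (alpha * mu_q / var_q + (1.0 - alpha) * mu_p / var_p)
    return ((1.0 - alpha) * kl_diag_gauss(mu_q, var_q, mu_g, var_g)
            + alpha * kl_diag_gauss(mu_p, var_p, mu_g, var_g))

def bnn_loss(nll, mu_q, rho_q, alpha=0.5, reg_weight=1.0):
    """Data negative log-likelihood plus a geometric-JS regulariser towards
    a standard-normal prior over the weights (hypothetical helper)."""
    var_q = F.softplus(rho_q) ** 2          # positive posterior variance
    mu_p = torch.zeros_like(mu_q)           # N(0, I) prior mean
    var_p = torch.ones_like(var_q)          # N(0, I) prior variance
    return nll + reg_weight * geometric_js_gauss(mu_q, var_q, mu_p, var_p, alpha)
```

    The extra parameter `alpha` here corresponds to the additional regularisation knob the abstract describes, with the conventional KL-based loss recovered as a special case under the assumed convention.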