A Jensen-Shannon Divergence Based Loss Function for Bayesian Neural Networks
Kullback-Leibler (KL) divergence is widely used for variational inference of
Bayesian Neural Networks (BNNs). However, the KL divergence has limitations
such as unboundedness and asymmetry. We examine the Jensen-Shannon (JS)
divergence, which is more general, bounded, and symmetric. We formulate a novel
loss function for BNNs based on the geometric JS divergence and show that the
conventional KL divergence-based loss function is its special case. We evaluate
the divergence part of the proposed loss function in closed form for a
Gaussian prior; for any other prior, Monte Carlo approximations can be
used. We provide algorithms for implementing both cases. We
demonstrate that the proposed loss function offers an additional parameter that
can be tuned to control the degree of regularisation. We derive the conditions
under which the proposed loss function regularises better than the KL
divergence-based loss function for Gaussian priors and posteriors. We
demonstrate performance improvements over the state-of-the-art KL
divergence-based BNN on the classification of a noisy CIFAR data set and a
biased histopathology data set.
Comment: To be submitted for peer review in IEEE
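To make the closed-form Gaussian case concrete, here is a minimal sketch (not the paper's exact algorithm; the univariate setting and the skew convention G_alpha proportional to p^alpha q^(1-alpha) are assumptions of this sketch) showing that a geometric JS divergence between Gaussians is available in closed form and recovers the KL divergence at alpha = 0:

```python
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    # KL(N(mu0, var0) || N(mu1, var1)) for univariate Gaussians.
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def geo_mean_gauss(mu_p, var_p, mu_q, var_q, alpha):
    # The normalized weighted geometric mean p^alpha q^(1-alpha) of two
    # Gaussians is again Gaussian: interpolate the natural parameters.
    var_g = 1.0 / (alpha / var_p + (1.0 - alpha) / var_q)
    mu_g = var_g * (alpha * mu_p / var_p + (1.0 - alpha) * mu_q / var_q)
    return mu_g, var_g

def geo_js_gauss(mu_p, var_p, mu_q, var_q, alpha):
    # Geometric JS divergence (1-alpha)*KL(p||G) + alpha*KL(q||G); under
    # this convention alpha = 0 gives G = q, i.e. plain KL(p||q).
    mu_g, var_g = geo_mean_gauss(mu_p, var_p, mu_q, var_q, alpha)
    return ((1.0 - alpha) * kl_gauss(mu_p, var_p, mu_g, var_g)
            + alpha * kl_gauss(mu_q, var_q, mu_g, var_g))

# Sanity check: alpha = 0 matches the KL divergence exactly.
print(geo_js_gauss(0.0, 1.0, 1.0, 2.0, 0.0), kl_gauss(0.0, 1.0, 1.0, 2.0))
```

The free skew parameter alpha plays the role of the extra regularisation knob the abstract mentions.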
On a generalization of the Jensen-Shannon divergence
The Jensen-Shannon divergence is a renowned bounded symmetrization of the
Kullback-Leibler divergence which does not require probability densities to
have matching supports. In this paper, we introduce a vector-skew
generalization of the scalar α-Jensen-Bregman divergences, from which we
derive the vector-skew α-Jensen-Shannon divergences. We study the
properties of these novel divergences and show how to build parametric families
of symmetric Jensen-Shannon-type divergences. Finally, we report an iterative
algorithm to numerically compute the Jensen-Shannon-type centroids for a set of
probability densities belonging to a mixture family: This includes the case of
the Jensen-Shannon centroid of a set of categorical distributions or normalized
histograms.
Comment: 19 pages, 3 figures
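As an illustration of the construction described above, here is a hedged sketch for categorical distributions, assuming the vector-skew JS divergence takes the form sum_i w_i KL((1-a_i)p + a_i q : (1-abar)p + abar q) with abar = sum_i w_i a_i; choosing a = (0, 1) and w = (1/2, 1/2) then recovers the ordinary Jensen-Shannon divergence, consistent with the symmetric families the abstract mentions:

```python
import numpy as np

def kl_cat(p, q, eps=1e-12):
    # KL divergence between two categorical distributions on the same support.
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def vector_skew_js(p, q, alphas, weights):
    # Weighted sum of KLs from each skewed mixture (1-a_i)p + a_i q
    # to the "average-skew" mixture (1-abar)p + abar q.
    alphas = np.asarray(alphas, float)
    weights = np.asarray(weights, float)  # assumed to sum to 1
    abar = float(np.dot(weights, alphas))
    m_bar = (1.0 - abar) * p + abar * q
    return sum(w * kl_cat((1.0 - a) * p + a * q, m_bar)
               for a, w in zip(alphas, weights))

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.4, 0.5])
# a = (0, 1), w = (1/2, 1/2): ordinary Jensen-Shannon divergence.
print(vector_skew_js(p, q, [0.0, 1.0], [0.5, 0.5]))
```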
Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry
We generalize quasi-arithmetic means beyond scalars by considering the
gradient map of a Legendre type real-valued function. The gradient map of a
Legendre type function is proven to be strictly comonotone with a global
inverse. It thus yields a generalization of the strictly monotone and differentiable
functions generating scalar quasi-arithmetic means. Furthermore, the Legendre
transformation gives rise to pairs of dual quasi-arithmetic averages via the
convex duality. We study the invariance and equivariance properties under
affine transformations of quasi-arithmetic averages via the lens of dually flat
spaces of information geometry. We show how these quasi-arithmetic averages are
used to express points on dual geodesics and sided barycenters in the dual
affine coordinate systems. We then consider quasi-arithmetic mixtures and
describe several parametric and non-parametric statistical models which are
closed under the quasi-arithmetic mixture operation.
Comment: 20 pages
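A small sketch may help fix ideas: the scalar quasi-arithmetic mean f^{-1}(sum_i w_i f(x_i)) and its gradient-map generalization (grad F)^{-1}(sum_i w_i grad F(x_i)) are shown below, using the illustrative Legendre-type function F(x) = sum_i exp(x_i) (this particular F is an assumption of the sketch, not an example taken from the paper):

```python
import numpy as np

def quasi_arithmetic_mean(xs, weights, f, f_inv):
    # Scalar quasi-arithmetic mean: f_inv(sum_i w_i f(x_i)).
    return f_inv(sum(w * f(x) for x, w in zip(xs, weights)))

# f = log yields the geometric mean; f = identity, the arithmetic mean.
print(quasi_arithmetic_mean([1.0, 4.0], [0.5, 0.5], np.log, np.exp))  # 2.0

def grad_map_average(points, weights, grad_F, grad_F_inv):
    # Vector generalization: push points through the gradient map of a
    # Legendre-type F, average, then pull back with its global inverse.
    pooled = sum(w * grad_F(x) for x, w in zip(points, weights))
    return grad_F_inv(pooled)

# For F(x) = sum_i exp(x_i): grad F = exp and its inverse is log
# (both componentwise), giving a coordinate-wise log-sum-exp style average.
a, b = np.array([0.0, 1.0]), np.array([2.0, -1.0])
print(grad_map_average([a, b], [0.5, 0.5], np.exp, np.log))
```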
Generalized Multimodal ELBO
Multiple data types naturally co-occur when describing real-world phenomena
and learning from them is a long-standing goal in machine learning research.
However, existing self-supervised generative models approximating an ELBO are
not able to fulfill all desired requirements of multimodal models: their
posterior approximation functions lead to a trade-off between the semantic
coherence and the ability to learn the joint data distribution. We propose a
new, generalized ELBO formulation for multimodal data that overcomes these
limitations. The new objective encompasses two previous methods as special
cases and combines their benefits without compromises. In extensive
experiments, we demonstrate the advantage of the proposed method compared to
state-of-the-art models in self-supervised, generative learning tasks.
Comment: 2021 ICLR
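The abstract does not spell out the new objective; as one hedged reading, a generalized multimodal posterior can mix products of experts over all modality subsets, which subsumes both a pure product-of-experts and a pure mixture-of-experts as special cases. The sketch below illustrates that construction for scalar Gaussian experts (the uniform mixture weights and the Gaussian expert form are assumptions of this sketch):

```python
import itertools
import numpy as np

def poe_gauss(mus, vars_):
    # Product of Gaussian experts: precisions add,
    # means combine precision-weighted.
    var = 1.0 / sum(1.0 / v for v in vars_)
    mu = var * sum(m / v for m, v in zip(mus, vars_))
    return mu, var

def mixture_of_poe_subsets(mus, vars_):
    # Uniform mixture over the product-of-experts posterior of every
    # nonempty subset of modalities.
    M = len(mus)
    components = []
    for r in range(1, M + 1):
        for subset in itertools.combinations(range(M), r):
            components.append(poe_gauss([mus[i] for i in subset],
                                        [vars_[i] for i in subset]))
    return components  # mixture weights: 1 / len(components) each

# Two unimodal experts -> 3 components: {1}, {2}, {1,2}.
print(mixture_of_poe_subsets([0.0, 1.0], [1.0, 2.0]))
```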
Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence
Learning from different data types is a long-standing goal in machine
learning research, as multiple information sources co-occur when describing
natural phenomena. However, existing generative models that approximate a
multimodal ELBO rely on difficult or inefficient training schemes to learn a
joint distribution and the dependencies between modalities. In this work, we
propose a novel, efficient objective function that utilizes the Jensen-Shannon
divergence for multiple distributions. It simultaneously approximates the
unimodal and joint multimodal posteriors directly via a dynamic prior. In
addition, we theoretically prove that the new multimodal JS-divergence (mmJSD)
objective optimizes an ELBO. In extensive experiments, we demonstrate the
advantage of the proposed mmJSD model compared to previous work in
unsupervised, generative learning tasks.
Comment: Accepted at NeurIPS 2020, camera-ready version
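As a rough sketch of a JS-type objective for several distributions, the snippet below computes a weighted sum of KL divergences from unimodal Gaussian posteriors to a dynamic prior, taken here to be their weighted geometric mean, which stays Gaussian and hence keeps everything in closed form (this particular prior choice is an assumption of the sketch; the paper's exact construction may differ):

```python
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    # KL(N(mu0, var0) || N(mu1, var1)) for univariate Gaussians.
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def multi_js_gauss(mus, vars_, weights):
    # JS-type divergence for several Gaussians: weighted sum of KLs to a
    # "dynamic prior", here their weighted geometric mean (precisions and
    # precision-weighted means combine linearly).
    var_g = 1.0 / sum(w / v for w, v in zip(weights, vars_))
    mu_g = var_g * sum(w * m / v for w, m, v in zip(weights, mus, vars_))
    return sum(w * kl_gauss(m, v, mu_g, var_g)
               for w, m, v in zip(weights, mus, vars_))

# Three unimodal posteriors with uniform weights.
print(multi_js_gauss([0.0, 0.5, 1.0], [1.0, 0.5, 2.0], [1/3, 1/3, 1/3]))
```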
Simulation of complex dynamics of mean-field p-spin models using measurement-based quantum feedback control
We study the application of a new method for simulating nonlinear dynamics of
many-body spin systems using quantum measurement and feedback [Muñoz-Arias et
al., Phys. Rev. Lett. 124, 110503 (2020)] to a broad class of many-body models
known as p-spin Hamiltonians, which describe Ising-like models on a
completely connected graph with p-body interactions. The method simulates the
desired mean field dynamics in the thermodynamic limit by combining
nonprojective measurements of a component of the collective spin with a global
rotation conditioned on the measurement outcome. We apply this protocol to
simulate the dynamics of the p-spin Hamiltonians and demonstrate how
different aspects of criticality in the mean-field regime are readily
accessible with our protocol. We study applications including properties of
dynamical phase transitions and the emergence of spontaneous symmetry breaking
in the adiabatic dynamics of the collective spin for different values of the
parameter p. We also demonstrate how this method can be employed to study the
quantum-to-classical transition in the dynamics continuously as a function of
system size.
Comment: 16 pages, 7 figures
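For intuition about the target dynamics only (not the measurement-feedback protocol itself), here is a minimal sketch that integrates mean-field precession of the normalized collective spin, m' = m x B(m), under an assumed effective field B = (h, 0, J m_z^(p-1)); the signs, normalization, and parameter values are illustrative conventions, not taken from the paper:

```python
import numpy as np

def mean_field_rhs(m, p=2, h=0.5, J=1.0):
    # Assumed mean-field effective field for a p-spin model with a
    # transverse field h: the z-component grows as m_z^(p-1).
    B = np.array([h, 0.0, J * m[2] ** (p - 1)])
    return np.cross(m, B)

def rk4_step(m, dt, **kw):
    # Classic fourth-order Runge-Kutta step; the cross-product flow
    # conserves |m| up to integration error.
    k1 = mean_field_rhs(m, **kw)
    k2 = mean_field_rhs(m + 0.5 * dt * k1, **kw)
    k3 = mean_field_rhs(m + 0.5 * dt * k2, **kw)
    k4 = mean_field_rhs(m + dt * k3, **kw)
    return m + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

m = np.array([1.0, 0.0, 1e-3])  # start near the x-polarized point
for _ in range(1000):
    m = rk4_step(m, 0.01, p=3)
print(m, np.linalg.norm(m))  # the norm stays close to 1 under the flow
```

Sweeping p and h in a sketch like this is one way to visualize the mean-field criticality the abstract refers to; the paper's contribution is realizing such dynamics physically via weak collective-spin measurements and conditional rotations.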