754 research outputs found

Kullback-Leibler Proximal Variational Inference

Author: Baqué Pierre Bruno
Fleuret François
Fua Pascal
Khan Mohammad Emtiyaz
Publication date: 20/12/2015

We propose a new variational inference method based on a proximal framework that uses the Kullback-Leibler (KL) divergence as the proximal term. We make two contributions towards exploiting the geometry and structure of the variational bound. Firstly, we propose a KL proximal-point algorithm and show its equivalence to variational inference with natural gradients (e.g., stochastic variational inference). Secondly, we use the proximal framework to derive efficient variational algorithms for non-conjugate models. We propose a splitting procedure to separate non-conjugate terms from conjugate ones. We linearize the non-conjugate terms to obtain subproblems that admit a closed-form solution. Overall, our approach converts inference in a non-conjugate model to subproblems that involve inference in well-known conjugate models. We show that our method is applicable to a wide variety of models and can result in computationally efficient algorithms. Applications to real-world datasets show comparable performance to existing methods.
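To make the idea of a KL proximal-point step concrete, here is a minimal sketch on a toy problem. It is not the algorithm from the paper: it assumes a categorical variational family over three states and a hypothetical unnormalized target `log_p_tilde`, both chosen only for illustration. In this toy case the step that maximizes ELBO(q) - (1/beta) KL(q || q_prev) has a closed form, a normalized geometric mixture of the target and the previous iterate.

```python
import math


def kl_prox_step(q_prev, log_p_tilde, beta):
    """One KL proximal-point step for a categorical q (illustrative toy only).

    Solves argmax_q  sum_i q_i (log_p_tilde_i - log q_i) - (1/beta) KL(q || q_prev),
    whose solution is q_i proportional to p_tilde_i^w * q_prev_i^(1 - w),
    with w = beta / (1 + beta).
    """
    w = beta / (1.0 + beta)
    logits = [w * lp + (1.0 - w) * math.log(qi)
              for lp, qi in zip(log_p_tilde, q_prev)]
    m = max(logits)  # subtract the max before exponentiating, for stability
    unnorm = [math.exp(v - m) for v in logits]
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Hypothetical unnormalized target; its normalized form is (0.1, 0.2, 0.7).
log_p_tilde = [math.log(v) for v in (1.0, 2.0, 7.0)]

q = [1.0 / 3.0] * 3  # start from the uniform distribution
for _ in range(50):
    q = kl_prox_step(q, log_p_tilde, beta=1.0)
```

With `beta=1.0` each step halves the remaining gap in log space, so the iterates approach the normalized target; a smaller `beta` keeps each iterate closer to the previous one.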

Infoscience - École polytechnique fédérale de Lausanne

Kullback-Leibler Proximal Variational Inference

Author: Baqué Pierre
Fleuret Francois
Fua Pascal
Khan Emtiyaz
Publication date: 19/12/2015

Infoscience - École polytechnique fédérale de Lausanne

Gradient Flows in Filtering and Fisher-Rao Geometry

Author: atkinson
bauschke
kullback
kushner
magnus
mahalanobis
rao
risken
skovgaard
stratonovich
villani
Publication date: 29/10/2017

Uncertainty propagation and filtering can be interpreted as gradient flows with respect to suitable metrics in the infinite-dimensional manifold of probability density functions. Such a viewpoint has been put forth in recent literature, and a systematic way to formulate and solve the same for linear Gaussian systems has appeared in our previous work, where the gradient flows were realized via proximal operators with respect to the Wasserstein metric arising in optimal mass transport. In this paper, we derive the evolution equations as proximal operators with respect to the Fisher-Rao metric arising in information geometry. We develop the linear Gaussian case in detail and show that a template two-step optimization procedure proposed earlier by the authors still applies. Our objective is to provide new geometric interpretations of known equations in filtering, and to clarify the implication of different choices of metric.

arXiv.org e-Print Archive

Crossref

On the Minimization of Convex Functionals of Probability Distributions Under Band Constraints

Author: Fauss Michael
Zoubir Abdelhak M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

The problem of minimizing convex functionals of probability distributions is solved under the assumption that the density of every distribution is bounded from above and below. A system of sufficient and necessary first-order optimality conditions as well as a bound on the optimality gap of feasible candidate solutions are derived. Based on these results, two numerical algorithms are proposed that iteratively solve the system of optimality conditions on a grid of discrete points. Both algorithms use a block coordinate descent strategy and terminate once the optimality gap falls below the desired tolerance. While the first algorithm is conceptually simpler and more efficient, it is not guaranteed to converge for objective functions that are not strictly convex. This shortcoming is overcome in the second algorithm, which uses an additional outer proximal iteration and which is proven to converge under mild assumptions. Two examples are given to demonstrate the theoretical usefulness of the optimality conditions as well as the high efficiency and accuracy of the proposed numerical algorithms. Comment: 13 pages, 5 figures, 2 tables, published in the IEEE Transactions on Signal Processing. In previous versions, the example in Section VI.B contained some mistakes and inaccuracies, which have been fixed in this version.

arXiv.org e-Print Archive

TUbiblio

Proximity Operators of Discrete Information Divergences

Author: Chierchia Giovanni
Gheche Mireille El
Pesquet Jean-Christophe
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/04/2017

Information divergences allow one to assess how close two distributions are to each other. Among the large panel of available measures, special attention has been paid to convex $\varphi$-divergences, such as Kullback-Leibler, Jeffreys-Kullback, Hellinger, Chi-Square, Renyi, and $I_{\alpha}$ divergences. While $\varphi$-divergences have been extensively studied in convex analysis, their use in optimization problems often remains challenging. In this regard, one of the main shortcomings of existing methods is that the minimization of $\varphi$-divergences is usually performed with respect to one of their arguments, possibly within alternating optimization techniques. In this paper, we overcome this limitation by deriving new closed-form expressions for the proximity operator of such two-variable functions. This makes it possible to employ standard proximal methods for efficiently solving a wide range of convex optimization problems involving $\varphi$-divergences. In addition, we show that these proximity operators are useful to compute the epigraphical projection of several functions of practical interest. The proposed proximal tools are numerically validated in the context of optimal query execution within database management systems, where the problem of selectivity estimation plays a central role. Experiments are carried out on small to large scale scenarios.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

HAL-Rennes 1

Analysis of Langevin Monte Carlo via convex optimization

Author: Durmus Alain
Majewski Szymon
Miasojedow Błażej
Publication date: 28/03/2018

In this paper, we provide new insights on the Unadjusted Langevin Algorithm. We show that this method can be formulated as a first order optimization algorithm of an objective functional defined on the Wasserstein space of order $2$. Using this interpretation and techniques borrowed from convex optimization, we give a non-asymptotic analysis of this method to sample from a log-concave smooth target distribution on $\mathbb{R}^d$. Based on this interpretation, we propose two new methods for sampling from a non-smooth target distribution, which we analyze as well. Besides, these new algorithms are natural extensions of the Stochastic Gradient Langevin Dynamics (SGLD) algorithm, which is a popular extension of the Unadjusted Langevin Algorithm. Similar to SGLD, they only rely on approximations of the gradient of the target log density and can be used for large-scale Bayesian inference.
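For readers unfamiliar with the Unadjusted Langevin Algorithm, the following is a minimal sketch of its basic update, assuming a target density proportional to exp(-U(x)) with a known gradient of U. The function names, step size, and the standard Gaussian target are illustrative assumptions, not taken from the paper, and the new non-smooth variants the abstract mentions are not shown.

```python
import math
import random


def ula_samples(grad_U, x0, step, n_steps, seed=0):
    """Unadjusted Langevin Algorithm for a target proportional to exp(-U).

    Each iteration takes a gradient step on U and adds Gaussian noise with
    variance 2 * step per coordinate. There is no accept/reject correction,
    so the chain is biased for any fixed step size.
    """
    rng = random.Random(seed)
    x = list(x0)
    out = []
    noise_scale = math.sqrt(2.0 * step)
    for _ in range(n_steps):
        g = grad_U(x)
        x = [xi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
        out.append(x)
    return out


def grad_U_std_normal(x):
    # U(x) = ||x||^2 / 2, so the target is a standard Gaussian
    # (smooth and log-concave); chosen here only as an example.
    return list(x)


samples = ula_samples(grad_U_std_normal, x0=[3.0, -3.0], step=0.05, n_steps=20000)
kept = samples[2000:]  # discard an initial burn-in segment
mean0 = sum(s[0] for s in kept) / len(kept)
var0 = sum((s[0] - mean0) ** 2 for s in kept) / len(kept)
```

For this Gaussian example the chain's stationary variance is 1 / (1 - step / 2) rather than exactly 1, which illustrates the step-size bias that a non-asymptotic analysis has to account for.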

arXiv.org e-Print Archive

HAL-Polytechnique