
A Geometric Variational Approach to Bayesian Inference

Author: Abhijoy Saha
Karthik Bharath
Sebastian Kurtek
Publication date: 27/03/2019

We propose a novel Riemannian geometric framework for variational inference in Bayesian models, based on the nonparametric Fisher-Rao metric on the manifold of probability density functions. Under the square-root density representation, the manifold can be identified with the positive orthant of the unit hypersphere in L2, and the Fisher-Rao metric reduces to the standard L2 metric. Exploiting this Riemannian structure, we formulate the task of approximating the posterior distribution as a variational problem on the hypersphere based on the alpha-divergence. This yields a tighter lower bound on the marginal distribution than approaches based on the Kullback-Leibler divergence, together with a corresponding upper bound that those approaches do not provide. We propose a novel gradient-based algorithm for the variational problem, based on Fréchet derivative operators motivated by the geometry of the Hilbert sphere, and examine its properties. Through simulations and real-data applications, we demonstrate the utility of the proposed geometric framework and algorithm on several Bayesian models.
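
The square-root density representation mentioned in the abstract can be illustrated numerically. The sketch below is not the authors' algorithm; it only assumes densities sampled on a uniform grid with spacing `dx` (the function names and the grid discretization are hypothetical choices for illustration). Mapping a density p to psi = sqrt(p) places it on the unit sphere in L2, where the geodesic distance between two points is the arc length given by the arccosine of their inner product.

```python
import math

def sqrt_density(p, dx):
    """Map density values on a uniform grid to psi = sqrt(p) with unit L2 norm."""
    psi = [math.sqrt(v) for v in p]
    # Renormalize so that the discretized integral of psi^2 equals 1.
    norm = math.sqrt(sum(s * s for s in psi) * dx)
    return [s / norm for s in psi]

def fisher_rao_distance(p, q, dx):
    """Arc-length distance on the unit sphere between sqrt(p) and sqrt(q)."""
    psi_p = sqrt_density(p, dx)
    psi_q = sqrt_density(q, dx)
    inner = sum(a * b for a, b in zip(psi_p, psi_q)) * dx
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, inner)))

# Two densities with disjoint support are orthogonal on the sphere.
d = fisher_rao_distance([1, 1, 0, 0], [0, 0, 1, 1], dx=0.5)
print(d)
```

Because densities are nonnegative, their square roots lie in the positive orthant, so this distance never exceeds pi/2.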

arXiv.org e-Print Archive

Crossref

Repository@Nottingham

FigShare

Diffusion Approximations for Online Principal Component Estimation and Global Convergence

Author: Li Chris Junchi
Liu Han
Wang Mengdi
Zhang Tong
Publication date: 01/01/2017

In this paper, we use diffusion approximation tools to study the dynamics of Oja's iteration, an online stochastic gradient descent method for principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and has low time and space complexity. We show that Oja's iteration for the top eigenvector generates a continuous-state, discrete-time Markov chain over the unit sphere. We characterize Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples.

Comment: Appeared in NIPS 201
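
For readers unfamiliar with Oja's iteration, a minimal sketch of the textbook update for the top eigenvector follows. It is an illustration under stated assumptions, not code from the paper: the function name, the fixed step size, and the synthetic two-dimensional data are all hypothetical. Each step moves the estimate along the current sample, weighted by the sample's projection onto the estimate, and then renormalizes so the iterate stays on the unit sphere.

```python
import math
import random

def _normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def oja_top_component(samples, step_size, w0):
    """Run Oja's iteration over a stream of samples, starting from w0."""
    w = _normalize(list(w0))
    for x in samples:
        proj = sum(xi * wi for xi, wi in zip(x, w))  # x^T w
        # Stochastic gradient step: w + step_size * x * (x^T w).
        w = [wi + step_size * proj * xi for wi, xi in zip(w, x)]
        w = _normalize(w)  # project back onto the unit sphere
    return w

# Synthetic stream (assumed for illustration): variance 9 along the first
# axis and 0.25 along the second, so the top eigenvector is (1, 0) up to sign.
rng = random.Random(0)
stream = [(3.0 * rng.gauss(0, 1), 0.5 * rng.gauss(0, 1)) for _ in range(5000)]
w = oja_top_component(stream, step_size=0.01, w0=[0.6, 0.8])
print(w)
```

Only the current estimate is stored, which is the low memory footprint the abstract refers to; each update costs time linear in the dimension.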

arXiv.org e-Print Archive

Princeton University Open Access Repository

From Proximal Point Method to Nesterov's Acceleration

Author: Ahn Kwangjun
Publication date: 03/06/2020

The proximal point method (PPM) is a fundamental method in optimization that is often used as a building block for fast optimization algorithms. In this work, building on recent work by Defazio (2019), we provide a complete understanding of Nesterov's accelerated gradient method (AGM) by establishing quantitative and analytical connections between PPM and AGM. The main observation in this paper is that AGM is in fact equal to a simple approximation of PPM, which results in an elementary derivation of the mysterious updates of AGM as well as its step sizes. This connection also leads to a conceptually simple analysis of AGM based on the standard analysis of PPM. This view naturally extends to the strongly convex case and also motivates other accelerated methods for practically relevant settings.

Comment: 14 pages; Section 4 updated; Remark 5 added; comments would be appreciated
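
As background for the two methods named in the abstract, the sketch below runs the standard textbook forms of PPM and AGM side by side. It does not reproduce the paper's derivation of AGM from PPM. The test problem is an assumption chosen so the proximal step has a closed form: a diagonal quadratic f(x) = 0.5 * sum(a_i * x_i^2) - sum(b_i * x_i) with minimizer b_i / a_i. Function names and parameters are hypothetical.

```python
def ppm_quadratic(a, b, x0, lam, num_steps):
    """Proximal point method on a diagonal quadratic.

    Each step solves argmin_x f(x) + ||x - x_k||^2 / (2 * lam), which for
    this f gives x_i = (x_k_i + lam * b_i) / (1 + lam * a_i).
    """
    x = list(x0)
    for _ in range(num_steps):
        x = [(xi + lam * bi) / (1.0 + lam * ai) for xi, ai, bi in zip(x, a, b)]
    return x

def agm_quadratic(a, b, x0, num_steps):
    """Nesterov's accelerated gradient method with step 1/L and
    momentum (k - 1) / (k + 2), a common textbook schedule."""
    smoothness = max(a)  # L: largest curvature of the quadratic
    x_prev = list(x0)
    x = list(x0)
    for k in range(1, num_steps + 1):
        momentum = (k - 1) / (k + 2)
        y = [xi + momentum * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        grad = [ai * yi - bi for ai, yi, bi in zip(a, y, b)]
        x_prev = x
        x = [yi - gi / smoothness for yi, gi in zip(y, grad)]
    return x

a, b = [1.0, 10.0], [1.0, 10.0]  # minimizer is (1, 1)
print(ppm_quadratic(a, b, [0.0, 0.0], lam=1.0, num_steps=50))
print(agm_quadratic(a, b, [0.0, 0.0], num_steps=500))
```

PPM needs the exact solution of a subproblem at every step, which is rarely available outside simple cases like this one, whereas AGM uses only gradient evaluations.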

arXiv.org e-Print Archive