A Geometric Variational Approach to Bayesian Inference
We propose a novel Riemannian geometric framework for variational inference
in Bayesian models based on the nonparametric Fisher-Rao metric on the manifold
of probability density functions. Under the square-root density representation,
the manifold can be identified with the positive orthant of the unit
hypersphere in L2, and the Fisher-Rao metric reduces to the standard L2 metric.
Exploiting such a Riemannian structure, we formulate the task of approximating
the posterior distribution as a variational problem on the hypersphere based on
the alpha-divergence. Compared with approaches based on the Kullback-Leibler
divergence, this provides a tighter lower bound on the marginal distribution,
along with a corresponding upper bound that those approaches cannot supply.
We propose a novel gradient-based algorithm for the variational problem based on Fréchet
derivative operators motivated by the geometry of the Hilbert sphere, and
examine its properties. Through simulations and real-data applications, we
demonstrate the utility of the proposed geometric framework and algorithm on
several Bayesian models.
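
The square-root representation described in this abstract can be checked numerically. The sketch below is illustrative only and is not the paper's algorithm: it discretizes two Gaussian densities on a grid, confirms that the square root of a density has unit L2 norm, and computes the arc length between two square-root densities on the sphere. The helper names (gaussian_pdf, l2_inner, sqrt_density, sphere_distance), the grid, and the choice of Gaussians are assumptions made for the example, and the distance uses the convention in which the Fisher-Rao metric coincides with the plain L2 metric.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def l2_inner(f, g, dx):
    """Riemann-sum approximation of the L2 inner product of two gridded functions."""
    return sum(a * b for a, b in zip(f, g)) * dx

def sqrt_density(p):
    """Map a gridded density to its square-root representation."""
    return [math.sqrt(v) for v in p]

def sphere_distance(p, q, dx):
    """Arc length between sqrt(p) and sqrt(q) on the unit sphere in L2."""
    inner = l2_inner(sqrt_density(p), sqrt_density(q), dx)
    inner = max(-1.0, min(1.0, inner))  # guard against rounding outside [-1, 1]
    return math.acos(inner)

# Hypothetical setup: two unit-variance Gaussians on a grid over [-10, 10].
dx = 0.01
grid = [-10.0 + i * dx for i in range(2001)]
p = [gaussian_pdf(x, 0.0, 1.0) for x in grid]
q = [gaussian_pdf(x, 1.0, 1.0) for x in grid]

psi_p = sqrt_density(p)
norm_sq = l2_inner(psi_p, psi_p, dx)  # squared L2 norm of sqrt(p), close to 1
dist = sphere_distance(p, q, dx)      # geodesic distance on the sphere
print(norm_sq, dist)
```

For two Gaussians with equal variance, the inner product of the square-root densities has the closed form exp(-(mean difference)^2 / (8 variance)), which gives a simple check on the grid approximation.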
Diffusion Approximations for Online Principal Component Estimation and Global Convergence
In this paper, we use diffusion approximation tools to study the dynamics of
Oja's iteration, an online stochastic gradient descent method for principal
component analysis. Oja's iteration maintains a running estimate of the true
principal component from streaming data and enjoys lower time and space
complexity. We show that Oja's iteration for
the top eigenvector generates a continuous-state discrete-time Markov chain
over the unit sphere. We characterize Oja's iteration in three phases using
diffusion approximation and weak convergence tools. Our three-phase analysis
further provides a finite-sample error bound for the running estimate, which
matches the minimax information lower bound for principal component analysis
under the additional assumption of bounded samples.
Comment: Appeared in NIPS 201
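
For readers unfamiliar with Oja's iteration, a minimal sketch follows. It is not taken from the paper: the data model (independent bounded coordinates with different scales), the constant step size eta, the seed, and the function names normalize, oja_step, and run_oja are all assumptions chosen for illustration. Each step adds eta * x * (x^T w) to the current estimate and renormalizes, so the iterate stays on the unit sphere, as the abstract describes.

```python
import math
import random

def normalize(v):
    """Rescale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def oja_step(w, x, eta):
    """One Oja update: stochastic gradient step using x x^T, then projection to the sphere."""
    proj = sum(wi * xi for wi, xi in zip(w, x))  # x^T w
    return normalize([wi + eta * proj * xi for wi, xi in zip(w, x)])

def run_oja(scales, eta, n_steps, seed=0):
    """Stream bounded samples and return the running estimate of the top eigenvector."""
    rng = random.Random(seed)
    w = normalize([rng.gauss(0.0, 1.0) for _ in scales])  # random start on the sphere
    for _ in range(n_steps):
        # Bounded sample with independent coordinates; coordinate i has variance scales[i]^2 / 3.
        x = [s * rng.uniform(-1.0, 1.0) for s in scales]
        w = oja_step(w, x, eta)
    return w

# With these scales the covariance is diagonal and the top eigenvector is the first axis.
w = run_oja(scales=[2.0, 1.0, 0.5], eta=0.01, n_steps=5000)
alignment = abs(w[0])  # |cosine| of the angle to the true principal component
print(alignment)
```

Only the current estimate w and one sample are held in memory at a time, which is the streaming property the abstract refers to.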
From Proximal Point Method to Nesterov's Acceleration
The proximal point method (PPM) is a fundamental method in optimization that
is often used as a building block for fast optimization algorithms. In this
work, building on recent work by Defazio (2019), we provide a complete
understanding of Nesterov's accelerated gradient method (AGM) by establishing
quantitative and analytical connections between PPM and AGM. The main
observation in this paper is that AGM is in fact equal to a simple
approximation of PPM, which results in an elementary derivation of the
mysterious updates of AGM as well as its step sizes. This connection also leads
to a conceptually simple analysis of AGM based on the standard analysis of PPM.
This view naturally extends to the strongly convex case and also motivates
other accelerated methods for practically relevant settings.
Comment: 14 pages; Section 4 updated; Remark 5 added; comments would be appreciated
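
To make the objects in this abstract concrete, the sketch below runs the proximal point method, plain gradient descent, and a common textbook form of Nesterov's accelerated gradient method on a small ill-conditioned quadratic. It does not reproduce the paper's derivation of AGM from PPM. The quadratic, the step sizes, the momentum schedule t / (t + 3), and the names f, grad, ppm_step, gd_step, and agm are assumptions made for illustration. The only link to PPM shown here is the elementary one: a gradient step solves the proximal subproblem with f replaced by its linearization.

```python
def f(x, eigs):
    """Quadratic f(x) = 0.5 * sum_i eigs[i] * x[i]^2, minimized at the origin."""
    return 0.5 * sum(l * xi * xi for l, xi in zip(eigs, x))

def grad(x, eigs):
    """Gradient of the quadratic above."""
    return [l * xi for l, xi in zip(eigs, x)]

def ppm_step(x, eigs, eta):
    """Exact proximal step: argmin_u f(u) + ||u - x||^2 / (2 * eta), in closed form for this f."""
    return [xi / (1.0 + eta * l) for l, xi in zip(eigs, x)]

def gd_step(x, eigs, eta):
    """The same subproblem with f replaced by its linearization at x."""
    return [xi - eta * gi for xi, gi in zip(x, grad(x, eigs))]

def agm(x0, eigs, smoothness, n_steps):
    """A common textbook form of Nesterov's method with momentum t / (t + 3)."""
    x, y = list(x0), list(x0)
    for t in range(n_steps):
        x_next = gd_step(y, eigs, 1.0 / smoothness)  # gradient step from the extrapolated point
        beta = t / (t + 3.0)
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_next, x)]  # momentum extrapolation
        x = x_next
    return x

# Hypothetical test problem: condition number 1000, smoothness constant 1.
eigs, x0, n_steps = [1.0, 0.001], [1.0, 1.0], 200
x_gd, x_ppm = list(x0), list(x0)
for _ in range(n_steps):
    x_gd = gd_step(x_gd, eigs, 1.0)
    x_ppm = ppm_step(x_ppm, eigs, 1.0)
x_agm = agm(x0, eigs, smoothness=1.0, n_steps=n_steps)
f_gd, f_ppm, f_agm = f(x_gd, eigs), f(x_ppm, eigs), f(x_agm, eigs)
print(f_gd, f_ppm, f_agm)
```

On this problem, with the same step size, the accelerated iterate reaches a much smaller objective value than either unaccelerated method after the same number of steps.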