On the Global Convergence of Continuous-Time Stochastic Heavy-Ball Method for Nonconvex Optimization

Authors: Hu Wenqing, Li Chris Junchi, Zhou Xiang
Publication date: 18/10/2019

We study the convergence behavior of the stochastic heavy-ball method with a small stepsize. Under a change of time scale, we approximate the discrete method by a stochastic differential equation that models small random perturbations of a coupled system of nonlinear oscillators. We rigorously show that the perturbed system converges to a local minimum in logarithmic time. This indicates that the diffusion process approximating the stochastic heavy-ball method needs (up to a logarithmic factor) only time linear in the square root of the inverse stepsize to escape from all saddle points. This result may suggest fast convergence of its discrete-time counterpart. Our theoretical results are validated by numerical experiments.

Comment: accepted at IEEE International Conference on Big Data in 201
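The discrete method that the abstract refers to can be illustrated with a minimal sketch. This is not the paper's continuous-time analysis or its experiments: the toy objective f(x, y) = (x^2 - 1)^2 / 4 + y^2 / 2 (one saddle at the origin, minima at (±1, 0)), the Gaussian gradient noise, and the values of `eta`, `mu`, and `sigma` are all assumptions chosen for illustration.

```python
import random


def grad(x, y):
    """Gradient of the assumed toy objective f(x, y) = (x^2 - 1)^2 / 4 + y^2 / 2."""
    return x * (x * x - 1.0), y


def stochastic_heavy_ball(steps=5000, eta=0.01, mu=0.9, sigma=0.1, seed=0):
    """Run heavy-ball (momentum) updates with noisy gradients from the saddle point.

    eta is the stepsize, mu the momentum coefficient, sigma the noise scale;
    all three are illustrative values, not taken from the paper.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0    # start exactly at the saddle, where the true gradient vanishes
    vx, vy = 0.0, 0.0  # velocity (momentum) variables
    for _ in range(steps):
        gx, gy = grad(x, y)
        # Stochastic gradient: exact gradient plus isotropic Gaussian noise.
        gx += sigma * rng.gauss(0.0, 1.0)
        gy += sigma * rng.gauss(0.0, 1.0)
        # Heavy-ball update: velocity accumulates past gradients.
        vx = mu * vx - eta * gx
        vy = mu * vy - eta * gy
        x += vx
        y += vy
    return x, y


if __name__ == "__main__":
    print(stochastic_heavy_ball())
```

Without noise the iterate would stay at the saddle forever; in this sketch the random perturbation pushes it along the unstable direction, after which it settles near one of the two minima.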

arXiv.org e-Print Archive

Crossref

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

On the fast convergence of random perturbations of the gradient flow

Authors: Hu Wenqing, Li Chris Junchi, Yang Jiaojiao
Publication date: 27/04/2020

In this work we consider small random perturbations (of multiplicative noise type) of the gradient flow. We prove that under mild conditions, when the potential function is a Morse function satisfying an additional strong saddle condition, the perturbed gradient flow converges to a neighborhood of the local minimizers in $O(\ln(\varepsilon^{-1}))$ time on average, where $\varepsilon$ is the scale of the random perturbation. Under a change of time scale, this indicates that the diffusion process approximating the stochastic gradient method needs (up to a logarithmic factor) only time linear in the inverse stepsize to escape all saddle points. This can be regarded as a manifestation of the fast convergence of the discrete-time stochastic gradient method, which is used heavily in modern statistical machine learning.

Comment: Revise and Resubmit at Asymptotic Analysi
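The logarithmic dependence on the perturbation scale can be explored numerically with a rough Euler-Maruyama sketch. Several things here are assumptions rather than the paper's setting: the potential F(x, y) = (x^2 - 1)^2 / 4 + y^2 / 2, the use of additive (constant-coefficient) noise as a simplification of the multiplicative noise studied in the paper, and the hypothetical helper `escape_time` with its parameters `dt`, `t_max`, and `radius`.

```python
import math
import random


def escape_time(eps, dt=0.01, t_max=100.0, radius=0.1, seed=0):
    """Simulate dX = -grad F(X) dt + eps dW from the saddle at the origin.

    Returns the first time the path is within `radius` of a minimizer (+-1, 0),
    or math.inf if that does not happen before t_max.
    Assumption: additive noise, a simplification of the multiplicative case.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    t = 0.0
    sqrt_dt = math.sqrt(dt)
    while t < t_max:
        if math.hypot(abs(x) - 1.0, y) < radius:
            return t
        # Euler-Maruyama step: drift -grad F plus Brownian increment of scale eps.
        x += -x * (x * x - 1.0) * dt + eps * sqrt_dt * rng.gauss(0.0, 1.0)
        y += -y * dt + eps * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
    return math.inf


if __name__ == "__main__":
    for eps in (1e-2, 1e-4):
        times = [escape_time(eps, seed=s) for s in range(5)]
        print(eps, sum(times) / len(times))
```

In this toy setting, shrinking `eps` by a factor of 100 should lengthen the hitting time by roughly an additive amount rather than a multiplicative one, which is the qualitative behavior an $O(\ln(\varepsilon^{-1}))$ bound describes.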

arXiv.org e-Print Archive

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Diffusion Approximations for Online Principal Component Estimation and Global Convergence

Authors: Li Chris Junchi, Liu Han, Wang Mengdi, Zhang Tong
Publication date: 01/01/2017

In this paper, we use diffusion approximation tools to study the dynamics of Oja's iteration, an online stochastic gradient descent method for principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and has low time and space complexity. We show that Oja's iteration for the top eigenvector generates a continuous-state, discrete-time Markov chain over the unit sphere. We characterize Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples.

Comment: Appeared in NIPS 201
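A minimal sketch of Oja's iteration for the top eigenvector is given below, in its common textbook form: a stochastic gradient step followed by renormalization onto the unit sphere. The synthetic data stream (independent Gaussian coordinates with standard deviations `stds`), the stepsize `eta`, and the function name `oja_top_eigenvector` are assumptions for illustration and are not taken from the paper. Gaussian samples are also unbounded, so this toy stream does not satisfy the bounded-sample assumption mentioned in the abstract.

```python
import math
import random


def oja_top_eigenvector(stds=(3.0, 1.0, 0.5), steps=5000, eta=0.01, seed=0):
    """Estimate the top principal component from a stream of samples.

    Assumed data model: each coordinate i is an independent N(0, stds[i]^2),
    so the covariance is diagonal and the top eigenvector is the first axis.
    """
    rng = random.Random(seed)
    d = len(stds)
    # Random starting point on the unit sphere.
    w = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]
    for _ in range(steps):
        x = [s * rng.gauss(0.0, 1.0) for s in stds]   # one streaming sample
        proj = sum(xi * wi for xi, wi in zip(x, w))   # x^T w
        w = [wi + eta * proj * xi for wi, xi in zip(w, x)]  # gradient step
        norm = math.sqrt(sum(c * c for c in w))
        w = [c / norm for c in w]                     # project back to the sphere
    return w


if __name__ == "__main__":
    print(oja_top_eigenvector())
```

Each update touches only the current sample and the current estimate, so memory use is linear in the dimension and no covariance matrix is ever formed. Because every iterate is renormalized, the sequence of estimates lives on the unit sphere, matching the Markov chain described in the abstract.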

arXiv.org e-Print Archive

Princeton University Open Access Repository