319 research outputs found

Convergence of the Forward-Backward Algorithm: Beyond the Worst Case with the Help of Geometry

Authors: Garrigos Guillaume, Rosasco Lorenzo, Villa Silvia
Publication date: 01/08/2017

We provide a comprehensive study of the convergence of the forward-backward algorithm under suitable geometric conditions leading to fast rates. We present several new results and collect in a unified view a variety of results scattered in the literature, often providing simplified proofs. Novel contributions include the analysis of infinite-dimensional convex minimization problems, allowing the case where minimizers might not exist. Further, we analyze the relation between different geometric conditions, and discuss novel connections with a priori conditions in linear inverse problems, including source conditions, restricted isometry properties and partial smoothness.
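As a rough illustration of the forward-backward (proximal gradient) iteration this abstract studies, the sketch below alternates a gradient step on a smooth term with a proximal step on a nonsmooth term. The test problem (a separable least-squares term plus an l1 penalty), the step size `gamma`, and all function names are assumptions chosen for illustration; none of them come from the paper.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.| applied to a scalar."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def forward_backward(grad_f, prox_g, x0, gamma, n_iters):
    """Iterate x <- prox_{gamma*g}(x - gamma * grad_f(x))."""
    x = list(x0)
    for _ in range(n_iters):
        g = grad_f(x)
        # Forward (explicit gradient) step on the smooth part.
        y = [xi - gamma * gi for xi, gi in zip(x, g)]
        # Backward (proximal) step on the nonsmooth part.
        x = prox_g(y, gamma)
    return x


# Hypothetical test problem: minimize 0.5 * ||x - b||^2 + lam * ||x||_1.
b = [3.0, -0.5, 1.2]
lam = 1.0
grad_f = lambda x: [xi - bi for xi, bi in zip(x, b)]
prox_g = lambda y, gamma: [soft_threshold(yi, gamma * lam) for yi in y]

x_star = forward_backward(grad_f, prox_g, [0.0, 0.0, 0.0], gamma=0.5, n_iters=60)
```

For this particular problem the minimizer is known in closed form (soft-thresholding of `b` at level `lam`), which makes it convenient for checking the iteration; the geometric conditions discussed in the abstract concern how fast such iterations converge on more general problems.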

arXiv.org e-Print Archive

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Authors: Li Jian, Li Zhize
Publication date: 01/12/2018

We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is the sum of a differentiable (possibly nonconvex) component and a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. Our main contribution lies in the analysis of ProxSVRG+. It recovers several existing convergence results and improves/generalizes them (in terms of the number of stochastic gradient oracle calls and proximal oracle calls). In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al., 2017] for the smooth nonconvex case. ProxSVRG+ is also more straightforward than SCSG and yields a simpler analysis. Moreover, ProxSVRG+ outperforms the deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem proposed in [Reddi et al., 2016b]. Also, ProxSVRG+ uses far fewer proximal oracle calls than ProxSVRG [Reddi et al., 2016b]. Moreover, for nonconvex functions satisfying the Polyak-Łojasiewicz (PL) condition, we prove that ProxSVRG+ achieves a global linear convergence rate without restart, unlike ProxSVRG. Thus, it can automatically switch to the faster linear convergence in some regions as long as the objective function satisfies the PL condition locally in these regions. ProxSVRG+ also improves on ProxGD and ProxSVRG/SAGA, and generalizes the results of SCSG in this case. Finally, we conduct several experiments, and the experimental results are consistent with the theoretical results.
Comment: 32nd Conference on Neural Information Processing Systems (NeurIPS 2018)

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Convergence Rates of Stochastic Zeroth-order Gradient Descent for Łojasiewicz Functions

Authors: Feng Yasong, Wang Tianyu
Publication date: 19/04/2023

We prove convergence rates of Stochastic Zeroth-order Gradient Descent (SZGD) algorithms for Łojasiewicz functions. The SZGD algorithm iterates as $\mathbf{x}_{t+1} = \mathbf{x}_t - \eta_t \widehat{\nabla} f(\mathbf{x}_t)$, $t = 0, 1, 2, 3, \cdots$, where $f$ is the objective function that satisfies the Łojasiewicz inequality with Łojasiewicz exponent $\theta$, $\eta_t$ is the step size (learning rate), and $\widehat{\nabla} f(\mathbf{x}_t)$ is the approximate gradient estimated using zeroth-order information only. Our results show that $\{ f(\mathbf{x}_t) - f(\mathbf{x}_\infty) \}_{t \in \mathbb{N}}$ can converge faster than $\{ \| \mathbf{x}_t - \mathbf{x}_\infty \| \}_{t \in \mathbb{N}}$, regardless of whether the objective $f$ is smooth or nonsmooth.
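The SZGD update stated in this abstract can be sketched in a few lines. The abstract does not say how the approximate gradient is built, so the two-point random-direction estimator below is an assumption, as are the constant step size `eta`, the smoothing radius `delta`, the quadratic test objective, and every function name.

```python
import math
import random


def zeroth_order_grad(f, x, delta, rng):
    """Estimate the gradient of f at x from two function evaluations.

    Assumed estimator: central difference along a random unit direction u,
    scaled by the dimension d so that it is unbiased for quadratic f.
    """
    d = len(x)
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    slope = (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [d * slope * ui for ui in u]


def szgd(f, x0, eta, n_iters, delta=1e-4, seed=0):
    """Run x_{t+1} = x_t - eta * g_hat(x_t) with a constant step size."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_iters):
        g_hat = zeroth_order_grad(f, x, delta, rng)
        x = [xi - eta * gi for xi, gi in zip(x, g_hat)]
    return x


# Hypothetical objective: f(x) = ||x||^2, minimized at the origin.
f = lambda x: sum(xi * xi for xi in x)
x0 = [1.0, -2.0, 0.5]
x_final = szgd(f, x0, eta=0.1, n_iters=200)
```

Only function values of `f` are queried, never its derivative, which is what "zeroth-order information" refers to. Tracking `f(x)` and the distance of `x` to the origin along the run would give the two sequences whose rates the abstract compares.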

arXiv.org e-Print Archive

Convergence rates for the heavy-ball continuous dynamics for non-convex optimization, under Polyak–Łojasiewicz condition

Authors: Apidopoulos V., Ginatta N., Villa S.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

Archivio istituzionale della ricerca - Università di Genova

Variance reduction techniques for stochastic proximal point algorithms

Authors: Apidopoulos Vassilis, Salzo Saverio, Traoré Cheik, Villa Silvia
Publication date: 18/08/2023

In the context of finite-sum minimization, variance reduction techniques are widely used to improve the performance of state-of-the-art stochastic gradient methods. Their practical impact is clear, as are their theoretical properties. Stochastic proximal point algorithms have been studied as an alternative to stochastic gradient algorithms since they are more stable with respect to the choice of the step size, but a proper variance-reduced version is missing. In this work, we propose the first study of variance reduction techniques for stochastic proximal point algorithms. We introduce a stochastic proximal version of SVRG, SAGA, and some of their variants for smooth and convex functions. We provide several convergence results for the iterates and the objective function values. In addition, under the Polyak-Łojasiewicz (PL) condition, we obtain linear convergence rates for the iterates and the function values. Our numerical experiments demonstrate the advantages of the proximal variance reduction methods over their gradient counterparts, especially regarding stability with respect to the choice of the step size.
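For reference, the PL condition mentioned in this abstract is usually stated as follows for a differentiable function $F$ with minimum value $F^\star$. The symbols $F$, $F^\star$ and $\mu$ are notation assumed here, not taken from the abstract.

```latex
\frac{1}{2}\,\|\nabla F(x)\|^2 \;\ge\; \mu\,\bigl(F(x) - F^\star\bigr)
\quad \text{for all } x, \qquad \mu > 0.
```

The inequality bounds the suboptimality gap by the squared gradient norm, which is the mechanism behind linear rates of the kind the abstract reports.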

arXiv.org e-Print Archive