
    ReSQueing Parallel and Private Stochastic Convex Optimization

    We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For an SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. Given $n$ samples of Lipschitz loss functions, prior works [BFTT19, BFGT20, AFKT21, KLL21] established that if $n \gtrsim d \epsilon_{\text{dp}}^{-2}$, $(\epsilon_{\text{dp}}, \delta)$-differential privacy is attained at no asymptotic cost to the SCO utility. However, these prior works all required a superlinear number of gradient queries. We close this gap for sufficiently large $n \gtrsim d^2 \epsilon_{\text{dp}}^{-3}$ by using ReSQue to design an algorithm with near-linear gradient query complexity in this regime.
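    To make the smoothing idea concrete, below is a minimal sketch of a reweighted (importance-sampled) stochastic gradient estimator for a Gaussian-smoothed objective: samples are drawn around a shared reference point and reweighted by a Gaussian density ratio. The function names, the smoothing radius `rho`, and the specific weighting are illustrative assumptions, not the paper's exact ReSQue construction.

```python
import numpy as np

def resque_style_gradient(grad_f, x, x_bar, rho, num_samples=64, rng=None):
    """Illustrative sketch (not the paper's exact estimator).

    Estimates the gradient of the Gaussian-smoothed objective
        f_rho(x) = E_{u ~ N(0, rho^2 I)}[ f(x + u) ]
    at a point x, while drawing all query points around a shared reference
    point x_bar and correcting the shift with importance (density-ratio)
    weights. `grad_f` is assumed to be an unbiased stochastic gradient
    oracle for f, taking and returning 1-D NumPy arrays.
    """
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    est = np.zeros(d)
    for _ in range(num_samples):
        z = x_bar + rho * rng.standard_normal(d)  # query point centered at x_bar
        # log of the ratio N(z; x, rho^2 I) / N(z; x_bar, rho^2 I)
        log_w = (np.sum((z - x_bar) ** 2) - np.sum((z - x) ** 2)) / (2.0 * rho ** 2)
        est += np.exp(log_w) * grad_f(z)
    return est / num_samples
```

    Because the queries depend only on the reference point `x_bar`, the same samples can in principle be reweighted to estimate smoothed gradients at many nearby points, which (as we read the abstract) is the kind of reuse that ball-oracle-acceleration schemes exploit for parallel and query efficiency.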

    From Averaging to Acceleration, There is Only a Step-size

    We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for non-strongly-convex problems may be reformulated as constant-parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate $O(1/n^2)$, where $n$ is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, extending our general convergence result, which suggests an alternative algorithm (i.e., with different step-sizes) that exhibits the good aspects of both averaging and acceleration.
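    As a rough illustration of the reformulation, the sketch below iterates a constant-parameter second-order recursion on a quadratic objective and exposes the 2x2 companion matrix whose eigenvalues govern stability along each eigen-direction. The particular choice of constants `(gamma, beta)` is an assumption for illustration, not the paper's exact mapping of averaging or acceleration onto these parameters.

```python
import numpy as np

def second_order_iteration(A, b, theta0, gamma, beta, num_iters=200):
    """Constant-parameter second-order recursion on the quadratic
        f(theta) = 0.5 * theta^T A theta - b^T theta,   grad f = A @ theta - b:
        theta_{n+1} = theta_n + beta * (theta_n - theta_{n-1}) - gamma * grad f(theta_n).
    Heavy ball, accelerated and averaged gradient descent correspond to
    particular (gamma, beta) choices; the values used here are illustrative.
    """
    prev, curr = theta0.copy(), theta0.copy()
    for _ in range(num_iters):
        grad = A @ curr - b
        prev, curr = curr, curr + beta * (curr - prev) - gamma * grad
    return curr

def companion_eigenvalues(mu, gamma, beta):
    """Eigenvalues of the 2x2 linear system driving one eigen-direction with
    curvature mu; a spectral radius below 1 is the stability regime."""
    T = np.array([[1.0 + beta - gamma * mu, -beta],
                  [1.0,                      0.0]])
    return np.linalg.eigvals(T)
```

    Along each eigen-direction of the quadratic, the iteration reduces to the 2x2 system above; requiring its spectral radius to stay below 1 for every curvature in the spectrum is the kind of stability condition the abstract relates to convergence.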