SVAG: Stochastic Variance Adjusted Gradient Descent and Biased Stochastic Gradients
We examine biased gradient updates in variance reduced stochastic gradient
methods. For this purpose we introduce SVAG, a SAG/SAGA-like method with
adjustable bias. SVAG is analyzed under smoothness assumptions and we provide
step-size conditions for convergence that match or improve on previously known
conditions for SAG and SAGA. The analysis highlights a difference in step-size
requirements between applying SVAG to cocoercive operators and applying it to
gradients of smooth functions, a difference not present in ordinary gradient
descent. This difference is verified with numerical experiments. A
variant of SVAG that adaptively selects the bias is presented and compared
numerically to SVAG on a set of classification problems. The adaptive SVAG
frequently performs among the best and always improves on the worst-case
performance of the non-adaptive variant.
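To illustrate the flavor of such a method, here is a minimal sketch of a SAG/SAGA-style update with an adjustable bias weight (the parameter name `theta` and the exact form of the estimate are illustrative assumptions, not taken from the paper): the weight interpolates between a SAG-like biased gradient estimate and a SAGA-like unbiased one.

```python
import numpy as np

def svag_like(grad_i, x0, n, theta, step=0.01, iters=30000, seed=0):
    """SAG/SAGA-style variance-reduced method with adjustable bias.

    theta = n -> SAGA-like (unbiased) estimate
    theta = 1 -> SAG-like (biased) estimate
    grad_i(x, i) returns the gradient of component i at x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    table = np.array([grad_i(x, i) for i in range(n)])  # stored gradients
    avg = table.mean(axis=0)
    for _ in range(iters):
        i = rng.integers(n)
        g_new = grad_i(x, i)
        # variance-reduced estimate: the innovation is weighted by theta / n
        est = avg + (theta / n) * (g_new - table[i])
        x -= step * est
        avg += (g_new - table[i]) / n  # keep the running mean consistent
        table[i] = g_new
    return x

# least-squares test problem: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2
rng = np.random.default_rng(1)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]
g = lambda x, i: (A[i] @ x - b[i]) * A[i]
x = svag_like(g, np.zeros(5), n=20, theta=20.0)
```

With `theta = n` the innovation term has full weight and the estimate is unbiased (SAGA-like); with `theta = 1` it is heavily damped and biased (SAG-like).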
Zeroth order optimization with orthogonal random directions
We propose and analyze a randomized zeroth-order approach based on approximating the exact gradient by finite differences computed in a set of orthogonal random directions that changes with each iteration. A number of previously proposed methods are recovered as special cases, including spherical smoothing, coordinate descent, and discretized gradient descent. Our main contribution is proving convergence guarantees as well as convergence rates under different parameter choices and assumptions. In particular, we consider convex objectives, but also possibly non-convex objectives satisfying the Polyak-Łojasiewicz (PL) condition. Theoretical results are complemented and illustrated by numerical experiments.
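A minimal sketch of the idea (function names, step sizes, and parameter choices are illustrative assumptions, not the paper's): draw orthonormal directions via a QR factorization of a Gaussian matrix, estimate the gradient by forward differences along those directions, and take a gradient-style step.

```python
import numpy as np

def zo_gradient(f, x, num_dirs, h=1e-5, rng=None):
    """Gradient estimate from finite differences along orthogonal random
    directions; the directions are redrawn on every call."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    # QR of a Gaussian matrix yields num_dirs orthonormal directions
    Q, _ = np.linalg.qr(rng.standard_normal((d, num_dirs)))
    fx = f(x)
    g = np.zeros(d)
    for k in range(num_dirs):
        u = Q[:, k]
        g += (f(x + h * u) - fx) / h * u  # directional derivative estimate
    # rescale so the estimate matches the gradient in expectation
    return (d / num_dirs) * g

def zo_minimize(f, x0, step=0.1, num_dirs=None, iters=300, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    m = x.size if num_dirs is None else num_dirs
    for _ in range(iters):
        x = x - step * zo_gradient(f, x, m, rng=rng)
    return x

# smooth convex test objective with known minimizer c
c = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
f = lambda x: np.sum((x - c) ** 2)
x = zo_minimize(f, np.zeros(5))
```

With `num_dirs` equal to the dimension the directions form a full orthonormal basis and the scheme reduces to a discretized gradient descent; smaller values trade accuracy per iteration for fewer function evaluations.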
On the convergence of stochastic forward-backward-forward algorithms with variance reduction in pseudo-monotone variational inequalities
We develop a new stochastic algorithm with variance reduction for solving pseudo-monotone stochastic variational inequalities. Our method builds on Tseng's forward-backward-forward algorithm, which is known in the deterministic literature to be a valuable alternative to Korpelevich's extragradient method when solving variational inequalities over a convex and closed set governed by pseudo-monotone and Lipschitz continuous operators. The main computational advantage of Tseng's algorithm is that it relies on only a single projection step and two independent queries of a stochastic oracle. Our algorithm incorporates a variance reduction mechanism and leads to almost-sure convergence to solutions of a merely pseudo-monotone stochastic variational inequality problem. To the best of our knowledge, this is the first stochastic algorithm achieving this with only a single projection per iteration.
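For reference, the deterministic skeleton of Tseng's forward-backward-forward iteration looks as follows (an illustrative sketch, not the paper's method: the stochastic, variance-reduced version replaces the exact operator evaluations with oracle calls, and the test operator below is an assumption chosen for its known solution):

```python
import numpy as np

def tseng_fbf(F, proj, x0, tau=0.2, iters=2000):
    """Tseng's forward-backward-forward method:
        y = P_C(x - tau * F(x));  x+ = y - tau * (F(y) - F(x)).
    One projection and two operator evaluations per iteration."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        Fx = F(x)
        y = proj(x - tau * Fx)            # single projection step
        x = y - tau * (F(y) - Fx)         # forward correction, no projection
    return x

# monotone (non-gradient) operator over the box [-1, 1]^2; solution x* = 0
M = np.array([[0.0, 1.0], [-1.0, 0.0]])  # rotation part, not a gradient field
F = lambda x: M @ x + x                  # strongly monotone perturbation
proj = lambda z: np.clip(z, -1.0, 1.0)   # projection onto the box
x = tseng_fbf(F, proj, np.array([1.0, -0.5]))
```

The second, projection-free correction step is what lets the method handle merely pseudo-monotone operators, where a plain projected step can fail.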
Model Consistency for Learning with Mirror-Stratifiable Regularizers
Low-complexity non-smooth convex regularizers are routinely used to impose
some structure (such as sparsity or low-rank) on the coefficients for linear
predictors in supervised learning. Model consistency then consists in selecting
the correct structure (for instance, support or rank) by regularized empirical
risk minimization.
It is known that model consistency holds under appropriate non-degeneracy
conditions. However, such conditions typically fail for highly correlated
designs, and regularization methods are then observed to select larger
models.
In this work, we provide the theoretical underpinning of this behavior using
the notion of mirror-stratifiable regularizers. This class of regularizers
encompasses the most well-known in the literature, including the ℓ1 or
trace norms. It brings into play a pair of primal-dual models, which in turn
allows one to locate the structure of the solution using a specific dual
certificate.
We also show how this analysis applies to optimal solutions of the
learning problem, as well as to the iterates computed by a certain class of
stochastic proximal-gradient algorithms.
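A minimal sketch of how a proximal-gradient method selects a model (here a support) through its prox step, using the ℓ1 regularizer with a deterministic gradient; the paper's stochastic variants replace the full gradient with an estimate, and the orthogonal design below is an illustrative assumption that makes the selected support easy to verify.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1; zeros out small coefficients."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, b, lam, step, iters=100):
    """Proximal gradient (ISTA) for 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - step * grad, step * lam)  # prox selects support
    return x

# orthogonal design: the solution is soft_threshold(b, lam), support {0, 1}
A = np.eye(4)
b = np.array([2.0, -1.5, 0.3, 0.1])
x = ista(A, b, lam=0.5, step=1.0)
support = np.flatnonzero(x)
```

The set of nonzero coordinates of the iterate is exactly the "model" in the sense above: with correlated designs, the non-degeneracy needed for it to match the true support can fail, which is the regime the analysis addresses.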
An Alternating Direction Method of Multipliers for Constrained Joint Diagonalization by Congruence (Invited Paper)
In this paper, we address the problem of joint diagonalization by congruence (i.e. the canonical polyadic decomposition of semi-symmetric 3rd-order tensors) subject to arbitrary convex constraints. Sufficient conditions for the existence of a solution are given. An efficient algorithm based on the Alternating Direction Method of Multipliers (ADMM) is then designed. ADMM provides an elegant approach for handling the additional constraint terms while taking advantage of the structure of the objective function. Numerical tests on simulated matrices show the benefits of the proposed method for low signal-to-noise ratios. Simulations in the context of nuclear magnetic resonance spectroscopy are also provided.
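To illustrate the ADMM mechanics of splitting a structured objective from a convex constraint (a generic sketch on a small least-squares problem, not the paper's specific updates for joint diagonalization):

```python
import numpy as np

def admm_constrained_ls(A, b, proj, rho=1.0, iters=500):
    """ADMM for 0.5 * ||Ax - b||^2 subject to x in C, via the split x = z,
    z in C. proj is the projection onto C: it absorbs the constraint, while
    the x-update exploits the least-squares structure of the objective."""
    d = A.shape[1]
    x, z, u = np.zeros(d), np.zeros(d), np.zeros(d)
    Atb = A.T @ b
    M = np.linalg.inv(A.T @ A + rho * np.eye(d))  # cached for repeated solves
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))  # quadratic subproblem, closed form
        z = proj(x + u)                # constraint subproblem: one projection
        u = u + x - z                  # dual update enforcing the split x = z
    return z

# nonnegative least squares with identity design: solution is max(b, 0)
A = np.eye(3)
b = np.array([1.0, -2.0, 3.0])
z = admm_constrained_ls(A, b, proj=lambda v: np.maximum(v, 0.0))
```

The same pattern scales to more elaborate objectives: each ADMM subproblem only sees either the smooth fit term or the constraint, which is the structural advantage the abstract refers to.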