42 research outputs found

    SVAG: Stochastic Variance Adjusted Gradient Descent and Biased Stochastic Gradients

    We examine biased gradient updates in variance-reduced stochastic gradient methods. For this purpose we introduce SVAG, a SAG/SAGA-like method with adjustable bias. SVAG is analyzed under smoothness assumptions, and we provide step-size conditions for convergence that match or improve on previously known conditions for SAG and SAGA. The analysis highlights a step-size requirement difference between when SVAG is applied to cocoercive operators and when it is applied to gradients of smooth functions, a difference not present in ordinary gradient descent. This difference is verified with numerical experiments. A variant of SVAG that adaptively selects the bias is presented and compared numerically to SVAG on a set of classification problems. The adaptive SVAG frequently performs among the best and always improves on the worst-case performance of the non-adaptive variant.
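    The SAG/SAGA-like update with an adjustable bias can be sketched as below; the parameterisation (a `bias` of n giving a SAGA-style unbiased estimate, a `bias` of 1 a SAG-style biased one) is an illustrative assumption, not necessarily the paper's exact formulation.

```python
import numpy as np

def svag(grad_i, n, x0, step, bias, iters, seed=0):
    """SAG/SAGA-like iteration with adjustable bias (illustrative sketch).

    grad_i(i, x) returns the gradient of the i-th component function.
    bias = n gives a SAGA-style unbiased estimate; bias = 1 a SAG-style
    biased one (parameterisation assumed for illustration).
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    table = np.array([grad_i(i, x) for i in range(n)])  # stored gradients
    avg = table.mean(axis=0)                            # their running mean
    for _ in range(iters):
        i = rng.integers(n)
        g = grad_i(i, x)
        # (bias / n) * innovation term + average of stored gradients
        v = (bias / n) * (g - table[i]) + avg
        x -= step * v
        avg += (g - table[i]) / n                       # keep mean in sync
        table[i] = g
    return x
```

    On a toy least-squares problem (f_i the i-th squared residual), the SAGA-style setting `bias = n` with a small step size drives the iterates toward the solution.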

    Zeroth order optimization with orthogonal random directions

    We propose and analyze a randomized zeroth-order approach based on approximating the exact gradient by finite differences computed in a set of orthogonal random directions that changes with each iteration. A number of previously proposed methods are recovered as special cases, including spherical smoothing, coordinate descent, as well as discretized gradient descent. Our main contribution is proving convergence guarantees as well as convergence rates under different parameter choices and assumptions. In particular, we consider convex objectives, but also possibly non-convex objectives satisfying the Polyak-Ɓojasiewicz (PL) condition. Theoretical results are complemented and illustrated by numerical experiments.
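    One iteration of the scheme described above can be sketched as follows: draw a fresh set of orthonormal directions (here via QR of a Gaussian matrix, an assumed construction), estimate the gradient with forward differences along them, and take a gradient step. The `d / num_dirs` rescaling, which makes the projected estimate match the full gradient in expectation, is also an illustrative choice.

```python
import numpy as np

def zo_orthogonal_step(f, x, h, step, num_dirs, rng):
    """One zeroth-order step using orthogonal random directions (sketch)."""
    d = x.size
    # fresh orthonormal directions each iteration: QR of a Gaussian matrix
    Q, _ = np.linalg.qr(rng.standard_normal((d, num_dirs)))
    fx = f(x)
    g = np.zeros(d)
    for k in range(num_dirs):
        p = Q[:, k]
        g += (f(x + h * p) - fx) / h * p   # forward difference along p
    g *= d / num_dirs                      # rescale the subspace estimate
    return x - step * g
```

    With `num_dirs = d` the directions span the whole space and the estimate reduces to a discretized full gradient, recovering one of the special cases mentioned in the abstract.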

    On the convergence of stochastic forward-backward-forward algorithms with variance reduction in pseudo-monotone variational inequalities

    We develop a new stochastic algorithm with variance reduction for solving pseudo-monotone stochastic variational inequalities. Our method builds on Tseng's forward-backward-forward algorithm, which is known in the deterministic literature to be a valuable alternative to Korpelevich's extragradient method when solving variational inequalities over a convex and closed set governed by pseudo-monotone and Lipschitz continuous operators. The main computational advantage of Tseng's algorithm is that it relies on only a single projection step and two independent queries of a stochastic oracle. Our algorithm incorporates a variance reduction mechanism and leads to a.s. convergence to solutions of a merely pseudo-monotone stochastic variational inequality problem. To the best of our knowledge, this is the first stochastic algorithm achieving this by using only a single projection at each iteration.
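    The single-projection structure of Tseng's forward-backward-forward step can be sketched as below. Using mini-batch averaging for the two oracle queries is an illustrative stand-in for the paper's variance reduction mechanism, not its exact scheme.

```python
import numpy as np

def sfbf_step(x, F_sample, project, tau, batch, rng):
    """One stochastic forward-backward-forward step (sketch).

    F_sample(x, rng) returns one stochastic sample of the operator;
    each of the two oracle queries averages `batch` samples
    (mini-batching as an illustrative variance-reduction choice).
    """
    Fx = np.mean([F_sample(x, rng) for _ in range(batch)], axis=0)
    y = project(x - tau * Fx)        # the single projection per iteration
    Fy = np.mean([F_sample(y, rng) for _ in range(batch)], axis=0)
    return y + tau * (Fx - Fy)       # forward correction, no projection
```

    On an unconstrained monotone linear operator (projection = identity), the iterates converge to the solution of F(x) = 0 for a step size below the inverse Lipschitz constant.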

    Model Consistency for Learning with Mirror-Stratifiable Regularizers

    Low-complexity non-smooth convex regularizers are routinely used to impose some structure (such as sparsity or low rank) on the coefficients of linear predictors in supervised learning. Model consistency then consists in selecting the correct structure (for instance support or rank) by regularized empirical risk minimization. It is known that model consistency holds under appropriate non-degeneracy conditions. However, such conditions typically fail for highly correlated designs, and it is observed that regularization methods tend to select larger models. In this work, we provide the theoretical underpinning of this behavior using the notion of mirror-stratifiable regularizers. This class of regularizers encompasses the most well-known ones in the literature, including the ℓ1 or trace norms. It brings into play a pair of primal-dual models, which in turn allows one to locate the structure of the solution using a specific dual certificate. We also show how this analysis is applicable to optimal solutions of the learning problem, and also to the iterates computed by a certain class of stochastic proximal-gradient algorithms.
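    The structure selection that model consistency refers to can be illustrated with a plain proximal-gradient (ISTA) iteration for the ℓ1 regularizer: after enough iterations, the iterate's support identifies the model. The problem instance in the usage note is a hypothetical toy, not from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, step, iters):
    """Proximal gradient for min_x 0.5||Ax - b||^2 + lam * ||x||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        # gradient step on the smooth part, then the l1 proximal step
        x = soft_threshold(x - step * (A.T @ (A @ x - b)), step * lam)
    return x
```

    For `A` the identity the minimizer is the soft-thresholded data, so the support of the output (the "selected model") consists exactly of the entries of `b` exceeding the threshold in magnitude.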

    An Alternating Direction Method of Multipliers for Constrained Joint Diagonalization by Congruence (Invited Paper)

    In this paper, we address the problem of joint diagonalization by congruence (i.e. the canonical polyadic decomposition of semi-symmetric 3rd-order tensors) subject to arbitrary convex constraints. Sufficient conditions for the existence of a solution are given. An efficient algorithm based on the Alternating Direction Method of Multipliers (ADMM) is then designed. ADMM provides an elegant approach for handling the additional constraint terms while taking advantage of the structure of the objective function. Numerical tests on simulated matrices show the benefits of the proposed method for low signal-to-noise ratios. Simulations in the context of nuclear magnetic resonance spectroscopy are also provided.
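    The way ADMM absorbs a convex constraint into a separate variable and a projection step can be sketched on a much simpler stand-in problem, nonnegative least squares; the joint-diagonalization objective itself is not reproduced here.

```python
import numpy as np

def admm_nonneg_ls(A, b, rho=1.0, iters=300):
    """ADMM (scaled-dual form) for min 0.5||Ax - b||^2 s.t. x >= 0.

    A toy stand-in for the paper's constrained problem: the constraint
    is handled by a split variable z and a projection, mirroring how
    ADMM separates constraint terms from the smooth objective.
    """
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    M = np.linalg.inv(A.T @ A + rho * np.eye(n))  # cached x-update factor
    Atb = A.T @ b
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))   # quadratic subproblem in x
        z = np.maximum(x + u, 0.0)      # projection onto the constraint set
        u += x - z                      # scaled dual update
    return z
```

    Each iteration solves an unconstrained quadratic, projects, and updates the dual variable; the same three-step pattern applies when the constraint set is any convex set with a computable projection.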