
    Solving Variational Inequalities with Monotone Operators on Domains Given by Linear Minimization Oracles

    The standard algorithms for solving large-scale convex-concave saddle point problems, or, more generally, variational inequalities with monotone operators, are proximal-type algorithms which at every iteration need to compute a prox-mapping, that is, to minimize over the problem's domain X the sum of a linear form and the specific convex distance-generating function underlying the algorithms in question. Relative computational simplicity of prox-mappings, which is the standard requirement when implementing proximal algorithms, clearly implies the possibility to equip X with a relatively computationally cheap Linear Minimization Oracle (LMO) able to minimize linear forms over X. There are, however, important situations where a cheap LMO is indeed available, but no proximal setup with easy-to-compute prox-mappings is known. This fact motivates our goal in this paper, which is to develop techniques for solving variational inequalities with monotone operators on domains given by Linear Minimization Oracles. The techniques we develop can be viewed as a substantial extension of the method proposed in [5] for nonsmooth convex minimization over an LMO-represented domain.
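
    To make the gap between prox-mappings and LMOs concrete, the sketch below (a minimal illustration, not code from the paper) takes the nuclear-norm ball as the domain: its LMO needs only the leading singular pair of the gradient, whereas the Euclidean projection, the simplest prox-mapping, requires a full SVD. The function names and the NumPy-based implementation are assumptions for illustration only.

```python
import numpy as np

def lmo_nuclear_ball(grad, radius=1.0):
    # LMO over {X : ||X||_* <= radius}: argmin <grad, X> = -radius * u1 v1^T,
    # where (u1, v1) is the leading singular pair of grad. A full SVD is used
    # here only for brevity; in practice a power/Lanczos method computes just
    # the top pair, which is what makes this oracle cheap.
    U, _, Vt = np.linalg.svd(grad, full_matrices=False)
    return -radius * np.outer(U[:, 0], Vt[0, :])

def project_l1_ball(v, radius):
    # Euclidean projection of a nonnegative vector onto the l1 ball.
    if v.sum() <= radius:
        return v
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def prox_nuclear_ball(Y, radius=1.0):
    # Euclidean projection onto the nuclear-norm ball: a *full* SVD plus a
    # projection of the singular values -- far costlier than the LMO above
    # for large matrices, which is the regime the paper targets.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(project_l1_ball(s, radius)) @ Vt
```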

    Decomposition Techniques for Bilinear Saddle Point Problems and Variational Inequalities with Affine Monotone Operators on Domains Given by Linear Minimization Oracles

    The majority of first-order methods for large-scale convex-concave saddle point problems and variational inequalities with monotone operators are proximal algorithms which at every iteration need to minimize over the problem's domain X the sum of a linear form and a strongly convex function. To make such an algorithm practical, X should be proximal-friendly -- it should admit a strongly convex function with easy-to-minimize linear perturbations. As a byproduct, X admits a computationally cheap Linear Minimization Oracle (LMO) capable of minimizing linear forms over X. There are, however, important situations where a cheap LMO is indeed available, but X is not proximal-friendly, which motivates the search for algorithms based solely on LMOs. For smooth convex minimization, there exists a classical LMO-based algorithm -- Conditional Gradient. In contrast, the LMO-based techniques known to us for other problems with convex structure (nonsmooth convex minimization, convex-concave saddle point problems, even as simple as bilinear ones, and variational inequalities with monotone operators, even as simple as affine ones) are quite recent and rely on a common approach based on Fenchel-type representations of the associated objectives/vector fields. The goal of this paper is to develop alternative (and seemingly much simpler) LMO-based decomposition techniques for bilinear saddle point problems and for variational inequalities with affine monotone operators.
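
    The abstract contrasts the new techniques with the classical LMO-based method it names, Conditional Gradient. The sketch below shows that classical method in its textbook form, with the standard 2/(t+2) step size; it is a generic illustration of an LMO-based algorithm, not the paper's decomposition scheme, and grad_f, lmo, and x0 are hypothetical placeholders supplied by the caller.

```python
def conditional_gradient(grad_f, lmo, x0, iters=200):
    # Classical Conditional Gradient (Frank-Wolfe) for smooth convex
    # minimization over an LMO-represented domain: call the LMO on the
    # current gradient and move toward the returned point.
    x = x0
    for t in range(iters):
        s = lmo(grad_f(x))          # linear minimization step
        gamma = 2.0 / (t + 2.0)     # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * s
    return x
```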

    Semi-proximal Mirror-Prox for Nonsmooth Composite Minimization

    We propose a new first-order optimisation algorithm to solve high-dimensional non-smooth composite minimisation problems. Typical examples of such problems have an objective that decomposes into a non-smooth empirical risk part and a non-smooth regularisation penalty. The proposed algorithm, called Semi-Proximal Mirror-Prox, leverages the Fenchel-type representation of one part of the objective while handling the other part of the objective via linear minimization over the domain. The algorithm stands in contrast with more classical proximal gradient algorithms with smoothing, which require the computation of proximal operators at each iteration and can therefore be impractical for high-dimensional problems. We establish the theoretical convergence rate of Semi-Proximal Mirror-Prox, which exhibits the optimal complexity bound of O(1/ε²) on the number of calls to the linear minimization oracle. We present promising experimental results showing the merits of the approach in comparison to competing methods.
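
    Semi-Proximal Mirror-Prox builds on the Mirror-Prox scheme; the sketch below shows only the classical Euclidean Mirror-Prox (extragradient) update it starts from, not the semi-proximal variant itself. F, project, and z0 are hypothetical placeholders, and the step size is assumed constant for simplicity.

```python
import numpy as np

def extragradient(F, project, z0, step=0.1, iters=500):
    # Classical Euclidean Mirror-Prox (extragradient) for a monotone
    # operator F on a domain given by the projection `project`:
    # an extrapolation step followed by an update that reuses the
    # operator value at the extrapolation point. The ergodic average
    # of the extrapolation points is returned as the approximate solution.
    z = np.asarray(z0, dtype=float)
    avg = np.zeros_like(z)
    for _ in range(iters):
        w = project(z - step * F(z))   # extrapolation point
        z = project(z - step * F(w))   # update using F at w
        avg += w
    return avg / iters
```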

    Frank-Wolfe Algorithms for Saddle Point Problems

    We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems. Remarkably, the method only requires access to linear minimization oracles. Leveraging recent advances in FW optimization, we provide the first proof of convergence of an FW-type saddle point solver over polytopes, thereby partially answering a 30-year-old conjecture. We also survey other convergence results and highlight gaps in the theoretical underpinnings of FW-style algorithms. Motivating applications without known efficient alternatives are explored through structured prediction with combinatorial penalties as well as games over matching polytopes involving an exponential number of constraints. Comment: Appears in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017). 39 pages.
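
    As a rough illustration of what an FW-type saddle point solver looks like, here is a simplified sketch that applies one LMO call per block to the monotone field (grad_x L, -grad_y L) and then takes the usual Frank-Wolfe convex-combination step. This is an assumed, simplified variant for intuition only, not the algorithm analyzed in the paper; grad_x, grad_y, lmo_x, and lmo_y are hypothetical callables.

```python
def saddle_point_fw(grad_x, grad_y, lmo_x, lmo_y, x0, y0, iters=300):
    # Simplified Frank-Wolfe-style iteration for min_x max_y L(x, y) over
    # X x Y, given only an LMO per block: one LMO call on each component of
    # the monotone field (grad_x L, -grad_y L), then the usual convex
    # combination, which keeps the iterates feasible.
    x, y = x0, y0
    for t in range(iters):
        sx = lmo_x(grad_x(x, y))
        sy = lmo_y(-grad_y(x, y))    # ascent in y = descent on -L
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * sx
        y = (1.0 - gamma) * y + gamma * sy
    return x, y
```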

    Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods

    Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequality problems (VIPs) appearing in various machine learning tasks. The success of the method has led to several advanced extensions of classical SGDA, including variants with arbitrary sampling, variance reduction, coordinate randomization, and distributed variants with compression, which have been extensively studied in the literature, especially during the last few years. In this paper, we propose a unified convergence analysis that covers a large variety of stochastic gradient descent-ascent methods, which so far have required different intuitions, have different applications, and have been developed separately in various communities. A key to our unified framework is a parametric assumption on the stochastic estimates. Via our general theoretical framework, we either recover the sharpest known rates for the known special cases or tighten them. Moreover, to illustrate the flexibility of our approach, we develop several new variants of SGDA, such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA). Although variants of the new methods are known for solving minimization problems, they were never considered or analyzed for solving min-max problems and VIPs. We also demonstrate the most important properties of the new methods through extensive numerical experiments. Comment: 72 pages, 4 figures, 3 tables. Changes in v2: new results were added (Theorem 2.5 and its corollaries), a few typos were fixed, and more clarifications were added. Code: https://github.com/hugobb/sgd
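
    For reference, the sketch below is plain single-sample SGDA in its textbook form: simultaneous updates that descend in the primal variable and ascend in the dual variable using unbiased stochastic gradients. It is a baseline illustration, not one of the paper's new variants; stoch_grad_x and stoch_grad_y are hypothetical oracles supplied by the caller.

```python
import numpy as np

def sgda(stoch_grad_x, stoch_grad_y, x0, y0, lr=1e-2, iters=10_000, seed=0):
    # Plain Stochastic Gradient Descent-Ascent for min_x max_y f(x, y):
    # a descent step on the primal variable and an ascent step on the dual
    # variable, each using an unbiased stochastic gradient estimate.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    y = np.asarray(y0, dtype=float).copy()
    for _ in range(iters):
        gx = stoch_grad_x(x, y, rng)   # stochastic gradient w.r.t. x
        gy = stoch_grad_y(x, y, rng)   # stochastic gradient w.r.t. y
        x = x - lr * gx                # descent step in x
        y = y + lr * gy                # ascent step in y
    return x, y
```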