
    Two trust region type algorithms for solving nonconvex-strongly concave minimax problems

    In this paper, we propose a Minimax Trust Region (MINIMAX-TR) algorithm and a Minimax Trust Region Algorithm with Contractions and Expansions (MINIMAX-TRACE) for solving nonconvex-strongly concave minimax problems. Both algorithms can find an $(\epsilon, \sqrt{\epsilon})$-second-order stationary point (SSP) within $\mathcal{O}(\epsilon^{-1.5})$ iterations, which matches the best known iteration complexity.
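    As a rough illustration of the trust-region idea in this abstract, the sketch below runs a contraction/expansion trust-region outer loop on $\Phi(x) = \max_y \Psi(x,y)$ for a toy nonconvex-strongly concave objective. The toy $\Psi$, the closed-form inner maximization, the Cauchy-like step, and the acceptance thresholds are illustrative assumptions, not the paper's MINIMAX-TR or MINIMAX-TRACE algorithms.

```python
# A minimal sketch (not the paper's method) of a trust-region outer loop on
# Phi(x) = max_y Psi(x, y) for a toy nonconvex-strongly concave problem.
# The toy Psi, the exact inner maximization, and the step rule are assumptions.
import numpy as np

def psi(x, y, A):
    # Toy objective: nonconvex in x (cosine term), strongly concave in y.
    return np.cos(x @ x) + x @ A @ y - 0.5 * y @ y

def grad_x(x, y, A):
    return -2.0 * np.sin(x @ x) * x + A @ y

def max_y(x, A):
    # Strong concavity in y gives the inner maximizer in closed form: y* = A^T x.
    return A.T @ x

def trust_region_minimax(x, A, delta=1.0, iters=50):
    for _ in range(iters):
        y = max_y(x, A)                       # solve the inner max (here exactly)
        g = grad_x(x, y, A)                   # gradient of Phi at x (Danskin)
        step = -delta * g / (np.linalg.norm(g) + 1e-12)   # Cauchy-like step
        phi_old = psi(x, y, A)
        phi_new = psi(x + step, max_y(x + step, A), A)
        rho = (phi_old - phi_new) / (delta * np.linalg.norm(g) + 1e-12)
        if rho > 0.1:                         # accept step and expand the region
            x = x + step
            delta = min(2.0 * delta, 10.0)
        else:                                 # reject step and contract the region
            delta *= 0.5
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) * 0.1
x = trust_region_minimax(rng.standard_normal(5), A)
print("final ||grad Phi|| ~", np.linalg.norm(grad_x(x, max_y(x, A), A)))
```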

    Alternating proximal-gradient steps for (stochastic) nonconvex-concave minimax problems

    Minimax problems of the form $\min_x \max_y \Psi(x,y)$ have attracted increased interest largely due to advances in machine learning, in particular generative adversarial networks. These are typically trained using variants of stochastic gradient descent for the two players. Although convex-concave problems are well understood with many efficient solution methods to choose from, theoretical guarantees outside of this setting are sometimes lacking even for the simplest algorithms. In particular, this is the case for alternating gradient descent ascent, where the two agents take turns updating their strategies. To partially close this gap in the literature, we prove a novel global convergence rate for the stochastic version of this method for finding a critical point of $g(\cdot) := \max_y \Psi(\cdot,y)$ in a setting which is not convex-concave.
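    The alternating structure discussed in this abstract can be illustrated with a short sketch: a descent step on x, then an ascent step on y using a fresh stochastic gradient at the updated x. The toy objective, the ball projection standing in for a proximal operator, and the step sizes are assumptions for illustration, not the paper's setting or rates.

```python
# A minimal sketch of alternating stochastic gradient descent ascent with a
# simple projection step; the objective and step sizes are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

def stoch_grads(x, y, noise=0.1):
    # Psi(x, y) = x^T A y + 0.25*||x||^4 - 0.5*||y||^2, with additive noise.
    gx = A @ y + (x @ x) * x + noise * rng.standard_normal(x.shape)
    gy = A.T @ x - y + noise * rng.standard_normal(y.shape)
    return gx, gy

def prox_ball(v, radius=5.0):
    # Projection onto a Euclidean ball (stands in for a general prox operator).
    n = np.linalg.norm(v)
    return v if n <= radius else radius * v / n

x, y = rng.standard_normal(4), rng.standard_normal(4)
tau, sigma = 1e-2, 5e-2           # descent and ascent step sizes
for t in range(2000):
    gx, _ = stoch_grads(x, y)
    x = prox_ball(x - tau * gx)   # descent step on x first ...
    _, gy = stoch_grads(x, y)     # ... then a fresh gradient at the new x
    y = prox_ball(y + sigma * gy) # ascent step on y (alternating order)
print("||x||, ||y|| after training:", np.linalg.norm(x), np.linalg.norm(y))
```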

    Gradient Descent Ascent for Min-Max Problems on Riemannian Manifolds

    In this paper, we study a class of useful non-convex minimax optimization problems on Riemannian manifolds and propose a class of Riemannian gradient descent ascent algorithms to solve these minimax problems. Specifically, we propose a new Riemannian gradient descent ascent (RGDA) algorithm for deterministic minimax optimization. Moreover, we prove that RGDA has a sample complexity of $O(\kappa^2\epsilon^{-2})$ for finding an $\epsilon$-stationary point of nonconvex strongly-concave minimax problems, where $\kappa$ denotes the condition number. At the same time, we introduce a Riemannian stochastic gradient descent ascent (RSGDA) algorithm for stochastic minimax optimization. In the theoretical analysis, we prove that RSGDA can achieve a sample complexity of $O(\kappa^3\epsilon^{-4})$. To further reduce the sample complexity, we propose a novel momentum variance-reduced Riemannian stochastic gradient descent ascent (MVR-RSGDA) algorithm based on the momentum-based variance-reduction technique of STORM. We prove that the MVR-RSGDA algorithm achieves a lower sample complexity of $\tilde{O}(\kappa^{(3-\nu/2)}\epsilon^{-3})$ for $\nu \geq 0$, which reaches the best known sample complexity for its Euclidean counterpart. Extensive experimental results on robust deep neural network training over the Stiefel manifold demonstrate the efficiency of our proposed algorithms.
    Comment: 32 pages. We have updated the theoretical results of our methods in this new revision, e.g., our MVR-RSGDA algorithm achieves a lower sample complexity. arXiv admin note: text overlap with arXiv:2008.0817
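    The two Riemannian ingredients named in this abstract, a tangent-space projection of the Euclidean gradient and a retraction back onto the manifold, can be sketched for the Stiefel manifold as follows. The toy objective $\Psi(X, y) = y^\top B\,\mathrm{vec}(X) - \tfrac{1}{2}\|y\|^2$ and the step sizes are assumptions; only the projection and QR retraction are the standard constructions.

```python
# A minimal sketch of Riemannian gradient descent ascent on the Stiefel manifold
# (in the spirit of RGDA); the objective and step sizes are illustrative, not the
# paper's. Projection and QR retraction are the standard Stiefel operations.
import numpy as np

n, p, m = 6, 2, 4
rng = np.random.default_rng(0)
B = rng.standard_normal((m, n * p))

def proj_tangent(X, G):
    # Project a Euclidean gradient G onto the tangent space of
    # the Stiefel manifold {X : X^T X = I} at X.
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

def retract_qr(X):
    # QR-based retraction back onto the Stiefel manifold.
    Q, R = np.linalg.qr(X)
    return Q * np.sign(np.diag(R))  # fix column signs for a canonical Q

X = retract_qr(rng.standard_normal((n, p)))  # start on the manifold
y = np.zeros(m)
eta_x, eta_y = 1e-2, 1e-1

for t in range(500):
    egrad_X = (B.T @ y).reshape(n, p)          # Euclidean gradient in X
    rgrad_X = proj_tangent(X, egrad_X)         # Riemannian gradient
    X = retract_qr(X - eta_x * rgrad_X)        # descent step + retraction
    grad_y = B @ X.ravel() - y                 # gradient of the concave part
    y = y + eta_y * grad_y                     # ascent step in y
print("||X^T X - I|| =", np.linalg.norm(X.T @ X - np.eye(p)))
```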

    Semi-Anchored Multi-Step Gradient Descent Ascent Method for Structured Nonconvex-Nonconcave Composite Minimax Problems

    Minimax problems, such as generative adversarial networks, adversarial training, and fair training, are widely solved in practice by a multi-step gradient descent ascent (MGDA) method. However, its convergence guarantee is limited. In this paper, inspired by the primal-dual hybrid gradient method, we propose a new semi-anchoring (SA) technique for the MGDA method. This enables the MGDA method to find a stationary point of a structured nonconvex-nonconcave composite minimax problem whose saddle-subdifferential operator satisfies the weak Minty variational inequality condition. The resulting method, named SA-MGDA, is built upon a Bregman proximal point method. We further develop its backtracking line-search version, and its non-Euclidean version for smooth adaptable functions. Numerical experiments, including fair classification training, are provided.
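    For context, the sketch below shows only the plain multi-step GDA baseline that this abstract starts from: several ascent steps in y per descent step in x. The toy objective and step sizes are assumptions, and the paper's semi-anchoring and Bregman proximal point machinery are not reproduced here.

```python
# A minimal sketch of plain multi-step gradient descent ascent (MGDA), the
# baseline SA-MGDA builds on; the toy objective and step sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

def grads(x, y):
    # Psi(x, y) = x^T A y + 0.1*||x||^2 - 0.1*||y||^2 (illustrative only)
    return A @ y + 0.2 * x, A.T @ x - 0.2 * y

x, y = rng.standard_normal(3), rng.standard_normal(3)
eta_x, eta_y, K = 1e-2, 5e-2, 10    # K inner ascent steps per outer iteration
for t in range(1000):
    for _ in range(K):              # multi-step ascent on y
        _, gy = grads(x, y)
        y = y + eta_y * gy
    gx, _ = grads(x, y)
    x = x - eta_x * gx              # single descent step on x
print("stationarity proxy ||grad_x||:", np.linalg.norm(grads(x, y)[0]))
```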