
    An accelerated first-order regularized momentum descent ascent algorithm for stochastic nonconvex-concave minimax problems

    Stochastic nonconvex minimax problems have attracted wide attention in machine learning, signal processing and many other fields in recent years. In this paper, we propose an accelerated first-order regularized momentum descent ascent algorithm (FORMDA) for solving stochastic nonconvex-concave minimax problems. The iteration complexity of the algorithm is proved to be $\tilde{\mathcal{O}}(\varepsilon^{-6.5})$ to obtain an $\varepsilon$-stationary point, which achieves the best-known complexity bound for single-loop algorithms to solve stochastic nonconvex-concave minimax problems under the stationarity of the objective function.
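
    To make the single-loop scheme concrete, the sketch below runs a generic stochastic momentum descent ascent loop on a toy nonconvex-concave objective. The toy objective, step sizes, momentum weight, and the damping coefficient rho are illustrative assumptions, not the paper's FORMDA updates.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

def stoch_grads(x, y):
    """Noisy gradients of a toy nonconvex-concave f(x, y) = sum(cos(x)) + x@A@y - 0.5*||y||^2."""
    gx = -np.sin(x) + A @ y + 0.01 * rng.standard_normal(x.size)
    gy = A.T @ x - y + 0.01 * rng.standard_normal(y.size)
    return gx, gy

def momentum_descent_ascent(steps=2000, eta_x=1e-3, eta_y=1e-2, beta=0.9, rho=0.1):
    x, y = np.zeros(5), np.zeros(5)
    mx, my = np.zeros(5), np.zeros(5)
    for _ in range(steps):
        gx, gy = stoch_grads(x, y)
        # Momentum averaging of the stochastic gradients (single loop, first-order only).
        mx = beta * mx + (1 - beta) * gx
        my = beta * my + (1 - beta) * gy
        # Descent on x, ascent on a regularized surrogate in y (rho damps the dual block).
        x = x - eta_x * mx
        y = y + eta_y * (my - rho * y)
    return x, y

x_out, y_out = momentum_descent_ascent()
```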

    Zeroth-Order Alternating Gradient Descent Ascent Algorithms for a Class of Nonconvex-Nonconcave Minimax Problems

    In this paper, we consider a class of nonconvex-nonconcave minimax problems, i.e., NC-PL minimax problems, whose objective functions satisfy the Polyak-Łojasiewicz (PL) condition with respect to the inner variable. We propose a zeroth-order alternating gradient descent ascent (ZO-AGDA) algorithm and a zeroth-order variance reduced alternating gradient descent ascent (ZO-VRAGDA) algorithm for solving NC-PL minimax problems under the deterministic and the stochastic setting, respectively. The number of iterations for the ZO-AGDA and ZO-VRAGDA algorithms to obtain an $\varepsilon$-stationary point of an NC-PL minimax problem is upper bounded by $\mathcal{O}(\varepsilon^{-2})$ and $\mathcal{O}(\varepsilon^{-3})$, respectively. To the best of our knowledge, they are the first two zeroth-order algorithms with an iteration complexity guarantee for solving NC-PL minimax problems.
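
    As a rough illustration of the zeroth-order idea, the snippet below performs alternating descent ascent with a two-point random-direction gradient estimator on a toy objective whose inner block is strongly concave (a special case of the PL condition). The objective, smoothing radius, and step sizes are assumptions for illustration, not the paper's ZO-AGDA.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x, y):
    """Toy minimax objective; the inner (y) block is strongly concave, hence PL."""
    return np.sum(np.cos(x)) + x @ y - 0.5 * y @ y

def zo_grad(func, z, mu=1e-4):
    """Two-point zeroth-order gradient estimator along a random Gaussian direction."""
    u = rng.standard_normal(z.size)
    return (func(z + mu * u) - func(z - mu * u)) / (2.0 * mu) * u

def zo_agda(steps=3000, eta_x=5e-3, eta_y=5e-2, d=5):
    x, y = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        gx = zo_grad(lambda v: f(v, y), x)   # estimate grad_x from function values only
        x = x - eta_x * gx                   # descent step on the outer variable
        gy = zo_grad(lambda v: f(x, v), y)   # estimate grad_y at the updated x (alternating)
        y = y + eta_y * gy                   # ascent step on the inner variable
    return x, y

x_out, y_out = zo_agda()
```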

    Primal Dual Alternating Proximal Gradient Algorithms for Nonsmooth Nonconvex Minimax Problems with Coupled Linear Constraints

    Nonconvex minimax problems have attracted wide attention in machine learning, signal processing and many other fields in recent years. In this paper, we propose a primal dual alternating proximal gradient (PDAPG) algorithm and a primal dual proximal gradient (PDPG-L) algorithm for solving nonsmooth nonconvex-strongly concave and nonconvex-linear minimax problems with coupled linear constraints, respectively. The corresponding iteration complexities of the two algorithms are proved to be $\mathcal{O}\left( \varepsilon ^{-2} \right)$ and $\mathcal{O}\left( \varepsilon ^{-3} \right)$ to reach an $\varepsilon$-stationary point, respectively. To our knowledge, they are the first two algorithms with an iteration complexity guarantee for solving these two classes of minimax problems.
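
    The sketch below shows one generic primal dual alternating proximal-gradient step for a problem of the form min_x max_y f(x, y) + tau*||x||_1 subject to a coupled linear constraint A x + B y = c. The l1 nonsmooth term, the sign conventions, and all step sizes are assumptions made for illustration rather than the paper's PDAPG or PDPG-L updates.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, handling the nonsmooth term in closed form."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def pd_alt_prox_step(x, y, lam, A, B, c, smooth_grads,
                     eta_x=1e-2, eta_y=1e-2, eta_lam=1e-2, tau=0.1):
    """One primal-dual step: proximal gradient descent on x, gradient ascent on y,
    and a multiplier update for the coupled linear constraint A x + B y = c."""
    gx, gy = smooth_grads(x, y)
    # Descent on x through the smooth part plus the constraint's Lagrangian term,
    # followed by the prox of the nonsmooth l1 regularizer.
    x = soft_threshold(x - eta_x * (gx + A.T @ lam), eta_x * tau)
    # Ascent on the (concave) y block, also accounting for the constraint term.
    y = y + eta_y * (gy - B.T @ lam)
    # Dual update driven by the residual of the coupled linear constraint.
    lam = lam + eta_lam * (A @ x + B @ y - c)
    return x, y, lam
```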

    Adaptive Federated Minimax Optimization with Lower Complexities

    Federated learning is a popular distributed and privacy-preserving machine learning paradigm. Meanwhile, minimax optimization, as an effective hierarchical optimization, is widely applied in machine learning. Recently, some federated optimization methods have been proposed to solve the distributed minimax problems. However, these federated minimax methods still suffer from high gradient and communication complexities. Meanwhile, few algorithms focus on using adaptive learning rates to accelerate these methods. To fill this gap, in this paper we study a class of nonconvex minimax optimization, and propose an efficient adaptive federated minimax optimization algorithm (i.e., AdaFGDA) to solve these distributed minimax problems. Specifically, our AdaFGDA builds on the momentum-based variance-reduced and local-SGD techniques, and it can flexibly incorporate various adaptive learning rates by using the unified adaptive matrix. Theoretically, we provide a solid convergence analysis framework for our AdaFGDA algorithm under the non-i.i.d. setting. Moreover, we prove that our algorithms obtain a lower gradient (i.e., stochastic first-order oracle, SFO) complexity of $\tilde{O}(\epsilon^{-3})$ with a lower communication complexity of $\tilde{O}(\epsilon^{-2})$ in finding an $\epsilon$-stationary point of the nonconvex minimax problems. Experimentally, we conduct experiments on the deep AUC maximization and robust neural network training tasks to verify the efficiency of our algorithms.
    Comment: Submitted to AISTATS-202
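
    A stripped-down sketch of the ingredients named above (local-SGD rounds, momentum averaging of the aggregated direction, and an adaptive diagonal matrix on the server update) is given below. The placeholder client gradients, the full-participation loop, and all hyperparameters are assumptions, so this is a generic illustration rather than AdaFGDA itself.

```python
import numpy as np

rng = np.random.default_rng(2)

def client_grads(client, x, y):
    """Placeholder stochastic minimax gradients for one client (non-i.i.d. across clients)."""
    gx = np.cos(x + client) + y + 0.01 * rng.standard_normal(x.size)
    gy = x - y + 0.01 * rng.standard_normal(y.size)
    return gx, gy

def adaptive_fed_minimax(rounds=50, clients=8, local_steps=5, d=5,
                         eta=1e-2, gamma=1e-2, beta=0.9, eps=1e-8):
    x, y = np.zeros(d), np.zeros(d)
    mx, my = np.zeros(d), np.zeros(d)     # momentum-averaged aggregated directions
    vx = np.zeros(d)                      # running second moment for the adaptive matrix
    for _ in range(rounds):
        dx, dy = np.zeros(d), np.zeros(d)
        for c in range(clients):
            xc, yc = x.copy(), y.copy()
            for _ in range(local_steps):  # local-SGD style updates before communicating
                gx, gy = client_grads(c, xc, yc)
                xc, yc = xc - gamma * gx, yc + gamma * gy
            dx += (x - xc) / clients      # average client drift as a pseudo-gradient
            dy += (yc - y) / clients
        mx = beta * mx + (1 - beta) * dx  # momentum averaging reduces variance across rounds
        my = beta * my + (1 - beta) * dy
        vx = beta * vx + (1 - beta) * mx ** 2
        # Adaptive diagonal matrix (AdaGrad/Adam-style) rescales the server-side x update.
        x = x - eta * mx / (np.sqrt(vx) + eps)
        y = y + eta * my
    return x, y

x_out, y_out = adaptive_fed_minimax()
```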

    Efficient Cross-Device Federated Learning Algorithms for Minimax Problems

    In many machine learning applications where massive and privacy-sensitive data are generated on numerous mobile or IoT devices, collecting data in a centralized location may be prohibitive. Thus, it is increasingly attractive to estimate parameters over mobile or IoT devices while keeping data localized. Such a learning setting is known as cross-device federated learning. In this paper, we propose the first theoretically guaranteed algorithms for general minimax problems in the cross-device federated learning setting. Our algorithms require only a fraction of devices in each round of training, which overcomes the difficulty introduced by the low availability of devices. The communication overhead is further reduced by performing multiple local update steps on clients before communicating with the server, and global gradient estimates are leveraged to correct the bias in local update directions introduced by data heterogeneity. By developing analyses based on novel potential functions, we establish theoretical convergence guarantees for our algorithms. Experimental results on AUC maximization, robust adversarial network training, and GAN training tasks demonstrate the efficiency of our algorithms.
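
    The sketch below captures the three mechanisms in the abstract in their simplest form: sampling a fraction of devices per round, running several local descent ascent steps, and correcting local directions with a global gradient estimate. The placeholder client gradients and every hyperparameter are assumptions, so this is only a generic cross-device sketch, not the paper's algorithms.

```python
import numpy as np

rng = np.random.default_rng(3)

def client_grads(client, x, y):
    """Placeholder heterogeneous stochastic gradients of min_x max_y f on one device."""
    gx = np.cos(x + client) + y + 0.01 * rng.standard_normal(x.size)
    gy = x - y + 0.01 * rng.standard_normal(y.size)
    return gx, gy

def cross_device_fed_gda(rounds=100, n_clients=100, sample=10, local_steps=5,
                         d=5, gamma=1e-2, eta=0.5):
    x, y = np.zeros(d), np.zeros(d)
    cx, cy = np.zeros(d), np.zeros(d)      # global gradient estimates for drift correction
    for _ in range(rounds):
        picked = rng.choice(n_clients, size=sample, replace=False)  # partial participation
        dx, dy, gx_avg, gy_avg = (np.zeros(d) for _ in range(4))
        for c in picked:
            g0x, g0y = client_grads(c, x, y)   # client gradient at the current global point
            xc, yc = x.copy(), y.copy()
            for _ in range(local_steps):
                gx, gy = client_grads(c, xc, yc)
                # Drift-corrected local directions: shift the local gradient toward the
                # global estimate to counter data heterogeneity across devices.
                xc = xc - gamma * (gx - g0x + cx)
                yc = yc + gamma * (gy - g0y + cy)
            dx += (x - xc) / sample
            dy += (yc - y) / sample
            gx_avg += g0x / sample
            gy_avg += g0y / sample
        cx, cy = gx_avg, gy_avg            # refresh the global gradient estimates
        x, y = x - eta * dx, y + eta * dy  # server step along the averaged client drift
    return x, y

x_out, y_out = cross_device_fed_gda()
```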