    Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods

    This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions. Our algorithm achieves $O(\sigma/\sqrt{T})$ convergence when the oracle feedback is stochastic with variance $\sigma^2$, and improves its convergence to $O(1/T^3)$ with deterministic oracles, where $T$ is the number of iterations. Our method also interpolates between these rates without knowing the nature of the oracle a priori, which is enabled by a parameter-free adaptive step size that requires no knowledge of the smoothness modulus, the variance bound, or the diameter of the constrained set. To our knowledge, this is the first universal algorithm with such global guarantees within the second-order optimization literature. Comment: 32 pages, 4 figures, accepted at NeurIPS 2022.
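    The abstract above centers on a step size that adapts to oracle noise without knowing the smoothness modulus, variance bound, or diameter. For orientation only, here is a minimal first-order sketch of that general idea (an AdaGrad-style rule on a toy stochastic quadratic); it is not the paper's Extra-Newton update, and the objective, noise level, and constants are illustrative assumptions.

        import numpy as np

        def adaptive_gradient_descent(grad, x0, T=1000, eps=1e-8):
            # The step size shrinks with the accumulated squared gradient norms, so it
            # automatically becomes more cautious when the oracle is noisy.
            x = np.asarray(x0, dtype=float)
            g_sq_sum = 0.0
            for _ in range(T):
                g = grad(x)
                g_sq_sum += float(np.dot(g, g))
                x = x - g / np.sqrt(eps + g_sq_sum)
            return x

        # Toy usage: a stochastic gradient oracle for f(x) = 0.5 * ||x||^2 with additive noise.
        rng = np.random.default_rng(0)
        noisy_grad = lambda x: x + 0.1 * rng.standard_normal(x.shape)
        print(adaptive_gradient_descent(noisy_grad, x0=np.ones(5)))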

    Multipath Characterization of Indoor Power-Line Networks

    Adaptive first-order methods revisited: Convex optimization without Lipschitz requirements

    We propose a new family of adaptive first-order methods for a class of convex minimization problems that may fail to be Lipschitz continuous or smooth in the standard sense. Specifically, motivated by a recent flurry of activity on non-Lipschitz (NoLips) optimization, we consider problems that are continuous or smooth relative to a reference Bregman function, as opposed to a global, ambient norm (Euclidean or otherwise). These conditions encompass a wide range of problems with singular objectives, such as Fisher markets, Poisson tomography, D-design, and the like. In this setting, the application of existing order-optimal adaptive methods such as UnixGrad or AcceleGrad is not possible, especially in the presence of randomness and uncertainty. The proposed method, adaptive mirror descent (AdaMir), aims to close this gap by concurrently achieving min-max optimal rates in problems that are relatively continuous or smooth, including stochastic ones.
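    For background, the following is a minimal sketch of a mirror descent step taken relative to a Bregman function (negative entropy on the probability simplex, giving the exponentiated-gradient update), which is the kind of non-Euclidean geometry the NoLips setting refers to. It is not the AdaMir algorithm itself, and the 1/sqrt(t) step size is a plain illustrative choice rather than the paper's adaptive rule.

        import numpy as np

        def entropic_mirror_step(x, g, eta):
            # One mirror descent step on the probability simplex with the KL (negative
            # entropy) Bregman divergence: a multiplicative update, then renormalization.
            y = x * np.exp(-eta * g)
            return y / y.sum()

        # Toy usage: minimize the linear loss <c, x> over the simplex.
        c = np.array([3.0, 1.0, 2.0])
        x = np.full(3, 1.0 / 3.0)
        for t in range(1, 201):
            x = entropic_mirror_step(x, c, eta=1.0 / np.sqrt(t))
        print(x)  # mass concentrates on the coordinate with the smallest cost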

    Distributed Extra-gradient with Optimal Complexity and Communication Guarantees

    We consider monotone variational inequality (VI) problems in multi-GPU settings where multiple processors/workers/clients have access to local stochastic dual vectors. This setting covers a broad range of important problems, from distributed convex minimization to min-max optimization and games. Extra-gradient, the de facto algorithm for monotone VI problems, was not designed to be communication-efficient. To this end, we propose quantized generalized extra-gradient (Q-GenX), an unbiased and adaptive compression method tailored to solving VIs. We provide an adaptive step-size rule that adapts to the noise profile at hand, achieving a fast rate of $\mathcal{O}(1/T)$ under relative noise and an order-optimal rate of $\mathcal{O}(1/\sqrt{T})$ under absolute noise, and we show that distributed training accelerates convergence. Finally, we validate our theoretical results with real-world experiments, training generative adversarial networks on multiple GPUs. Comment: International Conference on Learning Representations (ICLR 2023).
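    As a point of reference, below is a minimal sketch of the classic single-worker, uncompressed extra-gradient update for a monotone operator F, applied to a toy bilinear min-max game. Q-GenX builds quantized communication and an adaptive step size on top of this template, neither of which is shown here; the step size and iteration count are illustrative assumptions.

        import numpy as np

        def extragradient(F, z0, eta=0.1, T=2000):
            # Classic extra-gradient: an extrapolation (look-ahead) step, followed by an
            # update that uses the operator evaluated at the look-ahead point.
            z = np.asarray(z0, dtype=float)
            for _ in range(T):
                z_half = z - eta * F(z)
                z = z - eta * F(z_half)
            return z

        # Toy usage: the bilinear game min_x max_y x*y, whose VI operator is F(x, y) = (y, -x).
        # Plain gradient descent-ascent cycles on this problem; extra-gradient converges.
        F = lambda z: np.array([z[1], -z[0]])
        print(extragradient(F, z0=np.array([1.0, 1.0])))  # approaches the equilibrium (0, 0)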

    Advancing the lower bounds: An accelerated, stochastic, second-order method with optimal adaptation to inexactness

    We present a new accelerated stochastic second-order method that is robust to both gradient and Hessian inexactness, as typically arises in machine learning. We establish theoretical lower bounds and prove that our algorithm achieves optimal convergence under both gradient and Hessian inexactness in this key setting. We further introduce a tensor generalization for stochastic higher-order derivatives. When the oracles are non-stochastic, the proposed tensor algorithm matches the global convergence of the Nesterov Accelerated Tensor method. Both algorithms allow for approximate solutions of their auxiliary subproblems, with verifiable conditions on the accuracy of the solution.
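    For illustration only, here is a minimal sketch of one cubic-regularized Newton step with inexact (noisy) gradient and Hessian oracles, where the auxiliary cubic subproblem is solved only approximately by a short gradient loop. This is a generic stand-in for the ingredients the abstract mentions, not the paper's accelerated method; the regularization constant M, the noise model, and the step sizes are assumptions.

        import numpy as np

        def cubic_newton_step(g, H, M=1.0, inner_iters=500, lr=0.02):
            # Approximately minimize the cubic model m(s) = <g, s> + 0.5 s'Hs + (M/6)||s||^3,
            # i.e. the auxiliary subproblem is only solved to limited accuracy.
            s = np.zeros_like(g)
            for _ in range(inner_iters):
                model_grad = g + H @ s + 0.5 * M * np.linalg.norm(s) * s
                s = s - lr * model_grad
            return s

        # Toy usage: a quadratic objective with Gaussian perturbations standing in for
        # gradient and Hessian inexactness (e.g. from subsampling).
        rng = np.random.default_rng(0)
        A = np.diag([1.0, 2.0, 3.0])
        x = np.ones(3)
        for _ in range(20):
            g = A @ x + 0.01 * rng.standard_normal(3)       # inexact gradient
            H = A + 0.01 * rng.standard_normal((3, 3))      # inexact Hessian
            x = x + cubic_newton_step(g, (H + H.T) / 2)
        print(x)  # ends up near the minimizer at the origin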