
    A Stochastic Interpretation of Stochastic Mirror Descent: Risk-Sensitive Optimality

    Stochastic mirror descent (SMD) is a fairly new family of algorithms that has recently found a wide range of applications in optimization, machine learning, and control. It can be considered a generalization of the classical stochastic gradient algorithm (SGD), where instead of updating the weight vector along the negative direction of the stochastic gradient, the update is performed in a "mirror domain" defined by the gradient of a (strictly convex) potential function. This potential function, and the mirror domain it yields, provides considerable flexibility in the algorithm compared to SGD. While many properties of SMD have already been obtained in the literature, in this paper we exhibit a new interpretation of SMD, namely that it is a risk-sensitive optimal estimator when the unknown weight vector and additive noise are non-Gaussian and belong to the exponential family of distributions. The analysis also suggests a modified version of SMD, which we refer to as symmetric SMD (SSMD). The proofs rely on some simple properties of Bregman divergence, which allow us to extend results from quadratics and Gaussians to certain convex functions and exponential families in a rather seamless way.
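    To make the update rule concrete, the following is a minimal Python sketch of one SMD iteration, using the negative-entropy potential on the probability simplex (for which the update reduces to exponentiated gradient). The regression setup, step size, and function names are illustrative choices, not taken from the paper.

```python
import numpy as np

def smd_step(w, grad, eta, mirror, mirror_inv):
    """One stochastic mirror descent step: move in the mirror domain
    defined by the potential's gradient, then map back."""
    return mirror_inv(mirror(w) - eta * grad)

# Negative-entropy potential psi(w) = sum_i w_i log w_i on the simplex:
# grad psi(w) = log(w) + 1, and mapping back with renormalization gives
# the exponentiated-gradient update (the additive constant cancels).
mirror = lambda w: np.log(w)
mirror_inv = lambda z: np.exp(z) / np.exp(z).sum()

# Toy linear regression whose unknown weight vector lies on the simplex.
rng = np.random.default_rng(0)
w_true = np.array([0.5, 0.2, 0.1, 0.1, 0.1])
w = np.full(5, 0.2)                          # start at the uniform distribution
for _ in range(2000):
    x = rng.normal(size=5)                   # random regressor
    y = x @ w_true + 0.01 * rng.normal()     # noisy observation
    g = (x @ w - y) * x                      # stochastic gradient of 0.5*(x'w - y)^2
    w = smd_step(w, g, eta=0.05, mirror=mirror, mirror_inv=mirror_inv)
print(np.round(w, 3))                        # should end up close to w_true
```

    Other potentials only change `mirror` and `mirror_inv`: the squared Euclidean norm recovers plain SGD, while squared p-norms give the p-norm variants of mirror descent.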

    Design of First-Order Optimization Algorithms via Sum-of-Squares Programming

    In this paper, we propose a framework based on sum-of-squares programming to design iterative first-order optimization algorithms for smooth and strongly convex problems. Our starting point is to develop a polynomial matrix inequality as a sufficient condition for exponential convergence of the algorithm. The entries of this matrix are polynomial functions of the unknown parameters (exponential decay rate, stepsize, momentum coefficient, etc.). We then formulate a polynomial optimization problem in which the objective is to optimize the exponential decay rate over the parameters of the algorithm. Finally, we use sum-of-squares programming as a tractable relaxation of the proposed polynomial optimization problem. We illustrate the utility of the proposed framework by designing a first-order algorithm that shares the same structure as Nesterov's accelerated gradient method.
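    As a concrete illustration of the kind of matrix-inequality certificate such frameworks start from, the sketch below checks exponential convergence of plain gradient descent on m-strongly convex, L-smooth problems by testing feasibility of a small linear matrix inequality (a Lyapunov condition combined with the S-procedure) and bisecting on the decay rate. It uses cvxpy and the SCS solver; this is an analysis example under stated assumptions, not the paper's sum-of-squares design procedure, where the stepsize and momentum coefficient are themselves unknowns and the entries become polynomial in them.

```python
import numpy as np
import cvxpy as cp

def rate_certified(rho, alpha, m, L):
    """Feasibility of a 2x2 LMI certifying ||x_k - x*|| <= c * rho^k for
    gradient descent x_{k+1} = x_k - alpha * grad f(x_k), with f
    m-strongly convex and L-smooth (a scalar state suffices here)."""
    P = cp.Variable(nonneg=True)     # Lyapunov weight V(x) = P * (x - x*)^2
    lam = cp.Variable(nonneg=True)   # S-procedure multiplier
    A, B = 1.0, -alpha
    # Quadratic constraint satisfied by the gradient of any m-strongly
    # convex, L-smooth function (combined sector / co-coercivity bound).
    N = np.array([[-2.0 * m * L, m + L],
                  [m + L, -2.0]])
    M = cp.bmat([[A * P * A - rho**2 * P, A * P * B],
                 [B * P * A,              B * P * B]]) + lam * N
    prob = cp.Problem(cp.Minimize(0),
                      [M << -1e-8 * np.eye(2), P >= 1e-8])
    prob.solve(solver=cp.SCS)
    return prob.status in ("optimal", "optimal_inaccurate")

# Bisect on the decay rate for the standard step size alpha = 2/(m+L).
m, L = 1.0, 10.0
alpha = 2.0 / (m + L)
lo, hi = 0.0, 1.0
for _ in range(30):
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if rate_certified(mid, alpha, m, L) else (mid, hi)
print("certified rate ~", hi)   # theory predicts (L - m)/(L + m) = 9/11
```

    Feasibility is monotone in the rate (a larger rho only relaxes the inequality), so bisection recovers the smallest certifiable decay rate for the chosen stepsize.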

    An SDP Approach For Solving Quadratic Fractional Programming Problems

    This paper considers a fractional programming problem (P) which minimizes a ratio of quadratic functions subject to a two-sided quadratic constraint. As is well known, the fractional objective can be replaced by a parametric family of quadratic functions, which makes (P) highly related to, but more difficult than, a single quadratic programming problem subject to a similar constraint set. The task is to find the optimal parameter λ* and then look for the optimal solution when λ* is attained. In contrast to the classical Dinkelbach method, which iterates over the parameter, we propose a suitable constraint qualification under which a new version of the S-lemma with an equality can be proved, so that λ* can be computed directly via an exact SDP relaxation. When the constraint set of (P) degenerates to a one-sided inequality, the same SDP approach can be applied to solve (P) without any condition. We observe that the difference between a two-sided problem and a one-sided problem lies in the fact that the S-lemma with an equality does not have a natural Slater point, which makes the former essentially more difficult than the latter. Nor does this work assume the existence of a positive-definite linear combination of the quadratic terms (also known as the dual Slater condition, or a positive-definite matrix pencil); our result thus provides a novel extension to the so-called "hard case" of the generalized trust region subproblem subject to the upper and lower level sets of a quadratic function.
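    To make the parametric reduction concrete, here is a small Python/cvxpy sketch of the classical Dinkelbach iteration that the abstract contrasts with, where each parametric subproblem min_x q1(x) - λ q2(x) subject to a single quadratic inequality is handled through the standard Shor SDP relaxation (tight in the one-sided case, as the abstract notes). The helper names and the rank-one extraction are illustrative assumptions; the paper's own contribution is to compute λ* directly from one SDP rather than iterating.

```python
import numpy as np
import cvxpy as cp

def hom(A, b, c):
    """Homogenize q(x) = x'Ax + 2b'x + c so that q(x) = [x;1]' M [x;1]."""
    return np.block([[A, b[:, None]], [b[None, :], np.array([[c]])]])

def parametric_sdp(lam, M1, M2, M3, n):
    """Shor relaxation of min_x q1(x) - lam*q2(x) s.t. q3(x) <= 0,
    followed by a rank-one extraction (valid when the relaxation is tight)."""
    Y = cp.Variable((n + 1, n + 1), symmetric=True)
    cons = [Y >> 0, Y[n, n] == 1, cp.trace(M3 @ Y) <= 0]
    cp.Problem(cp.Minimize(cp.trace((M1 - lam * M2) @ Y)), cons).solve(solver=cp.SCS)
    w, V = np.linalg.eigh(Y.value)   # eigenvalues in ascending order
    v = V[:, -1]                     # leading eigenvector
    return v[:n] / v[n]              # rescale so the last coordinate is 1

def dinkelbach(M1, M2, M3, n, lam=0.0, tol=1e-8, iters=100):
    """Iterate lam_{k+1} = q1(x_k)/q2(x_k), where x_k minimizes the
    parametric subproblem; the sequence converges to the optimal ratio."""
    q = lambda M, x: np.append(x, 1.0) @ M @ np.append(x, 1.0)
    for _ in range(iters):
        x = parametric_sdp(lam, M1, M2, M3, n)
        new_lam = q(M1, x) / q(M2, x)
        if abs(new_lam - lam) < tol:
            return new_lam, x
        lam = new_lam
    return lam, x
```

    This sketch assumes the denominator q2 is positive on the feasible set and that each parametric subproblem is bounded below. With a two-sided (equality or lower-bounded) constraint the rank-one extraction is no longer automatically justified, which is exactly the gap the paper's S-lemma with an equality is designed to close.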