164 research outputs found

    Approximate Convex Optimization by Online Game Playing

    Lagrangian relaxation and approximate optimization algorithms have received much attention in the last two decades. Typically, the running time of these methods to obtain an $\epsilon$-approximate solution is proportional to $\frac{1}{\epsilon^2}$. Recently, Bienstock and Iyengar, following Nesterov, gave an algorithm for fractional packing linear programs which runs in $\frac{1}{\epsilon}$ iterations. The latter algorithm requires solving a convex quadratic program in every iteration, an optimization subroutine which dominates the theoretical running time. We give an algorithm for convex programs with strictly convex constraints which runs in time proportional to $\frac{1}{\epsilon}$. The algorithm does not require solving any quadratic program; it uses gradient steps and elementary operations only. Problems which have strictly convex constraints include maximum entropy frequency estimation, portfolio optimization with loss risk constraints, and various computational problems in signal processing. As a side product, we also obtain a simpler version of Bienstock and Iyengar's result for general linear programming, with similar running time. We derive these algorithms using a new framework for deriving convex optimization algorithms from online game playing algorithms, which may be of independent interest.

    Almost Optimal Sublinear Time Algorithm for Semidefinite Programming

    We present an algorithm for approximating semidefinite programs with running time that is sublinear in the number of entries in the semidefinite instance. We also present lower bounds showing that our algorithm has a nearly optimal running time.

    Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets

    The Frank-Wolfe method (a.k.a. conditional gradient algorithm) for smooth optimization has regained much interest in recent years in the context of large scale optimization and machine learning. A key advantage of the method is that it avoids projections, the computational bottleneck in many applications, replacing them with a linear optimization step. Despite this advantage, the known convergence rates of the FW method fall behind standard first order methods for most settings of interest. It is an active line of research to derive faster linear optimization-based algorithms for various settings of convex optimization. In this paper we consider the special case of optimization over strongly convex sets, for which we prove that the vanilla FW method converges at a rate of $\frac{1}{t^2}$. This gives a quadratic improvement in convergence rate compared to the general case, in which convergence is of the order $\frac{1}{t}$, and known to be tight. We show that various balls induced by $\ell_p$ norms, Schatten norms and group norms are strongly convex on one hand and on the other hand, linear optimization over these sets is straightforward and admits a closed-form solution. We further show how several previous fast-rate results for the FW method follow easily from our analysis.
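
The projection-free step described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or analysis: it assumes a toy objective $f(x) = \frac{1}{2}\|x - b\|_2^2$, the Euclidean ball as the strongly convex feasible set, and the textbook step size $\frac{2}{t+2}$. The function names `lmo_l2_ball` and `frank_wolfe` are hypothetical and chosen only for this example.

```python
import math


def lmo_l2_ball(grad, radius):
    """Linear optimization over the l2 ball of the given radius.

    Returns argmin over ||s||_2 <= radius of <grad, s>, which has the
    closed form -radius * grad / ||grad||_2.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        # Every feasible point is optimal when the gradient vanishes.
        return [0.0] * len(grad)
    return [-radius * g / norm for g in grad]


def frank_wolfe(b, radius, num_iters):
    """Vanilla Frank-Wolfe for f(x) = 0.5 * ||x - b||^2 over the l2 ball.

    Assumption: toy objective and the standard 2 / (t + 2) step size,
    used here for illustration only.
    """
    x = [0.0] * len(b)  # the origin is feasible for any radius >= 0
    for t in range(num_iters):
        grad = [xi - bi for xi, bi in zip(x, b)]  # gradient of f at x
        s = lmo_l2_ball(grad, radius)             # linear step, no projection
        gamma = 2.0 / (t + 2.0)
        # A convex combination of feasible points stays feasible.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


if __name__ == "__main__":
    print(frank_wolfe([3.0, 4.0], radius=1.0, num_iters=50))
```

Each iterate is a convex combination of points in the ball, so feasibility is maintained without ever computing a projection; the only per-iteration work beyond the gradient is the closed-form linear step.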

    Universal MMSE Filtering With Logarithmic Adaptive Regret

    We consider the problem of online estimation of a real-valued signal corrupted by oblivious zero-mean noise using linear estimators. The estimator is required to iteratively predict the underlying signal based on the current and several last noisy observations, and its performance is measured by the mean-square-error. We describe and analyze an algorithm for this task which: 1. Achieves logarithmic adaptive regret against the best linear filter in hindsight. This bound is asymptotically tight, and resolves the question of Moon and Weissman [1]. 2. Runs in linear time in terms of the number of filter coefficients. Previous constructions required at least quadratic time. Comment: 14 pages

    Variance-Reduced and Projection-Free Stochastic Optimization

    The Frank-Wolfe optimization algorithm has recently regained popularity for machine learning applications due to its projection-free property and its ability to handle structured constraints. However, in the stochastic learning setting, it is still relatively understudied compared to the gradient descent counterpart. In this work, leveraging a recent variance reduction technique, we propose two stochastic Frank-Wolfe variants which substantially improve previous results in terms of the number of stochastic gradient evaluations needed to achieve $1-\epsilon$ accuracy. For example, we improve from $O(\frac{1}{\epsilon})$ to $O(\ln\frac{1}{\epsilon})$ if the objective function is smooth and strongly convex, and from $O(\frac{1}{\epsilon^2})$ to $O(\frac{1}{\epsilon^{1.5}})$ if the objective function is smooth and Lipschitz. The theoretical improvement is also observed in experiments on real-world datasets for a multiclass classification application.