    Dual Averaging Method for Online Graph-structured Sparsity

    Online learning algorithms update a model using one sample per iteration; they are therefore efficient for processing large-scale datasets and useful for detecting malicious events, such as disease outbreaks and traffic congestion, on the fly. However, existing algorithms for graph-structured models focus on the offline setting and the least-squares loss, making them unsuitable for the online setting, while methods designed for the online setting cannot be applied directly to complex (usually non-convex) graph-structured sparsity models. To address these limitations, in this paper we propose a new algorithm for graph-structured sparsity-constrained problems in the online setting, which we call \textsc{GraphDA}. The key step in \textsc{GraphDA} is to project both the averaged gradient (in the dual space) and the primal variables (in the primal space) onto lower-dimensional subspaces, thereby capturing graph-structured sparsity effectively. Furthermore, the objective functions are only assumed to be convex, so the method can handle different losses in online learning settings. To the best of our knowledge, \textsc{GraphDA} is the first online learning algorithm for graph-structure-constrained optimization problems. To validate our method, we conduct extensive experiments on both benchmark and real-world graph datasets. Our experimental results show that, compared to baseline methods, \textsc{GraphDA} not only improves classification performance but also captures graph-structured features more effectively, yielding stronger interpretability.
    Comment: 11 pages, 14 figures
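
    Stripped of the graph machinery, the dual-averaging skeleton the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's method: the top-k hard threshold stands in for the paper's graph-structured projections, and `grad_fn`, `gamma`, and the other names are assumptions made for the example.

```python
import numpy as np

def top_k_project(v, k):
    """Keep the k largest-magnitude entries, zero the rest.

    A stand-in for graph-structured projections, which would
    instead restrict the support to connected subgraphs."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def graph_da_sketch(grad_fn, dim, T, k, gamma=1.0):
    """Dual-averaging loop with projections in both spaces.

    grad_fn(w, t) returns a stochastic gradient at step t
    (an assumed interface for this sketch)."""
    w = np.zeros(dim)
    g_bar = np.zeros(dim)
    for t in range(1, T + 1):
        g_bar += (grad_fn(w, t) - g_bar) / t  # running gradient average
        g_proj = top_k_project(g_bar, k)      # projection in the dual space
        w = -np.sqrt(t) / gamma * g_proj      # closed-form dual-averaging step
        w = top_k_project(w, k)               # projection in the primal space
    return w
```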

    Risk Minimization, Regret Minimization and Progressive Hedging Algorithms

    This paper begins with a study of the dual representations of risk and regret measures and their impact on modeling multistage decision making under uncertainty. A relationship between risk envelopes and regret envelopes is established using Lagrangian duality theory. This relationship opens the door to a decomposition scheme, called progressive hedging, for solving multistage risk minimization and regret minimization problems. In particular, the classical progressive hedging algorithm is modified to handle a new class of linkage constraints that arises from reformulations and other applications of risk and regret minimization problems. Numerical results are provided to show the efficiency of the progressive hedging algorithms.
    Comment: 21 pages, 2 figures
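
    As a concrete reference point, the classical progressive hedging iteration (the starting point the paper modifies) can be sketched as follows for a scenario-based problem. The `solve_sub` interface and all names are assumptions for illustration; the paper's modified variant with linkage constraints is not reproduced here.

```python
import numpy as np

def progressive_hedging(scenarios, probs, solve_sub, dim, rho=1.0, iters=50):
    """Classical progressive hedging sketch.

    solve_sub(s, w_bar, mult, rho) is assumed to minimize, for
    scenario s,
        f_s(x) + mult @ x + (rho / 2) * ||x - w_bar||^2
    and return the scenario solution x_s."""
    n = len(scenarios)
    x = np.zeros((n, dim))
    mult = np.zeros((n, dim))      # per-scenario multipliers
    w_bar = np.zeros(dim)
    for _ in range(iters):
        for i, s in enumerate(scenarios):
            x[i] = solve_sub(s, w_bar, mult[i], rho)
        w_bar = probs @ x          # enforce non-anticipativity by averaging
        mult += rho * (x - w_bar)  # multiplier update
    return w_bar
```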

    Oracle-Based Robust Optimization via Online Learning

    Robust optimization is a common framework in optimization under uncertainty when the problem parameters are not known exactly, but are known to belong to some given uncertainty set. In the robust optimization framework, the problem solved is a min-max problem where a solution is judged according to its performance on the worst possible realization of the parameters. In many cases, a straightforward solution of a robust optimization problem of a certain type requires solving an optimization problem of a more complicated type, which may even be NP-hard. For example, solving a robust conic quadratic program, such as those arising in robust SVM with ellipsoidal uncertainty, leads in general to a semidefinite program. In this paper we develop a method for approximately solving a robust optimization problem using tools from online convex optimization, where in every stage a standard (non-robust) optimization program is solved. Our algorithms find an approximate robust solution using a number of calls to an oracle that solves the original (non-robust) problem that is inversely proportional to the square of the target accuracy.
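
    A minimal sketch of the oracle-based scheme the abstract outlines: an online learner plays over the uncertainty set (projected gradient ascent is used here as one simple choice of no-regret learner), and each round calls a non-robust oracle on the current parameter realization. The `oracle`, `grad_u`, and `proj_U` signatures are assumptions for illustration.

```python
import numpy as np

def oracle_robust_sketch(oracle, grad_u, proj_U, u0, T, eta=0.1):
    """Approximate robust optimization via repeated oracle calls.

    oracle(u)    -> solution x of the nominal problem at parameters u
    grad_u(x, u) -> gradient of the loss with respect to u
    proj_U(u)    -> Euclidean projection onto the uncertainty set
    (all assumed interfaces for this sketch)"""
    u = u0.copy()
    xs = []
    for _ in range(T):
        x = oracle(u)                       # solve the non-robust problem
        xs.append(x)
        u = proj_U(u + eta * grad_u(x, u))  # adversary ascends the loss
    return np.mean(xs, axis=0)              # averaged solution is approximately robust
```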

    A Unified View of Large-scale Zero-sum Equilibrium Computation

    The task of computing approximate Nash equilibria in large zero-sum extensive-form games has received a tremendous amount of attention, due mainly to the Annual Computer Poker Competition. Immediately after its inception, two competing and seemingly different approaches emerged---one an application of no-regret online learning, the other a sophisticated gradient method applied to a convex-concave saddle-point formulation. Since then, both approaches have grown in relative isolation, with advancements on one side not affecting the other. In this paper, we rectify this by dissecting and, in a sense, unifying the two views.
    Comment: AAAI Workshop on Computer Poker and Imperfect Information
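
    The no-regret side of the unified view is easy to illustrate on a toy zero-sum matrix game: regret-matching self-play drives the average strategies toward a Nash equilibrium. This is a minimal normal-form sketch, not the extensive-form solvers discussed in the paper.

```python
import numpy as np

def regret_matching_selfplay(A, T=10000):
    """Self-play with regret matching on a zero-sum matrix game A,
    where the row player maximizes x @ A @ y. The time-averaged
    strategies converge to an approximate Nash equilibrium."""
    m, n = A.shape
    reg_x, reg_y = np.zeros(m), np.zeros(n)
    sum_x, sum_y = np.zeros(m), np.zeros(n)
    for _ in range(T):
        x = np.maximum(reg_x, 0)
        x = x / x.sum() if x.sum() > 0 else np.full(m, 1 / m)
        y = np.maximum(reg_y, 0)
        y = y / y.sum() if y.sum() > 0 else np.full(n, 1 / n)
        sum_x += x
        sum_y += y
        u_x, u_y = A @ y, -A.T @ x  # per-action payoffs for each player
        reg_x += u_x - x @ u_x      # accumulate instantaneous regrets
        reg_y += u_y - y @ u_y
    return sum_x / T, sum_y / T     # average strategies
```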

    Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent

    First-order methods play a central role in large-scale machine learning. Even though many variations exist, each suited to a particular problem, almost all such methods fundamentally rely on two types of algorithmic steps: gradient descent, which yields primal progress, and mirror descent, which yields dual progress. We observe that the performances of gradient and mirror descent are complementary, so that faster algorithms can be designed by linearly coupling the two. We show how to reconstruct Nesterov's accelerated gradient methods using linear coupling, which gives a cleaner interpretation than Nesterov's original proofs. We also discuss the power of linear coupling by extending it to many other settings to which Nesterov's methods do not apply.
    Comment: A new section added; polished writing
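
    A minimal sketch of the coupling idea for smooth convex minimization with a Euclidean mirror map: each iteration takes one gradient-descent step and one mirror-descent step from a linearly coupled point. The step sizes are held fixed here for simplicity; the paper's accelerated rates come from choosing them per iteration.

```python
import numpy as np

def linear_coupling(grad, x0, L, T, tau=0.5, alpha=None):
    """Linearly coupled gradient and mirror descent (Euclidean case).

    grad(x) returns the gradient of an L-smooth convex objective
    (an assumed interface). Fixed tau/alpha keep the sketch simple;
    one consistent choice is alpha = 1 / (tau * L)."""
    if alpha is None:
        alpha = 1.0 / (tau * L)
    y = x0.copy()                  # gradient-descent sequence (primal progress)
    z = x0.copy()                  # mirror-descent sequence (dual progress)
    for _ in range(T):
        x = tau * z + (1 - tau) * y  # couple the two sequences linearly
        g = grad(x)
        y = x - g / L                # gradient-descent step from x
        z = z - alpha * g            # mirror-descent step with the same gradient
    return y
```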

    A duality-based approach for distributed min-max optimization with application to demand side management

    In this paper we consider a distributed optimization scenario in which a set of processors aims at minimizing the maximum of a collection of "separable convex functions" subject to local constraints. This set-up is motivated by peak-demand minimization problems in smart grids. Here, the goal is to minimize the peak value over a finite horizon with: (i) the demand at each time instant being the sum of contributions from different devices, and (ii) the local states at different time instants being coupled through local dynamics. The min-max structure and the double coupling (through the devices and over the time horizon) make this problem challenging in a distributed set-up (e.g., well-known distributed dual decomposition approaches cannot be applied). We propose a distributed algorithm based on the combination of duality methods and properties from min-max optimization. Specifically, we derive a series of equivalent problems by introducing ad-hoc slack variables and by going back and forth between primal and dual formulations. To the resulting problem we apply a dual subgradient method, which turns out to be a distributed algorithm. We prove the correctness of the proposed algorithm and show its effectiveness via numerical computations.
    Comment: arXiv admin note: substantial text overlap with arXiv:1611.0916
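
    A minimal sketch of the dual subgradient idea behind such an algorithm, for the prototypical problem min_x max_t sum_i f_{i,t}(x_i): after an epigraph reformulation, the dual variable lives on the simplex over time steps, each agent solves a local weighted problem independently, and only the multiplier update couples them. The `local_solve` interface is an assumption for illustration.

```python
import numpy as np

def simplex_project(v):
    """Euclidean projection onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0)

def dual_subgradient_minmax(local_solve, n_agents, horizon,
                            iters=200, step=0.1):
    """Dual subgradient sketch for min_x max_t sum_i f_{i,t}(x_i).

    local_solve(i, lam) is assumed to compute
        argmin_{x_i} sum_t lam[t] * f_{i,t}(x_i)
    and return the vector of per-time costs f_{i,t}(x_i); agents
    keep their local solutions. Only the multiplier update lam
    requires coordination."""
    lam = np.full(horizon, 1.0 / horizon)
    for _ in range(iters):
        costs = sum(local_solve(i, lam) for i in range(n_agents))
        lam = simplex_project(lam + step * costs)  # ascend the dual function
    return lam
```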