
    Large-Scale Convex Optimization via Saddle Point Computation

    This article proposes solving large-scale convex optimization problems via saddle points of the standard Lagrangian. A recent approach to saddle point computation is specialized, by way of a specific perturbation technique and scaling method, to convex optimization problems with differentiable objective and constraint functions. In each iteration, the update directions for the primal and dual variables are determined by gradients of the Lagrangian. These gradients are evaluated at perturbed points, which are generated from the current points via auxiliary mappings. The resulting algorithm is well suited to massively parallel computing, and sparsity can be exploited efficiently. Using simulated parallel computation, an experimental code embedded in GAMS is tested on two sets of nonlinear problems. The first set arises from multi-stage stochastic optimization of the US energy economy; the second consists of multi-currency bond portfolio problems. In such stochastic optimization problems the serial time appears approximately proportional to the number of scenarios, while the parallel time seems independent of it. Thus, compared with MINOS, the serial time of our approach grows more slowly with problem size. Consequently, for large problems with reasonable precision requirements, our method appears faster than MINOS even on a serial computer.
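
    The paper's specific perturbation and scaling are not reproduced here; the following is a minimal, generic sketch of primal-dual gradient iterations on the standard Lagrangian L(x, lam) = f(x) + lam @ g(x), with a hypothetical `perturb` placeholder standing in for the auxiliary mappings mentioned in the abstract.

```python
# Generic primal-dual gradient sketch on L(x, lam) = f(x) + lam @ g(x).
# `perturb` is a hypothetical placeholder for the auxiliary mapping that
# generates the points at which the Lagrangian gradients are evaluated.
import numpy as np

def saddle_point_iterations(grad_f, g, jac_g, x0, lam0, step=1e-2, iters=1000,
                            perturb=lambda x, lam: (x, lam)):
    x, lam = np.asarray(x0, float), np.asarray(lam0, float)
    for _ in range(iters):
        xp, lp = perturb(x, lam)                      # auxiliary (perturbed) points
        grad_x = grad_f(xp) + jac_g(xp).T @ lp        # d/dx L at the perturbed point
        grad_lam = g(xp)                              # d/dlam L at the perturbed point
        x = x - step * grad_x                         # primal descent step
        lam = np.maximum(0.0, lam + step * grad_lam)  # dual ascent, keeping lam >= 0
    return x, lam
```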

    A Duality-Based Approach for Distributed Optimization with Coupling Constraints

    In this paper we consider a distributed optimization scenario in which a set of agents has to solve a convex optimization problem with a separable cost function, local constraint sets, and a coupling inequality constraint. We propose a novel distributed algorithm based on a relaxation of the primal problem and an elegant exploration of duality theory. Despite its complex derivation, which involves several duality steps, the distributed algorithm has a very simple and intuitive structure: each node solves a local version of the relaxed problem and updates suitable dual variables. We prove the correctness of the algorithm and show its effectiveness via numerical computations.
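
    This is not the algorithm proposed in the paper; it is a classic dual-decomposition sketch for min sum_i f_i(x_i) subject to x_i in X_i and sum_i g_i(x_i) <= 0, included only to illustrate the coupled structure being targeted. `local_solve(i, lam)` is a hypothetical oracle returning argmin over X_i of f_i(x_i) + lam @ g_i(x_i).

```python
# Classic dual decomposition for a separable cost with one coupling constraint.
import numpy as np

def dual_decomposition(local_solve, g_list, n_agents, m, step=0.1, iters=200):
    lam = np.zeros(m)                    # multiplier for the coupling constraint
    for _ in range(iters):
        # each agent's subproblem depends only on its own data and lam,
        # so this loop could run in parallel across agents
        xs = [local_solve(i, lam) for i in range(n_agents)]
        coupling = sum(g_list[i](xs[i]) for i in range(n_agents))
        lam = np.maximum(0.0, lam + step * coupling)   # projected dual ascent
    return xs, lam
```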

    Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games

    Regret minimization is a powerful tool for solving large-scale extensive-form games. State-of-the-art methods rely on minimizing regret locally at each decision point. In this work we derive a new framework for regret minimization on sequential decision problems and extensive-form games with general compact convex sets at each decision point and general convex losses, as opposed to prior work, which has been limited to simplex decision points and linear losses. We call our framework laminar regret decomposition. It generalizes the CFR algorithm to this more general setting. Furthermore, our framework enables a new proof of CFR even in the known setting, derived from the perspective of decomposing polytope regret, which leads to an arguably simpler interpretation of the algorithm. Our generalization to compact convex sets and convex losses allows us to develop new algorithms for several problems: regularized sequential decision making, regularized Nash equilibria in extensive-form games, and computing approximate extensive-form perfect equilibria. It also leads to the first regret-minimization algorithm for computing reduced-normal-form quantal response equilibria based on minimizing local regrets. Experiments show that our framework yields algorithms that scale at a rate comparable to the fastest variants of counterfactual regret minimization for computing Nash equilibrium, and therefore our approach provides the first algorithm for computing quantal response equilibria in extremely large games. Finally, we show that our framework enables a new kind of scalable opponent-exploitation approach.
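
    The framework above handles general convex sets and losses; the sketch below shows only the classical special case it generalizes, namely regret matching as a local regret minimizer over a probability simplex at a single decision point.

```python
# Regret matching over a simplex: the baseline local regret minimizer
# used in standard CFR, which the laminar framework generalizes.
import numpy as np

class RegretMatching:
    def __init__(self, n_actions):
        self.cum_regret = np.zeros(n_actions)

    def strategy(self):
        # play proportionally to positive cumulative regret, uniform if none
        pos = np.maximum(self.cum_regret, 0.0)
        total = pos.sum()
        return pos / total if total > 0 else np.full(len(pos), 1.0 / len(pos))

    def observe(self, utilities):
        # accumulate regret of each pure action against the current strategy
        sigma = self.strategy()
        self.cum_regret += utilities - sigma @ utilities
```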

    Kernel Exponential Family Estimation via Doubly Dual Embedding

    We investigate penalized maximum log-likelihood estimation for exponential family distributions whose natural parameter resides in a reproducing kernel Hilbert space. Key to our approach is a novel technique, doubly dual embedding, that avoids computation of the partition function. This technique also allows the development of a flexible sampling strategy that amortizes the cost of Monte Carlo sampling in the inference stage. The resulting estimator can be easily generalized to kernel conditional exponential families. We establish a connection between kernel exponential family estimation and MMD-GANs, revealing a new perspective for understanding GANs. Compared to score-matching-based estimators, the proposed method improves both memory and time efficiency while enjoying stronger statistical properties, such as fully capturing smoothness in its statistical convergence rate, whereas the score matching estimator appears to saturate. Finally, we show that the proposed estimator empirically outperforms state-of-the-art estimators.
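
    For intuition only, here is a minimal sketch of the modeling object (not the doubly-dual-embedding estimator itself): an exponential family whose natural parameter f(x) = sum_j alpha_j k(z_j, x) lives in an RKHS, so that log p(x) = f(x) - log Z(f), with the partition function Z being the intractable term the paper's technique avoids.

```python
# Unnormalized kernel exponential family log-density with a Gaussian RBF kernel.
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    # pairwise Gaussian kernel between rows of a (n_a, d) and b (n_b, d)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def unnormalized_log_density(x, centers, alpha, bandwidth=1.0):
    # f(x) = sum_j alpha_j k(z_j, x); normalizing it would require
    # log Z = log \int exp(f(x)) dx, which is what the paper avoids computing
    return rbf_kernel(np.atleast_2d(x), centers, bandwidth) @ alpha
```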

    Semi-proximal Mirror-Prox for Nonsmooth Composite Minimization

    We propose a new first-order optimization algorithm to solve high-dimensional non-smooth composite minimization problems. Typical examples of such problems have an objective that decomposes into a non-smooth empirical risk part and a non-smooth regularization penalty. The proposed algorithm, called Semi-Proximal Mirror-Prox, leverages a Fenchel-type representation of one part of the objective while handling the other part via linear minimization over the domain. The algorithm stands in contrast with more classical proximal gradient algorithms with smoothing, which require the computation of proximal operators at each iteration and can therefore be impractical for high-dimensional problems. We establish the theoretical convergence rate of Semi-Proximal Mirror-Prox, which exhibits the optimal complexity bound of $O(1/\epsilon^2)$ calls to the linear minimization oracle. We present promising experimental results showing the merits of the approach in comparison with competing methods.
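
    The sketch below is not Semi-Proximal Mirror-Prox itself; it is a standard conditional-gradient (Frank-Wolfe) loop, shown only to illustrate the linear minimization oracle (LMO) primitive that the method relies on in place of proximal operators. `lmo(grad)` is assumed to return argmin over the domain of the inner product with `grad`.

```python
# Conditional gradient: each iteration costs one call to a linear minimization oracle.
import numpy as np

def frank_wolfe(grad_f, lmo, x0, iters=200):
    x = np.asarray(x0, float)
    for t in range(iters):
        s = lmo(grad_f(x))           # one LMO call, no proximal operator needed
        gamma = 2.0 / (t + 2.0)      # standard step-size schedule
        x = (1 - gamma) * x + gamma * s
    return x

def l1_ball_lmo(grad, r=1.0):
    # LMO over the l1 ball of radius r: the minimizer is a signed vertex
    i = int(np.argmax(np.abs(grad)))
    s = np.zeros_like(grad)
    s[i] = -r * np.sign(grad[i])
    return s
```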

    Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

    A central challenge in many fields of science and engineering involves minimizing non-convex error functions over continuous, high-dimensional spaces. Gradient descent or quasi-Newton methods are almost ubiquitously used to perform such minimizations, and it is often thought that a main source of difficulty for these local methods in finding the global minimum is the proliferation of local minima with much higher error than the global minimum. Here we argue, based on results from statistical physics, random matrix theory, neural network theory, and empirical evidence, that a deeper and more profound difficulty originates from the proliferation of saddle points, not local minima, especially in high-dimensional problems of practical interest. Such saddle points are surrounded by high-error plateaus that can dramatically slow down learning and give the illusory impression of the existence of a local minimum. Motivated by these arguments, we propose a new approach to second-order optimization, the saddle-free Newton method, that can rapidly escape high-dimensional saddle points, unlike gradient descent and quasi-Newton methods. We apply this algorithm to deep or recurrent neural network training and provide numerical evidence for its superior optimization performance. (The theoretical review and analysis in this article draw heavily from arXiv:1405.4604 [cs.LG].)
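
    A minimal sketch of the core saddle-free Newton idea on a small problem: precondition the gradient by |H|^{-1}, where |H| replaces each Hessian eigenvalue by its absolute value, so the step moves away from saddle points instead of being attracted to them. The paper works with a low-rank Krylov approximation; here an exact eigendecomposition is used for clarity, with a damping term `eps` as an assumption.

```python
# Saddle-free Newton step: rescale the gradient by the inverse of |H|.
import numpy as np

def saddle_free_newton_step(grad, hess, eps=1e-3):
    eigvals, eigvecs = np.linalg.eigh(hess)        # symmetric eigendecomposition
    abs_vals = np.maximum(np.abs(eigvals), eps)    # |lambda_i|, damped away from zero
    # step = -|H|^{-1} grad = -V diag(1/|lambda|) V^T grad
    return -eigvecs @ ((eigvecs.T @ grad) / abs_vals)
```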

    Optimization Methods for Inverse Problems

    Optimization plays an important role in solving many inverse problems. Indeed, the task of inversion often either involves or is fully cast as the solution of an optimization problem. In this light, the non-linear, non-convex, and large-scale nature of many of these inversions gives rise to some very challenging optimization problems. The inverse problem community has long been developing various techniques for solving such optimization tasks. However, other, seemingly disjoint communities, such as that of machine learning, have developed, almost in parallel, interesting alternative methods that might have stayed under the radar of the inverse problem community. In this survey, we aim to change that. In doing so, we first discuss current state-of-the-art optimization methods widely used in inverse problems. We then survey recent related advances in addressing similar challenges in problems faced by the machine learning community, and discuss their potential advantages for solving inverse problems. By highlighting the similarities among the optimization challenges faced by the inverse problem and machine learning communities, we hope that this survey can serve as a bridge between these two communities and encourage cross-fertilization of ideas.
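
    A generic toy example of casting inversion as optimization (not tied to any specific method surveyed): recover x from noisy measurements b = A x + noise by minimizing the Tikhonov-regularized objective (1/2)||A x - b||^2 + (lam/2)||x||^2 with plain gradient descent.

```python
# Tikhonov-regularized least-squares inversion solved by gradient descent.
import numpy as np

def tikhonov_gd(A, b, lam=1e-2, step=None, iters=500):
    x = np.zeros(A.shape[1])
    if step is None:
        step = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)   # 1/L for this quadratic
    for _ in range(iters):
        grad = A.T @ (A @ x - b) + lam * x               # gradient of the objective
        x = x - step * grad
    return x
```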