2,110 research outputs found

    ADMM-SOFTMAX: An ADMM Approach for Multinomial Logistic Regression

    We present ADMM-Softmax, an alternating direction method of multipliers (ADMM) for solving multinomial logistic regression (MLR) problems. Our method is geared toward supervised classification tasks with many examples and features. It decouples the nonlinear optimization problem in MLR into three steps that can be solved efficiently. In particular, each iteration of ADMM-Softmax consists of a linear least-squares problem, a set of independent small-scale smooth, convex problems, and a trivial dual variable update. The solution of the least-squares problem can be accelerated by pre-computing a factorization or preconditioner, and the smooth, convex problems are separable and can therefore be easily parallelized across examples. For two image classification problems, we demonstrate that ADMM-Softmax leads to improved generalization compared to a Newton-Krylov, a quasi-Newton, and a stochastic gradient descent method.
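
    The three steps named in the abstract above can be sketched as a generic ADMM splitting. This is only an illustration under assumptions: the symbols $D$ (feature matrix), $W$ (weights), $Z$ (auxiliary variable), $U$ (scaled dual variable), $\Phi$ (softmax cross-entropy loss), $R$ (a quadratic regularizer), and $\rho$ (penalty parameter) are chosen here and are not taken from the listing.

```latex
\begin{align*}
  &\min_{W,\,Z} \; \Phi(Z) + R(W) \quad \text{subject to} \quad Z = W D, \\
  W^{k+1} &= \arg\min_{W} \; R(W) + \tfrac{\rho}{2}\,\lVert Z^{k} - W D + U^{k} \rVert_F^2
     && \text{(linear least squares if $R$ is quadratic)}, \\
  Z^{k+1} &= \arg\min_{Z} \; \Phi(Z) + \tfrac{\rho}{2}\,\lVert Z - W^{k+1} D + U^{k} \rVert_F^2
     && \text{(separable over the columns of $Z$, one per example)}, \\
  U^{k+1} &= U^{k} + Z^{k+1} - W^{k+1} D
     && \text{(dual update)}.
\end{align*}
```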

    A Memristor-Based Optimization Framework for AI Applications

    Memristors have recently received significant attention as ubiquitous device-level components for building a novel generation of computing systems. These devices have many promising features, such as non-volatility, low power consumption, high density, and excellent scalability. The ability to control and modify biasing voltages at the two terminals of memristors makes them promising candidates for performing matrix-vector multiplications and solving systems of linear equations. In this article, we discuss how networks of memristors arranged in crossbar arrays can be used to efficiently solve optimization and machine learning problems. We introduce a new memristor-based optimization framework that combines the computational merit of memristor crossbars with the advantages of an operator splitting method, the alternating direction method of multipliers (ADMM). Here, ADMM splits a complex optimization problem into subproblems that involve the solution of systems of linear equations. The capability of this framework is shown by applying it to linear programming, quadratic programming, and sparse optimization. In addition to ADMM, we discuss the implementation of a customized power iteration (PI) method for eigenvalue/eigenvector computation using memristor crossbars. The memristor-based PI method can further be applied to principal component analysis (PCA). The use of memristor crossbars yields a significant speed-up in computation and thus, we believe, has the potential to advance optimization and machine learning research in artificial intelligence (AI).
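
    The power iteration mentioned in the abstract above relies only on repeated matrix-vector products, which is the operation a crossbar array would accelerate. The following is a minimal software sketch of the textbook method, not the customized variant from the article; the function names and the default starting vector are assumptions made for illustration.

```python
import math


def mat_vec(matrix, vector):
    """Dense matrix-vector product (the step a crossbar would perform in hardware)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def power_iteration(matrix, num_iters=200):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix.

    Assumes the dominant eigenvalue is unique in magnitude and that the
    starting vector is not orthogonal to its eigenvector.
    """
    n = len(matrix)
    v = [1.0 / (i + 1) for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(num_iters):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # renormalize to unit length
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


eigenvalue, eigenvector = power_iteration([[2.0, 1.0], [1.0, 2.0]])
print(eigenvalue, eigenvector)
```

    For PCA, the same routine would be applied to a covariance matrix, so the returned vector approximates the first principal direction.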

    Distributed Algorithm for Optimal Power Flow on Unbalanced Multiphase Distribution Networks

    The optimal power flow (OPF) problem is fundamental to the control and operation of power distribution networks and underlies many important applications, such as volt/var control and demand response. Large-scale, highly volatile renewable penetration in distribution networks calls for real-time feedback control, and hence for distributed solutions to the OPF problem. Distribution networks are inherently unbalanced, and most existing distributed solutions, which were designed for balanced networks, do not apply. In this paper we propose a solution for unbalanced distribution networks. Our distributed algorithm is based on the alternating direction method of multipliers (ADMM). Unlike existing approaches that require solving semidefinite programming problems in each ADMM macro-iteration, we exploit the problem structure and decompose the OPF problem so that the subproblems in each ADMM macro-iteration reduce to either a closed-form solution or an eigen-decomposition of a 6×6 Hermitian matrix, which significantly reduces the convergence time. We present simulations on IEEE 13-, 34-, 37-, and 123-bus unbalanced distribution networks to illustrate the scalability and optimality of the proposed algorithm.
    Comment: 11 pages

    Optimal parameter selection for the alternating direction method of multipliers (ADMM): quadratic problems

    The alternating direction method of multipliers (ADMM) has emerged as a powerful technique for large-scale structured optimization. Despite many recent results on the convergence properties of ADMM, a quantitative characterization of the impact of the algorithm parameters on the convergence time of the method is still lacking. In this paper we find the optimal algorithm parameters that minimize the convergence factor of the ADMM iterates in the context of ℓ2-regularized minimization and constrained quadratic programming. Numerical examples show that our parameter selection rules significantly outperform existing alternatives in the literature.
    Comment: Submitted to IEEE Transactions on Automatic Control
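
    To make the dependence on the penalty parameter concrete, here is a toy sketch of ADMM on a scalar ℓ2-regularized quadratic. It does not implement the selection rules of the paper; the problem, the function name `admm_scalar_ridge`, the stopping test, and the tried values of `rho` are all assumptions chosen for illustration.

```python
def admm_scalar_ridge(q, b, delta, rho, tol=1e-10, max_iter=10000):
    """Run scaled-form ADMM on a scalar toy problem.

    minimize (q/2) x^2 - b x + (delta/2) z^2   subject to   x = z

    Returns the final x and the number of iterations used.
    """
    x = 0.0
    z = 0.0
    u = 0.0  # scaled dual variable
    for k in range(1, max_iter + 1):
        # x-update: minimize (q/2) x^2 - b x + (rho/2) (x - z + u)^2
        x = (b + rho * (z - u)) / (q + rho)
        # z-update: minimize (delta/2) z^2 + (rho/2) (x - z + u)^2
        z_old = z
        z = rho * (x + u) / (delta + rho)
        # dual update on the constraint residual x - z
        u = u + x - z
        primal_residual = abs(x - z)
        dual_residual = rho * abs(z - z_old)
        if primal_residual < tol and dual_residual < tol:
            return x, k
    return x, max_iter


# The minimizer is b / (q + delta) for every rho > 0; only the speed changes.
for rho in (0.1, 1.0, 10.0):
    solution, iterations = admm_scalar_ridge(q=1.0, b=1.0, delta=1.0, rho=rho)
    print(rho, solution, iterations)
```

    On this toy instance every choice of `rho` reaches the same minimizer, but the iteration count differs noticeably, which is the effect a parameter selection rule aims to control.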

    Near-separable Non-negative Matrix Factorization with ℓ1- and Bregman Loss Functions

    Recently, a family of tractable NMF algorithms has been proposed under the assumption that the data matrix satisfies a separability condition (Donoho & Stodden, 2003; Arora et al., 2012). Geometrically, this condition reformulates the NMF problem as that of finding the extreme rays of the conical hull of a finite set of vectors. In this paper, we develop several extensions of the conical hull procedures of Kumar et al. (2013) for robust (ℓ1) approximations and Bregman divergences. Our methods inherit all the advantages of Kumar et al. (2013), including scalability and noise tolerance. We show that on foreground-background separation problems in computer vision, robust near-separable NMFs match the performance of Robust PCA, considered state of the art on these problems, with an order of magnitude faster training time. We also demonstrate applications in exemplar selection settings.

    Consensus-Based Dantzig-Wolfe Decomposition

    Dantzig-Wolfe decomposition (DWD) is a classical algorithm for solving large-scale linear programs whose constraint matrix involves a set of independent blocks coupled with a set of linking rows. The algorithm decomposes such a model into a master problem and a set of independent subproblems that can be solved in a distributed manner. In a typical implementation, the master problem is solved centrally. In certain settings, solving the master problem centrally is undesirable or infeasible, for example when data are stored in a decentralized way or when the independent agents responsible for the subproblems desire privacy of information. In this paper, we propose a fully distributed DWD algorithm that solves the master problem using a consensus-based alternating direction method of multipliers (ADMM). We derive error bounds on the optimality gap and feasibility violation of the proposed approach. We provide preliminary computational results for our algorithm using a Message Passing Interface (MPI) implementation on cutting stock instances from the literature and on synthetic instances, for which we obtain high-quality solutions.
    Comment: Corrected typos; added references; added a new set of experiments and a new application

    Iterative Grassmannian Optimization for Robust Image Alignment

    Robust high-dimensional data processing has witnessed an exciting development in recent years, as theoretical results have shown that it is possible using convex programming to optimize data fit to a low-rank component plus a sparse outlier component. This problem is also known as Robust PCA, and it has found application in many areas of computer vision. In image and video processing and face recognition, the opportunity to process massive image databases is emerging as people upload photo and video data online in unprecedented volumes. However, data quality and consistency are not controlled in any way, and the massiveness of the data poses a serious computational challenge. In this paper we present t-GRASTA, or "Transformed GRASTA (Grassmannian Robust Adaptive Subspace Tracking Algorithm)". t-GRASTA iteratively performs incremental gradient descent constrained to the Grassmann manifold of subspaces in order to simultaneously estimate a decomposition of a collection of images into a low-rank subspace, a sparse part of occlusions and foreground objects, and a transformation such as rotation or translation of the image. We show that t-GRASTA is 4× faster than state-of-the-art algorithms, has half the memory requirement, and can achieve alignment for face images as well as jittered camera surveillance images.
    Comment: Preprint submitted to the special issue of the Image and Vision Computing Journal on the theme "The Best of Face and Gesture 2013"

    Iteration-complexity analysis of a generalized alternating direction method of multipliers

    This paper analyzes the iteration-complexity of a generalized alternating direction method of multipliers (G-ADMM) for solving linearly constrained convex problems. This ADMM variant, which was first proposed by Bertsekas and Eckstein, introduces a relaxation parameter α ∈ (0,2) into the second ADMM subproblem. Our approach is to show that the G-ADMM is an instance of a hybrid proximal extragradient framework with some special properties, and, as a by-product, we obtain ergodic iteration-complexity for the G-ADMM with α ∈ (0,2], improving and complementing related results in the literature. We also present pointwise iteration-complexity for the G-ADMM.
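
    For orientation, a commonly used scaled-form statement of relaxed ADMM is sketched below. The notation (objective $f(x) + g(z)$, constraint $Ax + Bz = c$, scaled dual variable $u$, penalty $\rho$, auxiliary term $h$) is an assumption made here and may differ from the notation of the paper; setting $\alpha = 1$ recovers the standard ADMM iteration.

```latex
\begin{align*}
  &\min_{x,\,z} \; f(x) + g(z) \quad \text{subject to} \quad A x + B z = c, \\
  x^{k+1} &= \arg\min_{x} \; f(x) + \tfrac{\rho}{2}\,\lVert A x + B z^{k} - c + u^{k} \rVert_2^2, \\
  h^{k+1} &= \alpha\, A x^{k+1} - (1 - \alpha)\,\bigl(B z^{k} - c\bigr), \\
  z^{k+1} &= \arg\min_{z} \; g(z) + \tfrac{\rho}{2}\,\lVert h^{k+1} + B z - c + u^{k} \rVert_2^2, \\
  u^{k+1} &= u^{k} + h^{k+1} + B z^{k+1} - c.
\end{align*}
```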

    Scaled Simplex Representation for Subspace Clustering

    The self-expressive property of data points, i.e., that each data point can be linearly represented by the other data points in the same subspace, has proven effective in leading subspace clustering methods. Most self-expressive methods construct a feasible affinity matrix from a coefficient matrix obtained by solving an optimization problem. However, the negative entries in the coefficient matrix are forced to be positive when constructing the affinity matrix via exponentiation, absolute symmetrization, or squaring operations. This damages the inherent correlations among the data. Besides, the affine constraint used in these methods is not flexible enough for practical applications. To overcome these problems, in this paper we introduce a scaled simplex representation (SSR) for the subspace clustering problem. Specifically, a non-negativity constraint is used to make the coefficient matrix physically meaningful, and the entries of each coefficient vector are constrained to sum to a scalar s < 1 to make it more discriminative. The proposed SSR-based subspace clustering (SSRSC) model is reformulated as a linear equality-constrained problem, which is solved efficiently under the alternating direction method of multipliers framework. Experiments on benchmark datasets demonstrate that the proposed SSRSC algorithm is very efficient and outperforms state-of-the-art subspace clustering methods on accuracy. The code can be found at https://github.com/csjunxu/SSRSC.
    Comment: Accepted by IEEE Transactions on Cybernetics. 13 pages, 9 figures, 10 tables.

    Continuous-time Proportional-Integral Distributed Optimization for Networked Systems

    In this paper we explore the relationship between dual decomposition and the consensus-based method for distributed optimization. The relationship is developed by examining the similarities between the two approaches and their relationship to gradient-based constrained optimization. By formulating each algorithm in continuous time, it is seen that both approaches use a gradient method for optimization, with one using a proportional control term and the other using an integral control term to drive the system to the constraint set. Therefore, a significant contribution of this paper is to combine these methods to develop a continuous-time proportional-integral distributed optimization method. Furthermore, we establish convergence using Lyapunov stability techniques and properties of the network structure of the multi-agent system.
    Comment: 23 pages, submission to Journal of Control and Decision, under review. Takes comments from previous review process into account. Reasons for a continuous approach are given and minor technical details are remedied. Largest revision is reformatting for the Journal of Control and Decision
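
    The idea of combining a proportional and an integral consensus term with a local gradient step can be illustrated with a small simulation. The dynamics below are one commonly used proportional-integral form, integrated with forward Euler steps; they are an assumption for illustration and not necessarily the exact system analyzed in the paper. The quadratic local costs, unit gains, step size, and function name are likewise hypothetical.

```python
def pi_distributed_optimization(targets, neighbors, step=0.01, num_steps=3000):
    """Forward-Euler simulation of assumed proportional-integral dynamics.

    Agent i holds the local cost f_i(x) = (x - targets[i])^2 / 2 and evolves as
        dx_i/dt = -f_i'(x_i) - sum_j (x_i - x_j) - sum_j (z_i - z_j)
        dz_i/dt =  sum_j (x_i - x_j)
    where j ranges over the neighbors of i in an undirected, connected graph.
    The x-disagreement is the proportional term; z accumulates it (integral term).
    """
    n = len(targets)
    x = [0.0] * n
    z = [0.0] * n
    for _ in range(num_steps):
        dx = []
        dz = []
        for i in range(n):
            disagreement_x = sum(x[i] - x[j] for j in neighbors[i])
            disagreement_z = sum(z[i] - z[j] for j in neighbors[i])
            gradient = x[i] - targets[i]
            dx.append(-gradient - disagreement_x - disagreement_z)
            dz.append(disagreement_x)
        x = [xi + step * di for xi, di in zip(x, dx)]
        z = [zi + step * di for zi, di in zip(z, dz)]
    return x


# Three agents on a path graph 0 - 1 - 2; the sum of the local costs is
# minimized at the average of the targets.
states = pi_distributed_optimization([1.0, 2.0, 6.0], [[1], [0, 2], [1]])
print(states)
```

    With these quadratic costs, all agents should settle near the average of the targets, since the integral state removes the steady-state disagreement that a purely proportional term would leave.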