ADMM-SOFTMAX: An ADMM Approach for Multinomial Logistic Regression
We present ADMM-Softmax, an alternating direction method of multipliers
(ADMM) for solving multinomial logistic regression (MLR) problems. Our method
is geared toward supervised classification tasks with many examples and
features. It decouples the nonlinear optimization problem in MLR into three
steps that can be solved efficiently. In particular, each iteration of
ADMM-Softmax consists of a linear least-squares problem, a set of independent
small-scale smooth, convex problems, and a trivial dual variable update.
The solution of the least-squares problem can be accelerated by pre-computing a
factorization or preconditioner, and the separable smooth, convex problems can
easily be parallelized across examples. For two image
classification problems, we demonstrate that ADMM-Softmax leads to improved
generalization compared to a Newton-Krylov method, a quasi-Newton method, and a
stochastic gradient descent method.
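To make the three-step structure concrete, here is a minimal NumPy sketch of one way to split MLR with auxiliary scores Z ≈ XW: a least-squares update of the weights, independent smooth convex updates per example (approximated below by a few gradient steps), and a trivial dual update. The function name, the inner solver, and the omission of regularization and preconditioning are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def admm_softmax_sketch(X, y, n_classes, rho=1.0, iters=20, inner=15, lr=0.5):
    """Illustrative ADMM splitting for multinomial logistic regression."""
    n, d = X.shape
    Y = np.eye(n_classes)[y]                      # one-hot labels, n x K
    W = np.zeros((d, n_classes))
    Z = np.zeros((n, n_classes))
    U = np.zeros((n, n_classes))                  # scaled dual variables
    C = np.linalg.cholesky(X.T @ X + 1e-8 * np.eye(d))   # factor once, reuse every iteration
    for _ in range(iters):
        # (1) linear least-squares step: argmin_W ||X W - (Z - U)||_F^2
        rhs = X.T @ (Z - U)
        W = np.linalg.solve(C.T, np.linalg.solve(C, rhs))
        V = X @ W + U
        # (2) separable step: per-example smooth convex problems, here a few gradient steps
        for _ in range(inner):
            grad = softmax(Z) - Y + rho * (Z - V)
            Z = Z - lr / (1.0 + rho) * grad
        # (3) trivial dual update on the constraint X W - Z = 0
        U = U + X @ W - Z
    return W
```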
A Memristor-Based Optimization Framework for AI Applications
Memristors have recently received significant attention as ubiquitous
device-level components for building a novel generation of computing systems.
These devices have many promising features, such as non-volatility, low power
consumption, high density, and excellent scalability. The ability to control
and modify biasing voltages at the two terminals of memristors makes them
promising candidates to perform matrix-vector multiplications and solve systems
of linear equations. In this article, we discuss how networks of memristors
arranged in crossbar arrays can be used for efficiently solving optimization
and machine learning problems. We introduce a new memristor-based optimization
framework that combines the computational merit of memristor crossbars with the
advantages of an operator splitting method, alternating direction method of
multipliers (ADMM). Here, ADMM helps in splitting a complex optimization
problem into subproblems that involve the solution of systems of linear
equations. The capability of this framework is shown by applying it to linear
programming, quadratic programming, and sparse optimization. In addition to
ADMM, implementation of a customized power iteration (PI) method for
eigenvalue/eigenvector computation using memristor crossbars is discussed. The
memristor-based PI method can further be applied to principal component
analysis (PCA). The use of memristor crossbars yields a significant speed-up in
computation, and thus, we believe, has the potential to advance optimization
and machine learning research in artificial intelligence (AI).
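As a point of reference for the customized power-iteration component, the sketch below is the standard power iteration in NumPy; in the proposed framework, the matrix-vector product A @ v is the operation a memristor crossbar would perform in the analog domain. The function name and stopping rule are illustrative assumptions.

```python
import numpy as np

def power_iteration(A, iters=100, tol=1e-10):
    """Dominant eigenpair of a square matrix via repeated matrix-vector products."""
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        w = A @ v                      # the step a crossbar would compute in analog
        lam_new = v @ w                # Rayleigh-quotient estimate of the eigenvalue
        v = w / np.linalg.norm(w)
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam, v
```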
Distributed Algorithm for Optimal Power Flow on Unbalanced Multiphase Distribution Networks
The optimal power flow (OPF) problem is fundamental to the control and operation
of power distribution networks and underlies many important applications such as
volt/var control and demand response. Large-scale, highly volatile renewable
penetration in distribution networks calls for real-time feedback control, and
hence the need for distributed solutions to the OPF problem.
Distribution networks are inherently unbalanced and most of the existing
distributed solutions for balanced networks do not apply. In this paper we
propose a solution for unbalanced distribution networks. Our distributed
algorithm is based on the alternating direction method of multipliers (ADMM).
Unlike existing approaches that require solving semidefinite programming problems
in each ADMM macro-iteration, we exploit the problem structure and decompose the
OPF problem in such a way that the subproblems in each ADMM macro-iteration
reduce to either a closed-form solution or an eigen-decomposition of a 6x6
Hermitian matrix, which significantly reduces the convergence time. We present
simulations on the IEEE 13-, 34-, 37-, and 123-bus unbalanced distribution
networks to illustrate the scalability and optimality of the proposed algorithm.
Comment: 11 pages.
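To picture the eigen-decomposition subproblem mentioned above, the hypothetical helper below projects a 6x6 Hermitian matrix onto the nearest rank-one positive semidefinite matrix using numpy.linalg.eigh. It is only an illustration of the kind of small eigenvalue computation involved, not the paper's exact subproblem.

```python
import numpy as np

def nearest_psd_rank1(W):
    """Project a small Hermitian matrix onto the nearest rank-one PSD matrix."""
    W = 0.5 * (W + W.conj().T)             # symmetrize to guard against round-off
    vals, vecs = np.linalg.eigh(W)         # Hermitian eigendecomposition
    lam, v = vals[-1], vecs[:, -1]         # largest eigenpair
    return max(lam, 0.0) * np.outer(v, v.conj())
```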
Optimal parameter selection for the alternating direction method of multipliers (ADMM): quadratic problems
The alternating direction method of multipliers (ADMM) has emerged as a
powerful technique for large-scale structured optimization. Despite many recent
results on the convergence properties of ADMM, a quantitative characterization
of the impact of the algorithm parameters on the convergence times of the
method is still lacking. In this paper we find the optimal algorithm parameters
that minimize the convergence factor of the ADMM iterates in the context of
l2-regularized minimization and constrained quadratic programming. Numerical
examples show that our parameter selection rules significantly outperform
existing alternatives in the literature.
Comment: Submitted to IEEE Transactions on Automatic Control.
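As a hedged illustration of the flavor of rule such an analysis produces for quadratic problems, the sketch below picks the penalty parameter as the geometric mean of the extreme eigenvalues of the quadratic's (assumed positive definite) Hessian, which balances the contraction behavior of the two ADMM subproblems. This is a generic spectral heuristic shown for intuition, not the paper's exact formula.

```python
import numpy as np

def admm_penalty_geometric_mean(Q):
    """Heuristic ADMM penalty for a quadratic with positive definite Hessian Q."""
    eig = np.linalg.eigvalsh(0.5 * (Q + Q.T))   # eigenvalues of the symmetrized Hessian
    lam_min, lam_max = eig[0], eig[-1]          # assumes lam_min > 0
    return float(np.sqrt(lam_min * lam_max))    # geometric mean of the extreme eigenvalues
```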
Near-separable Non-negative Matrix Factorization with l1- and Bregman Loss Functions
Recently, a family of tractable NMF algorithms has been proposed under the
assumption that the data matrix satisfies a separability condition (Donoho &
Stodden, 2003; Arora et al., 2012). Geometrically, this condition reformulates
the NMF problem as that of finding the extreme rays of the conical hull of a
finite set of vectors. In this paper, we develop several extensions of the
conical hull procedures of Kumar et al. (2013) for robust (l1)
approximations and Bregman divergences. Our methods inherit all the advantages
of Kumar et al. (2013) including scalability and noise-tolerance. We show that
on foreground-background separation problems in computer vision, robust
near-separable NMFs match the performance of Robust PCA, considered state of
the art on these problems, with an order of magnitude faster training time. We
also demonstrate applications in exemplar selection settings.
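For intuition about anchor (extreme-ray) selection under the separability assumption, the sketch below implements the classical successive projection algorithm (SPA), a related but different selection rule from the conical-hull procedures extended in the paper; it is included only to illustrate what near-separable anchor selection looks like in code.

```python
import numpy as np

def spa_anchors(X, r):
    """Pick r anchor columns of a (near-)separable data matrix X by successive projection."""
    R = X.astype(float)
    anchors = []
    for _ in range(r):
        j = int(np.argmax(np.sum(R * R, axis=0)))   # column with largest residual norm
        anchors.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)                   # project out the selected direction
    return anchors
```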
Consensus-Based Dantzig-Wolfe Decomposition
Dantzig-Wolfe decomposition (DWD) is a classical algorithm for solving
large-scale linear programs whose constraint matrix involves a set of
independent blocks coupled with a set of linking rows. The algorithm decomposes
such a model into a master problem and a set of independent subproblems that
can be solved in a distributed manner. In a typical implementation, the master
problem is solved centrally. In certain settings, solving the master problem
centrally is undesirable or infeasible, such as in the case of decentralized
storage of data, or when independent agents who are responsible for the
subproblems desire privacy of information. In this paper, we propose a fully
distributed DWD algorithm which relies on solving the master problem using a
consensus-based alternating direction method of multipliers (ADMM). We
derive error bounds on the optimality gap and feasibility violation of the
proposed approach. We provide preliminary computational results for our
algorithm using a Message Passing Interface (MPI) implementation on cutting
stock instances from the literature and synthetic instances where we obtain
high-quality solutions.
Comment: Corrected typos; added references; added a new set of experiments and
a new application.
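The consensus mechanism used for the master problem can be illustrated on a toy objective: agents holding local quadratics f_i(x) = 0.5*||x - a_i||^2 agree on the minimizer of their sum via consensus ADMM. The toy objective and parameters below are assumptions for illustration; the paper applies the same pattern to the Dantzig-Wolfe master problem rather than to this example.

```python
import numpy as np

def consensus_admm_average(a_list, rho=1.0, iters=50):
    """Toy consensus ADMM: agents agree on the minimizer of sum_i 0.5*||x - a_i||^2."""
    n, d = len(a_list), a_list[0].shape[0]
    x = [np.zeros(d) for _ in range(n)]
    u = [np.zeros(d) for _ in range(n)]
    z = np.zeros(d)
    for _ in range(iters):
        # local updates, solvable in parallel by each agent
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(a_list, u)]
        # coordination step: average of local variables plus duals
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # dual updates, local to each agent
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z
```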
Iterative Grassmannian Optimization for Robust Image Alignment
Robust high-dimensional data processing has witnessed an exciting development
in recent years, as theoretical results have shown that it is possible using
convex programming to optimize data fit to a low-rank component plus a sparse
outlier component. This problem is also known as Robust PCA, and it has found
application in many areas of computer vision. In image and video processing and
face recognition, the opportunity to process massive image databases is
emerging as people upload photo and video data online in unprecedented volumes.
However, data quality and consistency are not controlled in any way, and the
massiveness of the data poses a serious computational challenge. In this paper
we present t-GRASTA, or "Transformed GRASTA (Grassmannian Robust Adaptive
Subspace Tracking Algorithm)". t-GRASTA iteratively performs incremental
gradient descent constrained to the Grassmann manifold of subspaces in order to
simultaneously estimate a decomposition of a collection of images into a
low-rank subspace, a sparse part of occlusions and foreground objects, and a
transformation such as rotation or translation of the image. We show that
t-GRASTA is 4x faster than state-of-the-art algorithms, has half the
memory requirement, and can achieve alignment for face images as well as
jittered camera surveillance images.
Comment: Preprint submitted to the special issue of the Image and Vision
Computing Journal on the theme "The Best of Face and Gesture 2013".
Iteration-complexity analysis of a generalized alternating direction method of multipliers
This paper analyzes the iteration-complexity of a generalized alternating
direction method of multipliers (G-ADMM) for solving linearly constrained
convex problems. This ADMM variant, which was first proposed by Bertsekas and
Eckstein, introduces a relaxation parameter into the second
ADMM subproblem. Our approach is to show that the G-ADMM is an instance of a
hybrid proximal extragradient framework with some special properties, and, as a
by-product, we obtain ergodic iteration-complexity bounds for the G-ADMM,
improving and complementing related results in the literature. Additionally, we
present pointwise iteration-complexity bounds for the G-ADMM.
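To see where the relaxation parameter enters, the sketch below runs over-relaxed (generalized) ADMM on a lasso problem: alpha = 1 recovers plain ADMM, and values in (1, 2) typically accelerate convergence. The lasso objective is only a convenient linearly constrained stand-in, not the paper's setting.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def relaxed_admm_lasso(A, b, lam=0.1, rho=1.0, alpha=1.5, iters=200):
    """Over-relaxed ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1 s.t. x - z = 0."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    M = np.linalg.inv(A.T @ A + rho * np.eye(n))     # factor once for all iterations
    Atb = A.T @ b
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))                # first (quadratic) subproblem
        x_hat = alpha * x + (1.0 - alpha) * z        # relaxation enters here
        z = soft_threshold(x_hat + u, lam / rho)     # second subproblem: prox of the l1 term
        u = u + x_hat - z                            # dual update
    return x
```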
Scaled Simplex Representation for Subspace Clustering
The self-expressive property of data points, i.e., each data point can be
linearly represented by the other data points in the same subspace, has proven
effective in leading subspace clustering methods. Most self-expressive methods
construct a feasible affinity matrix from a coefficient matrix,
obtained by solving an optimization problem. However, the negative entries in
the coefficient matrix are forced to be positive when constructing the affinity
matrix via exponentiation, absolute symmetrization, or squaring operations.
This consequently damages the inherent correlations among the data. Besides,
the affine constraint used in these methods is not flexible enough for
practical applications. To overcome these problems, in this paper, we introduce
a scaled simplex representation (SSR) for the subspace clustering problem.
Specifically, a non-negative constraint is used to make the coefficient matrix
physically meaningful, and the coefficient vector is constrained to sum to a
scalar s < 1 to make it more discriminative. The proposed SSR-based subspace
clustering (SSRSC) model is reformulated as a linear
equality-constrained problem, which is solved efficiently under the alternating
direction method of multipliers framework. Experiments on benchmark datasets
demonstrate that the proposed SSRSC algorithm is very efficient and outperforms
state-of-the-art subspace clustering methods in accuracy. The code can be found
at https://github.com/csjunxu/SSRSC.
Comment: Accepted by IEEE Transactions on Cybernetics. 13 pages, 9 figures, 10
tables. Code can be found at https://github.com/csjunxu/SSRSC.
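For intuition about the scaled-simplex constraint {c : c >= 0, sum(c) = s} with s < 1, the hypothetical helper below computes the Euclidean projection onto that set (the standard simplex projection with radius s). The paper handles the constraint inside an ADMM scheme rather than by a single explicit projection, so this is illustration only.

```python
import numpy as np

def project_scaled_simplex(v, s=0.9):
    """Euclidean projection of v onto {c : c >= 0, sum(c) = s}."""
    n = v.size
    u = np.sort(v)[::-1]                                        # sort descending
    css = np.cumsum(u)
    k = np.nonzero(u - (css - s) / np.arange(1, n + 1) > 0)[0][-1]
    theta = (css[k] - s) / (k + 1)                              # shift that enforces the sum constraint
    return np.maximum(v - theta, 0.0)
```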
Continuous-time Proportional-Integral Distributed Optimization for Networked Systems
In this paper we explore the relationship between dual decomposition and the
consensus-based method for distributed optimization. The relationship is
developed by examining the similarities between the two approaches and their
relationship to gradient-based constrained optimization. By formulating each
algorithm in continuous time, it can be seen that both approaches use a gradient
method for optimization, with one using a proportional control term and the
other an integral control term to drive the system to the constraint set.
Therefore, a significant contribution of this paper is to combine these methods
to develop a continuous-time proportional-integral distributed optimization
method. Furthermore, we establish convergence using Lyapunov stability
techniques and properties of the network structure of the multi-agent system.
Comment: 23 pages, submitted to the Journal of Control and Decision, under
review. Takes comments from the previous review process into account. Reasons
for a continuous-time approach are given and minor technical details are
remedied. The largest revision is reformatting for the Journal of Control and
Decision.
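A minimal simulation of the proportional-integral idea, assuming scalar agent states, quadratic local costs, and a simple Euler discretization: the graph Laplacian acts as the proportional term, and an integral state driven by the disagreement enforces exact consensus at the minimizer of the sum of local costs. The dynamics shown are a generic member of this family, not necessarily the paper's exact formulation.

```python
import numpy as np

def pi_distributed_optimization(grads, L, x0, dt=0.01, steps=5000):
    """Euler simulation of xdot = -grad f(x) - L x - L v,  vdot = L x."""
    x = x0.astype(float)
    v = np.zeros_like(x)
    for _ in range(steps):
        g = np.array([gi(xi) for gi, xi in zip(grads, x)])   # local gradients
        x, v = x + dt * (-g - L @ x - L @ v), v + dt * (L @ x)
    return x

# Example: three agents on a line graph minimizing sum_i 0.5*(x - a_i)^2.
L = np.array([[1., -1., 0.], [-1., 2., -1.], [0., -1., 1.]])
a = [1.0, 2.0, 6.0]
grads = [lambda x, ai=ai: x - ai for ai in a]
x_star = pi_distributed_optimization(grads, L, np.zeros(3))
# x_star entries should all be close to mean(a) = 3.0
```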