Polynomial Linear Programming with Gaussian Belief Propagation
Interior-point methods are state-of-the-art algorithms for solving linear
programming (LP) problems with polynomial complexity. Specifically, the
Karmarkar algorithm typically solves LP problems in time O(n^{3.5}), where n
is the number of unknown variables. Karmarkar's celebrated algorithm is known
to be an instance of the log-barrier method using the Newton iteration. The
main computational overhead of this method is in inverting the Hessian matrix
of the Newton iteration. In this contribution, we propose the application of
the Gaussian belief propagation (GaBP) algorithm as part of an efficient and
distributed LP solver that exploits the sparse and symmetric structure of the
Hessian matrix and avoids the need for direct matrix inversion. This approach
shifts the computation from the realm of linear algebra to that of probabilistic
inference on graphical models, thus applying GaBP as an efficient inference
engine. Our construction is general and can be used for any interior-point
algorithm which uses the Newton method, including non-linear program solvers.
Comment: 7 pages, 1 figure; appeared in the 46th Annual Allerton Conference on
Communication, Control and Computing, Allerton House, Illinois, Sept. 2008.
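For illustration, here is a minimal sketch of GaBP used as a solver for a
symmetric linear system such as the Newton system H d = -g mentioned in the
abstract. It uses a dense, sequential message schedule; the paper's solver is
distributed and exploits sparsity, and the function name and iteration count
below are assumptions:

```python
import numpy as np

def gabp_solve(A, b, num_iters=100):
    """Sketch: solve A x = b via Gaussian belief propagation.

    A is assumed symmetric; GaBP converges e.g. when A is diagonally
    dominant (walk-summable). The solution is the mean of the Gaussian
    p(x) ~ exp(-x'Ax/2 + b'x), recovered without inverting A directly.
    """
    n = len(b)
    P = np.zeros((n, n))    # P[i, j]: precision of the message i -> j
    mu = np.zeros((n, n))   # mu[i, j]: mean of the message i -> j
    nbrs = [[j for j in range(n) if j != i and A[i, j] != 0]
            for i in range(n)]
    for _ in range(num_iters):
        for i in range(n):
            for j in nbrs[i]:
                # Combine node i's potential with all incoming
                # messages except the one coming back from j.
                P_ex = A[i, i] + sum(P[k, i] for k in nbrs[i] if k != j)
                h_ex = b[i] + sum(P[k, i] * mu[k, i]
                                  for k in nbrs[i] if k != j)
                P[i, j] = -A[i, j] ** 2 / P_ex
                mu[i, j] = h_ex / A[i, j]
    # Marginal means equal the solution of the linear system.
    P_tot = np.diag(A) + P.sum(axis=0)
    h_tot = b + (P * mu).sum(axis=0)
    return h_tot / P_tot
```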
Convergence of the Exponentiated Gradient Method with Armijo Line Search
Consider the problem of minimizing a convex differentiable function on the
probability simplex, spectrahedron, or set of quantum density matrices. We
prove that the exponentiated gradient method with Armijo line search always
converges to the optimum, provided the sequence of iterates possesses a
strictly positive limit point (element-wise in the vector case, and with
respect to the Löwner partial ordering in the matrix case). To the best of
our knowledge, this
is the first convergence result for a mirror descent-type method that only
requires differentiability. The proof exploits self-concordant likeness of the
log-partition function, which is of independent interest.
Comment: 18 pages.
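A minimal sketch of the vector (simplex) case may help; the matrix case
replaces the entry-wise exponential with a matrix exponential. The step-size
constants eta0, beta, sigma below are illustrative assumptions, not the
paper's choices:

```python
import numpy as np

def eg_armijo(f, grad, x0, eta0=1.0, beta=0.5, sigma=1e-4,
              n_iters=200, max_backtracks=50):
    """Sketch: exponentiated gradient with Armijo backtracking
    for minimizing f over the probability simplex."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad(x)
        eta = eta0
        for _ in range(max_backtracks):
            y = x * np.exp(-eta * g)   # multiplicative (mirror) update
            y /= y.sum()               # renormalize onto the simplex
            # Armijo sufficient-decrease test along the EG arc.
            if f(y) <= f(x) + sigma * g @ (y - x):
                break
            eta *= beta                # shrink the step and retry
        x = y
    return x

# Example: minimize a convex quadratic over the simplex.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x
print(eg_armijo(f, grad, np.array([0.5, 0.5])))
```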
Stochastic Optimization of PCA with Capped MSG
We study PCA as a stochastic optimization problem and propose a novel
stochastic approximation algorithm which we refer to as "Matrix Stochastic
Gradient" (MSG), as well as a practical variant, Capped MSG. We study the
method both theoretically and empirically.
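A minimal sketch of plain MSG, assuming the standard Frobenius projection
onto {S : 0 <= S <= I, tr S = k} (in the Löwner order) computed by shifting
eigenvalues with a bisection search; the Capped MSG variant additionally
bounds the rank of the iterate, which is omitted here:

```python
import numpy as np

def project_msg(M, k):
    """Project symmetric M onto {0 <= S <= I, tr S = k}: shift the
    eigenvalues by a scalar s and clip to [0, 1], choosing s by
    bisection so the clipped eigenvalues sum to k."""
    lam, V = np.linalg.eigh(M)
    lo, hi = -1.0 - lam.max(), 1.0 - lam.min()
    for _ in range(64):
        s = 0.5 * (lo + hi)
        if np.clip(lam + s, 0.0, 1.0).sum() > k:
            hi = s
        else:
            lo = s
    lam = np.clip(lam + 0.5 * (lo + hi), 0.0, 1.0)
    return (V * lam) @ V.T

def msg_pca(samples, d, k, eta=0.1):
    """Sketch of Matrix Stochastic Gradient for k-component PCA
    (fixed step size eta is an illustrative simplification)."""
    M = np.zeros((d, d))
    for x in samples:
        # Stochastic gradient ascent on E[x' M x], then project.
        M = project_msg(M + eta * np.outer(x, x), k)
    return M

rng = np.random.default_rng(0)
M = msg_pca(rng.normal(size=(500, 5)), d=5, k=2)  # top-2 subspace estimate
```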
Asynchronous Parallel Block-Coordinate Frank-Wolfe
We develop mini-batched parallel Frank-Wolfe (conditional gradient) methods for smooth convex optimization subject to block-separable constraints. Our work includes the basic (batch) Frank-Wolfe algorithm as well as the recently proposed Block-Coordinate Frank-Wolfe (BCFW) method [18] as special cases. Our algorithm permits asynchronous updates within the mini-batch, and is robust to stragglers and faulty worker threads. Our analysis reveals how the potential speedups over BCFW depend on the mini-batch size and how one can provably obtain large problem-dependent speedups. We present several experiments illustrating the empirical behavior of our methods, obtaining significant speedups over competing state-of-the-art (and synchronous) methods on structural SVMs.
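As a point of reference, a sketch of the serial BCFW special case (mini-batch
size 1, no asynchrony); the interfaces grad_block and lmo and the simplex
example are assumptions for illustration:

```python
import numpy as np

def bcfw(grad_block, lmo, blocks, n_iters=2000, seed=0):
    """Sketch of serial Block-Coordinate Frank-Wolfe: at each step pick
    one block, call that block's linear minimization oracle (LMO), and
    take a convex-combination step. Mini-batched/asynchronous variants
    update several blocks concurrently."""
    rng = np.random.default_rng(seed)
    n = len(blocks)
    for k in range(n_iters):
        i = rng.integers(n)                 # pick one block at random
        g = grad_block(blocks, i)           # partial gradient for block i
        s = lmo(g, i)                       # argmin over block i's set
        gamma = 2.0 * n / (k + 2.0 * n)     # standard BCFW step size
        blocks[i] = (1 - gamma) * blocks[i] + gamma * s
    return blocks

# Example LMO: if each block lives on a unit simplex, the oracle
# returns the vertex of the most negative partial-gradient coordinate.
def lmo_simplex(g, i):
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s
```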
Stochastic Frank-Wolfe Methods for Nonconvex Optimization
We study Frank-Wolfe methods for nonconvex stochastic and finite-sum
optimization problems. Frank-Wolfe methods (in the convex case) have gained
tremendous recent interest in the machine learning and optimization
communities due
to their projection-free property and their ability to exploit structured
constraints. However, our understanding of these algorithms in the nonconvex
setting is fairly limited. In this paper, we propose nonconvex stochastic
Frank-Wolfe methods and analyze their convergence properties. For objective
functions that decompose into a finite-sum, we leverage ideas from variance
reduction techniques for convex optimization to obtain new variance reduced
nonconvex Frank-Wolfe methods that have provably faster convergence than the
classical Frank-Wolfe method. Finally, we show that the faster convergence
rates of our variance reduced methods also translate into improved convergence
rates for the stochastic setting.
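A sketch of the basic stochastic Frank-Wolfe loop, under assumed sampler and
oracle interfaces; the classical step-size schedule below is illustrative
(the paper's analysis ties it to the batch size), and the variance-reduced
variants swap in an SVRG-style gradient estimator:

```python
import numpy as np

def stochastic_fw(grad_sample, lmo, x0, n_iters=500, batch=64, seed=0):
    """Sketch: stochastic Frank-Wolfe with mini-batch gradients.

    The variance-reduced variants replace g below with an SVRG-style
    estimator g_i(x) - g_i(x_snap) + full_grad(x_snap); this sketch
    keeps the plain mini-batch estimate."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(n_iters):
        # Mini-batch gradient estimate (hypothetical sampler interface).
        g = np.mean([grad_sample(x, rng) for _ in range(batch)], axis=0)
        s = lmo(g)                       # projection-free step direction
        gamma = 2.0 / (k + 2.0)          # classical Frank-Wolfe schedule
        x = (1 - gamma) * x + gamma * s
    return x
```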
A Multi-Plane Block-Coordinate Frank-Wolfe Algorithm for Training Structural SVMs with a Costly max-Oracle
Structural support vector machines (SSVMs) are amongst the best-performing
models for structured computer vision tasks, such as semantic image
segmentation or human pose estimation. Training SSVMs, however, is
computationally costly, because it requires repeated calls to a structured
prediction subroutine (called \emph{max-oracle}), which has to solve an
optimization problem itself, e.g. a graph cut.
In this work, we introduce a new algorithm for SSVM training that is more
efficient than earlier techniques when the max-oracle is computationally
expensive, as is frequently the case in computer vision tasks. The main idea
is to (i) combine the recent stochastic Block-Coordinate Frank-Wolfe algorithm
with efficient hyperplane caching, and (ii) use an automatic selection rule for
deciding whether to call the exact max-oracle or to rely on an approximate one
based on the cached hyperplanes.
We show experimentally that this strategy leads to faster convergence to the
optimum with respect to the number of required oracle calls, and that this
translates into faster convergence with respect to the total runtime when the
max-oracle is slow compared to the other steps of the algorithm.
A publicly available C++ implementation is provided at
http://pub.ist.ac.at/~vnk/papers/SVM.html
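A hypothetical sketch of the selection rule between the cached (approximate)
oracle and the exact max-oracle; the (a, b) hyperplane encoding and the
threshold test are assumptions for illustration, not the paper's exact rule:

```python
def select_oracle(w, cache, exact_oracle, threshold):
    """Each cached hyperplane is a pair (a, b) of numpy arrays/scalars
    whose violation at weights w is a @ w + b. The costly exact
    max-oracle (e.g. a graph cut) is called only when no cached plane
    is violated by more than `threshold`."""
    if cache:
        a, b = max(cache, key=lambda ab: ab[0] @ w + ab[1])
        if a @ w + b >= threshold:
            return a, b              # cheap: reuse a cached hyperplane
    a, b = exact_oracle(w)           # expensive structured prediction
    cache.append((a, b))             # grow the cache for future steps
    return a, b
```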