Block-Coordinate Frank-Wolfe Optimization for Structural SVMs
We propose a randomized block-coordinate variant of the classic Frank-Wolfe
algorithm for convex optimization with block-separable constraints. Despite its
lower iteration cost, we show that it achieves a similar convergence rate in
duality gap as the full Frank-Wolfe algorithm. We also show that, when applied
to the dual structural support vector machine (SVM) objective, this yields an
online algorithm that has the same low iteration complexity as primal
stochastic subgradient methods. However, unlike stochastic subgradient methods,
the block-coordinate Frank-Wolfe algorithm allows us to compute the optimal
step-size and yields a computable duality gap guarantee. Our experiments
indicate that this simple algorithm outperforms competing structural SVM
solvers.
Comment: Appears in Proceedings of the 30th International Conference on
Machine Learning (ICML 2013). 9 pages main text + 22 pages appendix. Changes
from v3 to v4: 1) Re-organized appendix; improved & clarified duality gap
proofs; re-drew all plots; 2) Changed convention for Cf definition; 3) Added
weighted averaging experiments + convergence results; 4) Clarified main text
and relationship with appendi
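
As a rough illustration of the ideas named in the abstract above (a randomly chosen block, a linear minimization step, an exact line-search step size, and a computable duality gap), here is a minimal sketch. It does not use the structural SVM dual: the toy objective 0.5 * ||sum_i x_i - target||^2 over a product of probability simplices, and the names `bcfw`, `objective`, and `duality_gap`, are assumptions made for this example only.

```python
import random


def objective(r):
    """Toy objective f(x) = 0.5 * ||r||^2, with r = sum_i x_i - target."""
    return 0.5 * sum(v * v for v in r)


def duality_gap(x, r):
    """Frank-Wolfe gap: sum over blocks of <x_i - s_i, grad_i>.

    For this toy objective every block has gradient r, and the best
    simplex vertex s_i attains min(r), so each term is <x_i, r> - min(r).
    """
    r_min = min(r)
    return sum(sum(a * b for a, b in zip(xi, r)) - r_min for xi in x)


def bcfw(target, n_blocks, n_iters, seed=0):
    """Block-coordinate Frank-Wolfe on a product of simplices (toy problem)."""
    rng = random.Random(seed)
    m = len(target)
    # Every block starts at the first vertex of its simplex.
    x = [[1.0] + [0.0] * (m - 1) for _ in range(n_blocks)]
    # Residual r is also the gradient with respect to each block.
    r = [sum(xi[j] for xi in x) - target[j] for j in range(m)]
    for _ in range(n_iters):
        i = rng.randrange(n_blocks)            # pick one block uniformly
        j = min(range(m), key=lambda k: r[k])  # linear minimization oracle
        d = [-v for v in x[i]]                 # direction d = e_j - x_i
        d[j] += 1.0
        dd = sum(a * a for a in d)
        if dd == 0.0:
            continue                           # block already at that vertex
        rd = sum(a * b for a, b in zip(r, d))
        # Exact line search for the quadratic, clipped to [0, 1].
        gamma = max(0.0, min(1.0, -rd / dd))
        for k in range(m):
            x[i][k] += gamma * d[k]
            r[k] += gamma * d[k]
    return x, r


if __name__ == "__main__":
    x, r = bcfw([1.5, 0.5, 1.0], n_blocks=3, n_iters=500)
    print("objective:", objective(r), "gap:", duality_gap(x, r))
```

Because the toy objective is convex with minimum value 0 for this target, the gap upper-bounds the objective value, which is what makes it usable as a stopping criterion in this sketch.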
MAP inference via Block-Coordinate Frank-Wolfe Algorithm
We present a new proximal bundle method for Maximum-A-Posteriori (MAP)
inference in structured energy minimization problems. The method optimizes a
Lagrangean relaxation of the original energy minimization problem using a
multi-plane block-coordinate Frank-Wolfe method that takes advantage of the specific
structure of the Lagrangean decomposition. We show empirically that our method
outperforms state-of-the-art Lagrangean-decomposition-based algorithms on some
challenging Markov Random Field, multi-label discrete tomography and graph
matching problems.
A PARTAN-Accelerated Frank-Wolfe Algorithm for Large-Scale SVM Classification
Frank-Wolfe algorithms have recently regained the attention of the Machine
Learning community. Their solid theoretical properties and sparsity guarantees
make them a suitable choice for a wide range of problems in this field. In
addition, several variants of the basic procedure exist that improve its
theoretical properties and practical performance. In this paper, we investigate
the application of some of these techniques to Machine Learning, focusing in
particular on a Parallel Tangent (PARTAN) variant of the FW algorithm that has
not been previously suggested or studied for this type of problem. We provide
experiments both in a standard setting and using a stochastic speed-up
technique, showing that the considered algorithms obtain promising results on
several medium and large-scale benchmark datasets for SVM classification.