35 research outputs found

MAP inference via Block-Coordinate Frank-Wolfe Algorithm

Author: Kolmogorov Vladimir
Swoboda Paul
Publication date: 01/01/2019

We present a new proximal bundle method for Maximum-A-Posteriori (MAP) inference in structured energy minimization problems. The method optimizes a Lagrangean relaxation of the original energy minimization problem using a multi-plane block-coordinate Frank-Wolfe method that takes advantage of the specific structure of the Lagrangean decomposition. We show empirically that our method outperforms state-of-the-art Lagrangean decomposition-based algorithms on some challenging Markov Random Field, multi-label discrete tomography and graph matching problems.

arXiv.org e-Print Archive

Crossref

IST Austria: PubRep (Institute of Science and Technology)

MPG.PuRe

Efficient Linear Programming for Dense CRFs

Author: Ajanthan Thalaiyasingam
Bunel Rudy
Desmaison Alban
Kumar M. Pawan
Salzmann Mathieu
Torr Philip H. S.
Publication date: 01/01/2017

The fully connected conditional random field (CRF) with Gaussian pairwise potentials has proven popular and effective for multi-class semantic segmentation. While the energy of a dense CRF can be minimized accurately using a linear programming (LP) relaxation, the state-of-the-art algorithm is too slow to be useful in practice. To alleviate this deficiency, we introduce an efficient LP minimization algorithm for dense CRFs. To this end, we develop a proximal minimization framework, where the dual of each proximal problem is optimized via block coordinate descent. We show that each block of variables can be efficiently optimized. Specifically, for one block, the problem decomposes into significantly smaller subproblems, each of which is defined over a single pixel. For the other block, the problem is optimized via conditional gradient descent. This has two advantages: 1) the conditional gradient can be computed in time linear in the number of pixels and labels; and 2) the optimal step size can be computed analytically. Our experiments on standard datasets provide compelling evidence that our approach outperforms all existing baselines, including the previous LP-based approach for dense CRFs.

Comment: 24 pages, 10 figures and 4 tables
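The two properties the abstract attributes to the conditional gradient step (a cheap linear minimization and an analytically computed step size) can be illustrated on a toy problem. The sketch below is not the paper's dense-CRF algorithm: it assumes a simple quadratic objective f(x) = 0.5 * ||x - b||^2 over the probability simplex, and the function name `frank_wolfe_simplex` is hypothetical.

```python
def frank_wolfe_simplex(b, iters=500):
    """Minimize 0.5 * ||x - b||^2 over the probability simplex.

    Toy illustration only (assumed objective, not the dense-CRF dual).
    Returns the final iterate and the last Frank-Wolfe duality gap.
    """
    n = len(b)
    x = [1.0 / n] * n  # start at the barycenter of the simplex
    gap = float("inf")
    for _ in range(iters):
        # Gradient of the quadratic objective.
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Linear minimization oracle: over the simplex, the minimizer of
        # <grad, s> is the vertex with the smallest gradient coordinate.
        i = min(range(n), key=lambda k: grad[k])
        # Frank-Wolfe direction d = s - x, where s is the i-th unit vector.
        d = [(1.0 if k == i else 0.0) - x[k] for k in range(n)]
        # Duality gap <grad, x - s>, which upper-bounds the suboptimality.
        gap = -sum(g * dk for g, dk in zip(grad, d))
        dd = sum(dk * dk for dk in d)
        if gap <= 1e-12 or dd == 0.0:
            break
        # Exact line search: for this quadratic the optimal step is
        # gap / ||d||^2, clipped to 1 so the iterate stays feasible.
        gamma = min(1.0, gap / dd)
        x = [xk + gamma * dk for xk, dk in zip(x, d)]
    return x, gap
```

Each iteration costs time linear in the number of coordinates, and the step size needs no tuning because the one-dimensional problem along the search direction has a closed-form solution.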

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Oxford University Research Archive

Frank-Wolfe Algorithms for Saddle Point Problems

Author: Gidel Gauthier
Jebara Tony
Lacoste-Julien Simon
Publication date: 25/10/2016

We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems. Remarkably, the method only requires access to linear minimization oracles. Leveraging recent advances in FW optimization, we provide the first proof of convergence of a FW-type saddle point solver over polytopes, thereby partially answering a 30-year-old conjecture. We also survey other convergence results and highlight gaps in the theoretical underpinnings of FW-style algorithms. Motivating applications without known efficient alternatives are explored through structured prediction with combinatorial penalties as well as games over matching polytopes involving an exponential number of constraints.

Comment: Appears in: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017). 39 pages

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Faster Coordinate Descent via Adaptive Importance Sampling

Author: Cevher Volkan
Jaggi Martin
Perekrestenko Dmytro
Publication date: 07/03/2017

Coordinate descent methods employ random partial updates of decision variables in order to solve huge-scale convex optimization problems. In this work, we introduce new adaptive rules for the random selection of their updates. By adaptive, we mean that our selection rules are based on the dual residual or the primal-dual gap estimates and can change at each iteration. We theoretically characterize the performance of our selection rules and demonstrate improvements over the state-of-the-art, and extend our theory and algorithms to general convex objectives. Numerical evidence with hinge-loss support vector machines and Lasso confirms that the practice follows the theory.

Comment: appearing at AISTATS 201
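The idea of a sampling distribution that changes at every iteration can be sketched on a small problem. The example below is an assumption-laden stand-in, not the paper's rule: it minimizes a strongly convex quadratic 0.5 * x^T A x - b^T x and samples coordinate i with probability proportional to |grad_i| instead of the dual residual or primal-dual gap estimates the abstract refers to. The function name `adaptive_coordinate_descent` is hypothetical.

```python
import random


def adaptive_coordinate_descent(A, b, iters=500, seed=0):
    """Randomized coordinate descent on 0.5 * x^T A x - b^T x.

    A is assumed symmetric positive definite (list of lists).
    Sampling rule (an illustrative assumption): pick coordinate i with
    probability proportional to the current |gradient_i|.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    grad = [-bi for bi in b]  # gradient A x - b evaluated at x = 0
    for _ in range(iters):
        # Recompute the sampling weights from the current gradient.
        weights = [abs(g) for g in grad]
        if sum(weights) < 1e-12:
            break  # gradient is numerically zero, so x is optimal
        i = rng.choices(range(n), weights=weights, k=1)[0]
        # Exact minimization along coordinate i.
        step = grad[i] / A[i][i]
        x[i] -= step
        # Incremental gradient update: only column i of A is needed.
        for k in range(n):
            grad[k] -= step * A[k][i]
    return x
```

Coordinates whose gradient entry is already zero receive zero probability, so the sampler spends its updates where progress is still possible. Uniform sampling would be recovered by replacing `weights` with a constant list.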

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Global Convergence of Frank Wolfe on One Hidden Layer Networks

Author: d'Aspremont Alexandre
Pilanci Mert
Publication date: 06/02/2020

We derive global convergence bounds for the Frank Wolfe algorithm when training one hidden layer neural networks. When using the ReLU activation function, and under tractable preconditioning assumptions on the sample data set, the linear minimization oracle used to incrementally form the solution can be solved explicitly as a second order cone program. The classical Frank Wolfe algorithm then converges with rate O(1/T), where T is both the number of neurons and the number of calls to the oracle.
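For orientation, the O(1/T) rate has the same form as the textbook guarantee for Frank-Wolfe on a convex objective f over a compact convex set. The statement below is that standard bound, not the paper's theorem: the curvature constant C_f and the step size 2/(t+2) are assumptions of the generic analysis, and the abstract does not give its own constants.

```latex
% Standard Frank-Wolfe guarantee (generic convex setting, assumed here):
% with step size \gamma_t = 2/(t+2) and curvature constant C_f,
\[
  f(x_T) - f(x^\star) \;\le\; \frac{2\,C_f}{T + 2}
  \quad\Longrightarrow\quad
  f(x_T) - f(x^\star) = O\!\left(\frac{1}{T}\right).
\]
```

Since each Frank-Wolfe iteration adds at most one new atom to the iterate, a bound in T oracle calls is also a bound for a solution built from at most T atoms, which matches the abstract's reading of T as the number of neurons.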

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server