Search CORE

788 research outputs found

Blending Learning and Inference in Structured Prediction

Author: Hazan Tamir
McAllester David
Schwing Alexander
Urtasun Raquel
Publication venue
Publication date: 30/08/2013
Field of study

In this paper we derive an efficient algorithm to learn the parameters of structured predictors in general graphical models. This algorithm blends the learning and inference tasks, which results in a significant speedup over traditional approaches, such as conditional random fields and structured support vector machines. For this purpose we utilize the structures of the predictors to describe a low dimensional structured prediction task which encourages local consistencies within the different structures while learning the parameters of the model. Convexity of the learning task provides the means to enforce the consistencies between the different parts. The inference-learning blending algorithm that we propose is guaranteed to converge to the optimum of the low dimensional primal and dual programs. Unlike many of the existing approaches, the inference-learning blending allows us to learn efficiently high-order graphical models, over regions of any size, and very large number of parameters. We demonstrate the effectiveness of our approach, while presenting state-of-the-art results in stereo estimation, semantic segmentation, shape reconstruction, and indoor scene understanding

arXiv.org e-Print Archive

CiteSeerX

Block Belief Propagation for Parameter Learning in Markov Random Fields

Author: Huang Bert
Liu Zhiyuan
Lu You
Publication venue
Publication date: 09/11/2018
Field of study

Traditional learning methods for training Markov random fields require doing inference over all variables to compute the likelihood gradient. The iteration complexity for those methods therefore scales with the size of the graphical models. In this paper, we propose \emph{block belief propagation learning} (BBPL), which uses block-coordinate updates of approximate marginals to compute approximate gradients, removing the need to compute inference on the entire graphical model. Thus, the iteration complexity of BBPL does not scale with the size of the graphs. We prove that the method converges to the same solution as that obtained by using full inference per iteration, despite these approximations, and we empirically demonstrate its scalability improvements over standard training methods.Comment: Accepted to AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Learning Deep Structured Models

Author: Chen Liang-Chieh
Schwing Alexander G.
Urtasun Raquel
Yuille Alan L.
Publication venue
Publication date: 27/04/2015
Field of study

Many problems in real-world applications involve predicting several random variables which are statistically related. Markov random fields (MRFs) are a great mathematical tool to encode such relationships. The goal of this paper is to combine MRFs with deep learning algorithms to estimate complex representations while taking into account the dependencies between the output random variables. Towards this goal, we propose a training algorithm that is able to learn structured models jointly with deep features that form the MRF potentials. Our approach is efficient as it blends learning and inference and makes use of GPU acceleration. We demonstrate the effectiveness of our algorithm in the tasks of predicting words from noisy images, as well as multi-class classification of Flickr photographs. We show that joint learning of the deep features and the MRF parameters results in significant performance gains.Comment: 11 pages including referenc

arXiv.org e-Print Archive

CiteSeerX

Learning efficiently with approximate inference via dual losses

Author: Globerson Amir
Jaakkola Tommi S.
Meshi Ofer
Sontag David Alexander
Publication venue: International Machine Learning Society
Publication date: 01/01/2010
Field of study

Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches for learning for structured prediction (e.g., cutting- plane, subgradient methods, perceptron) repeatedly make predictions for some of the data points. These approaches are computationally demanding because each prediction involves solving a linear program to optimality. We present a scalable algorithm for learning for structured prediction. The main idea is to instead solve the dual of the structured prediction loss. We formulate the learning task as a convex minimization over both the weights and the dual variables corresponding to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as stochastic subgradient and cutting-plane

CiteSeerX

DSpace@MIT

Efficient Structured Prediction with Latent Variables for General Graphical Models

Author: Hazan Tamir
Pollefeys Marc
Schwing Alexander
Urtasun Raquel
Publication venue
Publication date: 01/01/2012
Field of study

In this paper we propose a unified framework for structured prediction with latent variables which includes hidden conditional random fields and latent structured support vector machines as special cases. We describe a local entropy approximation for this general formulation using duality, and derive an efficient message passing algorithm that is guaranteed to converge. We demonstrate its effectiveness in the tasks of image segmentation as well as 3D indoor scene understanding from single images, showing that our approach is superior to latent structured support vector machines and hidden conditional random fields.Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012

arXiv.org e-Print Archive

CiteSeerX