Structured Learning of Tree Potentials in CRF for Image Segmentation
We propose a new approach to image segmentation, which exploits the
advantages of both conditional random fields (CRFs) and decision trees. In the
literature, the potential functions of CRFs are mostly defined as a linear
combination of some pre-defined parametric models, and then methods like
structured support vector machines (SSVMs) are applied to learn those linear
coefficients. We instead formulate the unary and pairwise potentials as
nonparametric forests, that is, ensembles of decision trees, and learn the ensemble
parameters and the trees in a unified optimization problem within the
large-margin framework. In this fashion, we easily achieve nonlinear learning
of potential functions on both unary and pairwise terms in CRFs. Moreover, we
learn class-wise decision trees for each object that appears in the image. Due
to the rich structure and flexibility of decision trees, our approach is
powerful in modelling complex data likelihoods and label relationships. The
resulting optimization problem is very challenging because it can have
exponentially many variables and constraints. We show that this challenging
optimization can be solved efficiently by combining a modified column
generation method with cutting-plane techniques. Experimental results on both binary
(Graz-02, Weizmann horse, Oxford flower) and multi-class (MSRC-21, PASCAL VOC
2012) segmentation datasets demonstrate the power of the learned nonlinear
nonparametric potentials.
Comment: 10 pages. Appearing in IEEE Transactions on Neural Networks and
Learning Systems
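As a rough illustration of the idea in the abstract above, the sketch below scores a labelling with a CRF energy whose unary and pairwise potentials are weighted sums of tree outputs. Everything named here is hypothetical and not taken from the paper: the helpers `stump`, `forest_potential` and `crf_energy`, the use of depth-1 trees, and the choice to charge the pairwise forest only when neighbouring labels differ are all assumptions made for the example. Learning the trees and weights (the column generation and cutting-plane part) is not shown.

```python
def stump(feature_idx, threshold, left_value, right_value):
    """Hypothetical depth-1 decision tree over a feature vector."""
    def predict(x):
        # Route the sample left or right on a single feature test.
        return left_value if x[feature_idx] <= threshold else right_value
    return predict


def forest_potential(trees, weights, x):
    """Potential = weighted sum of the outputs of the trees in the ensemble."""
    return sum(w * tree(x) for w, tree in zip(weights, trees))


def crf_energy(labels, node_feats, edges, edge_feats,
               unary_forests, pairwise_forest):
    """Energy of a labelling under forest-valued potentials.

    unary_forests[c] is a (trees, weights) pair for class c (class-wise trees).
    pairwise_forest is a single (trees, weights) pair, applied here only to
    edges whose endpoints disagree (an assumption of this sketch).
    """
    energy = 0.0
    # Unary terms: each node is scored by the forest of its assigned class.
    for p, y in enumerate(labels):
        trees, weights = unary_forests[y]
        energy += forest_potential(trees, weights, node_feats[p])
    # Pairwise terms: data-dependent cost for label disagreement.
    trees, weights = pairwise_forest
    for (p, q), f in zip(edges, edge_feats):
        if labels[p] != labels[q]:
            energy += forest_potential(trees, weights, f)
    return energy
```

Because the potentials are sums over trees, adding a tree to an ensemble adds a new variable (its weight) to the learning problem, which is one way to read the abstract's remark about exponentially many variables.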
Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs
In this work we propose a structured prediction technique that combines the
virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a)
our structured prediction task has a unique global optimum that is obtained
exactly from the solution of a linear system, (b) the gradients of our model
parameters are analytically computed using closed form expressions, in contrast
to the memory-demanding contemporary deep structured prediction approaches that
rely on back-propagation-through-time, (c) our pairwise terms do not have to be
simple hand-crafted expressions, as in the line of works building on the
DenseCRF, but can rather be 'discovered' from data through deep architectures,
and (d) our system can be trained in an end-to-end manner. Building on standard
tools from numerical analysis we develop very efficient algorithms for
inference and learning, as well as a customized technique adapted to the
semantic segmentation task. This efficiency allows us to explore more
sophisticated architectures for structured prediction in deep learning: we
introduce multi-resolution architectures to couple information across scales in
a joint optimization framework, yielding systematic improvements. We
demonstrate the utility of our approach on the challenging PASCAL VOC 2012
image segmentation benchmark, showing substantial improvements over strong
baselines. We make all of our code and experiments available at
https://github.com/siddharthachandra/gcrf
Comment: Our code is available at https://github.com/siddharthachandra/gcr
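Point (a) of the abstract above, a unique global optimum obtained exactly from a linear system, can be sketched with a generic quadratic energy. The form E(x) = 0.5 x^T (A + lam I) x - b^T x, the regulariser `lam`, and the function name `gcrf_inference` are assumptions made for this illustration and are not taken from the authors' code. When A + lam I is positive definite the energy is strictly convex, so setting its gradient to zero gives the single minimiser as the solution of (A + lam I) x = b.

```python
import numpy as np


def gcrf_inference(A, b, lam=1.0):
    """Minimise 0.5 * x^T (A + lam*I) x - b^T x by solving a linear system.

    A   : (n, n) pairwise term matrix (symmetrised below)
    b   : (n,)   unary term vector
    lam : diagonal regulariser intended to make the system positive definite
    """
    n = A.shape[0]
    # Only the symmetric part of A contributes to the quadratic form.
    A_sym = 0.5 * (A + A.T)
    H = A_sym + lam * np.eye(n)
    # Cholesky raises LinAlgError if H is not positive definite, i.e. if the
    # unique-minimum assumption of this sketch does not hold.
    np.linalg.cholesky(H)
    # Gradient H x - b = 0 gives the exact minimiser.
    return np.linalg.solve(H, b)


# Toy usage with two variables.
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
b = np.array([1.0, 0.0])
x = gcrf_inference(A, b, lam=1.0)
```

A dense solve is used here only for brevity; for image-sized problems an iterative solver such as conjugate gradient on a sparse system would be the more natural choice.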