867 research outputs found
GPstruct: Bayesian structured prediction using Gaussian processes
We introduce a conceptually novel structured prediction model, GPstruct, which is kernelized, non-parametric and Bayesian, by design. We motivate the model with respect to existing approaches, among others, conditional random fields (CRFs), maximum margin Markov networks (M ^3 N), and structured support vector machines (SVMstruct), which embody only a subset of its properties. We present an inference procedure based on Markov Chain Monte Carlo. The framework can be instantiated for a wide range of structured objects such as linear chains, trees, grids, and other general graphs. As a proof of concept, the model is benchmarked on several natural language processing tasks and a video gesture segmentation task involving a linear chain structure. We show prediction accuracies for GPstruct which are comparable to or exceeding those of CRFs and SVMstruct
Integrated Inference and Learning of Neural Factors in Structural Support Vector Machines
Tackling pattern recognition problems in areas such as computer vision,
bioinformatics, speech or text recognition is often done best by taking into
account task-specific statistical relations between output variables. In
structured prediction, this internal structure is used to predict multiple
outputs simultaneously, leading to more accurate and coherent predictions.
Structural support vector machines (SSVMs) are nonprobabilistic models that
optimize a joint input-output function through margin-based learning. Because
SSVMs generally disregard the interplay between unary and interaction factors
during the training phase, final parameters are suboptimal. Moreover, its
factors are often restricted to linear combinations of input features, limiting
its generalization power. To improve prediction accuracy, this paper proposes:
(i) Joint inference and learning by integration of back-propagation and
loss-augmented inference in SSVM subgradient descent; (ii) Extending SSVM
factors to neural networks that form highly nonlinear functions of input
features. Image segmentation benchmark results demonstrate improvements over
conventional SSVM training methods in terms of accuracy, highlighting the
feasibility of end-to-end SSVM training with neural factors
Structured Learning of Tree Potentials in CRF for Image Segmentation
We propose a new approach to image segmentation, which exploits the
advantages of both conditional random fields (CRFs) and decision trees. In the
literature, the potential functions of CRFs are mostly defined as a linear
combination of some pre-defined parametric models, and then methods like
structured support vector machines (SSVMs) are applied to learn those linear
coefficients. We instead formulate the unary and pairwise potentials as
nonparametric forests---ensembles of decision trees, and learn the ensemble
parameters and the trees in a unified optimization problem within the
large-margin framework. In this fashion, we easily achieve nonlinear learning
of potential functions on both unary and pairwise terms in CRFs. Moreover, we
learn class-wise decision trees for each object that appears in the image. Due
to the rich structure and flexibility of decision trees, our approach is
powerful in modelling complex data likelihoods and label relationships. The
resulting optimization problem is very challenging because it can have
exponentially many variables and constraints. We show that this challenging
optimization can be efficiently solved by combining a modified column
generation and cutting-planes techniques. Experimental results on both binary
(Graz-02, Weizmann horse, Oxford flower) and multi-class (MSRC-21, PASCAL VOC
2012) segmentation datasets demonstrate the power of the learned nonlinear
nonparametric potentials.Comment: 10 pages. Appearing in IEEE Transactions on Neural Networks and
Learning System
CRF Learning with CNN Features for Image Segmentation
Conditional Random Rields (CRF) have been widely applied in image
segmentations. While most studies rely on hand-crafted features, we here
propose to exploit a pre-trained large convolutional neural network (CNN) to
generate deep features for CRF learning. The deep CNN is trained on the
ImageNet dataset and transferred to image segmentations here for constructing
potentials of superpixels. Then the CRF parameters are learnt using a
structured support vector machine (SSVM). To fully exploit context information
in inference, we construct spatially related co-occurrence pairwise potentials
and incorporate them into the energy function. This prefers labelling of object
pairs that frequently co-occur in a certain spatial layout and at the same time
avoids implausible labellings during the inference. Extensive experiments on
binary and multi-class segmentation benchmarks demonstrate the promise of the
proposed method. We thus provide new baselines for the segmentation performance
on the Weizmann horse, Graz-02, MSRC-21, Stanford Background and PASCAL VOC
2011 datasets
ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation
We propose a structured prediction architecture, which exploits the local
generic features extracted by Convolutional Neural Networks and the capacity of
Recurrent Neural Networks (RNN) to retrieve distant dependencies. The proposed
architecture, called ReSeg, is based on the recently introduced ReNet model for
image classification. We modify and extend it to perform the more challenging
task of semantic segmentation. Each ReNet layer is composed of four RNN that
sweep the image horizontally and vertically in both directions, encoding
patches or activations, and providing relevant global information. Moreover,
ReNet layers are stacked on top of pre-trained convolutional layers, benefiting
from generic local features. Upsampling layers follow ReNet layers to recover
the original image resolution in the final predictions. The proposed ReSeg
architecture is efficient, flexible and suitable for a variety of semantic
segmentation tasks. We evaluate ReSeg on several widely-used semantic
segmentation datasets: Weizmann Horse, Oxford Flower, and CamVid; achieving
state-of-the-art performance. Results show that ReSeg can act as a suitable
architecture for semantic segmentation tasks, and may have further applications
in other structured prediction problems. The source code and model
hyperparameters are available on https://github.com/fvisin/reseg.Comment: In CVPR Deep Vision Workshop, 201
- …