1,270 research outputs found
Tree-structured CRF Models for Interactive Image Labeling
International audienceWe propose structured prediction models for image labeling that explicitly take into account dependencies among image labels. In our tree structured models, image labels are nodes, and edges encode dependency relations. To allow for more complex dependencies, we combine labels in a single node, and use mixtures of trees. Our models are more expressive than independent predictors, and lead to more accurate label predictions. The gain becomes more significant in an interactive scenario where a user provides the value of some of the image labels at test time. Such an interactive scenario offers an interesting trade-off between label accuracy and manual labeling effort. The structured models are used to decide which labels should be set by the user, and transfer the user input to more accurate predictions on other image labels. We also apply our models to attribute-based image classification, where attribute predictions of a test image are mapped to class probabilities by means of a given attribute-class mapping. Experimental results on three publicly available benchmark data sets show that in all scenarios our structured models lead to more accurate predictions, and leverage user input much more effectively than state-of-the-art independent models
Structured learning of sum-of-submodular higher order energy functions
Submodular functions can be exactly minimized in polynomial time, and the
special case that graph cuts solve with max flow \cite{KZ:PAMI04} has had
significant impact in computer vision
\cite{BVZ:PAMI01,Kwatra:SIGGRAPH03,Rother:GrabCut04}. In this paper we address
the important class of sum-of-submodular (SoS) functions
\cite{Arora:ECCV12,Kolmogorov:DAM12}, which can be efficiently minimized via a
variant of max flow called submodular flow \cite{Edmonds:ADM77}. SoS functions
can naturally express higher order priors involving, e.g., local image patches;
however, it is difficult to fully exploit their expressive power because they
have so many parameters. Rather than trying to formulate existing higher order
priors as an SoS function, we take a discriminative learning approach,
effectively searching the space of SoS functions for a higher order prior that
performs well on our training set. We adopt a structural SVM approach
\cite{Joachims/etal/09a,Tsochantaridis/etal/04} and formulate the training
problem in terms of quadratic programming; as a result we can efficiently
search the space of SoS priors via an extended cutting-plane algorithm. We also
show how the state-of-the-art max flow method for vision problems
\cite{Goldberg:ESA11} can be modified to efficiently solve the submodular flow
problem. Experimental comparisons are made against the OpenCV implementation of
the GrabCut interactive segmentation technique \cite{Rother:GrabCut04}, which
uses hand-tuned parameters instead of machine learning. On a standard dataset
\cite{Gulshan:CVPR10} our method learns higher order priors with hundreds of
parameter values, and produces significantly better segmentations. While our
focus is on binary labeling problems, we show that our techniques can be
naturally generalized to handle more than two labels
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding
Recent trends in image understanding have pushed for holistic scene
understanding models that jointly reason about various tasks such as object
detection, scene recognition, shape analysis, contextual reasoning, and local
appearance based classifiers. In this work, we are interested in understanding
the roles of these different tasks in improved scene understanding, in
particular semantic segmentation, object detection and scene recognition.
Towards this goal, we "plug-in" human subjects for each of the various
components in a state-of-the-art conditional random field model. Comparisons
among various hybrid human-machine CRFs give us indications of how much "head
room" there is to improve scene understanding by focusing research efforts on
various individual tasks
- …