6,952 research outputs found
Multi-utility Learning: Structured-output Learning with Multiple Annotation-specific Loss Functions
Structured-output learning is a challenging problem; particularly so because
of the difficulty in obtaining large datasets of fully labelled instances for
training. In this paper we try to overcome this difficulty by presenting a
multi-utility learning framework for structured prediction that can learn from
training instances with different forms of supervision. We propose a unified
technique for inferring the loss functions most suitable for quantifying the
consistency of solutions with the given weak annotation. We demonstrate the
effectiveness of our framework on the challenging semantic image segmentation
problem for which a wide variety of annotations can be used. For instance, the
popular training datasets for semantic segmentation are composed of images with
hard-to-generate full pixel labellings, as well as images with easy-to-obtain
weak annotations, such as bounding boxes around objects, or image-level labels
that specify which object categories are present in an image. Experimental
evaluation shows that the use of annotation-specific loss functions
dramatically improves segmentation accuracy compared to the baseline system
where only one type of weak annotation is used
Semantic Image Segmentation via Deep Parsing Network
This paper addresses semantic image segmentation by incorporating rich
information into Markov Random Field (MRF), including high-order relations and
mixture of label contexts. Unlike previous works that optimized MRFs using
iterative algorithm, we solve MRF by proposing a Convolutional Neural Network
(CNN), namely Deep Parsing Network (DPN), which enables deterministic
end-to-end computation in a single forward pass. Specifically, DPN extends a
contemporary CNN architecture to model unary terms and additional layers are
carefully devised to approximate the mean field algorithm (MF) for pairwise
terms. It has several appealing properties. First, different from the recent
works that combined CNN and MRF, where many iterations of MF were required for
each training image during back-propagation, DPN is able to achieve high
performance by approximating one iteration of MF. Second, DPN represents
various types of pairwise terms, making many existing works as its special
cases. Third, DPN makes MF easier to be parallelized and speeded up in
Graphical Processing Unit (GPU). DPN is thoroughly evaluated on the PASCAL VOC
2012 dataset, where a single DPN model yields a new state-of-the-art
segmentation accuracy.Comment: To appear in International Conference on Computer Vision (ICCV) 201
Joint Optical Flow and Temporally Consistent Semantic Segmentation
The importance and demands of visual scene understanding have been steadily
increasing along with the active development of autonomous systems.
Consequently, there has been a large amount of research dedicated to semantic
segmentation and dense motion estimation. In this paper, we propose a method
for jointly estimating optical flow and temporally consistent semantic
segmentation, which closely connects these two problem domains and leverages
each other. Semantic segmentation provides information on plausible physical
motion to its associated pixels, and accurate pixel-level temporal
correspondences enhance the accuracy of semantic segmentation in the temporal
domain. We demonstrate the benefits of our approach on the KITTI benchmark,
where we observe performance gains for flow and segmentation. We achieve
state-of-the-art optical flow results, and outperform all published algorithms
by a large margin on challenging, but crucial dynamic objects.Comment: 14 pages, Accepted for CVRSUAD workshop at ECCV 201
- …