Monocular Object Instance Segmentation and Depth Ordering with CNNs
In this paper we tackle the problem of instance-level segmentation and depth
ordering from a single monocular image. Towards this goal, we take advantage of
convolutional neural nets and train them to directly predict instance-level
segmentations where the instance ID encodes the depth ordering within image
patches. To provide a coherent single explanation of an image we develop a
Markov random field which takes as input the predictions of convolutional
neural nets applied at overlapping patches of different resolutions, as well as
the output of a connected component algorithm. It aims to predict accurate
instance-level segmentation and depth ordering. We demonstrate the
effectiveness of our approach on the challenging KITTI benchmark and show good
performance on both tasks.
Comment: International Conference on Computer Vision (ICCV), 201
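The abstract above lists the output of a connected component algorithm as one input to the Markov random field, without saying which variant is used. As a rough illustration only, the sketch below labels 4-connected foreground regions of a binary mask with a breadth-first search. The function name `label_components`, the choice of 4-connectivity, and the toy mask are assumptions made for this example, not details taken from the paper.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask (nested lists of 0/1).

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for the components in scan order.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or labels[r][c]:
                continue  # background, or already assigned to a component
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit the four axis-aligned neighbours (assumed connectivity).
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy mask with two separate blobs (illustrative data, not from KITTI).
toy_mask = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
toy_labels, toy_count = label_components(toy_mask)
```

In this toy case the two blobs receive distinct IDs. Assigning IDs according to depth order, as the paper describes, would require the additional learned predictions that this sketch does not model.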
Pixelwise Instance Segmentation with a Dynamically Instantiated Network
Semantic segmentation and object detection research have recently achieved
rapid progress. However, the former task has no notion of different instances
of the same object, and the latter operates at a coarse, bounding-box level. We
propose an Instance Segmentation system that produces a segmentation map where
each pixel is assigned an object class and instance identity label. Most
approaches adapt object detectors to produce segments instead of boxes. In
contrast, our method is based on an initial semantic segmentation module, which
feeds into an instance subnetwork. This subnetwork uses the initial
category-level segmentation, along with cues from the output of an object
detector, within an end-to-end CRF to predict instances. This part of our model
is dynamically instantiated to produce a variable number of instances per
image. Our end-to-end approach requires no post-processing and considers the
image holistically, instead of processing independent proposals. Therefore,
unlike some related work, a pixel cannot belong to multiple instances.
Furthermore, far more precise segmentations are achieved, as shown by our
state-of-the-art results (particularly at high IoU thresholds) on the Pascal
VOC and Cityscapes datasets.
Comment: CVPR 201
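The abstract above reports results at high IoU thresholds. As a minimal sketch of what that criterion measures, the code below computes the intersection-over-union of one predicted and one ground-truth binary mask and checks it against two thresholds. The function name `mask_iou`, the toy masks, and the thresholds 0.5 and 0.7 are assumptions chosen for illustration; the full benchmark protocols (matching of predictions to ground truth, averaging over classes) are not reproduced here.

```python
def mask_iou(pred, truth):
    """Intersection-over-union of two same-sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for row_pred, row_truth in zip(pred, truth):
        for p, t in zip(row_pred, row_truth):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks have no union; return 0.0 by convention in this sketch.
    return intersection / union if union else 0.0


# Illustrative masks: the prediction covers 4 of the 6 ground-truth pixels.
pred_mask = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
truth_mask = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
]
iou = mask_iou(pred_mask, truth_mask)  # 4 / 6
counts_at_050 = iou >= 0.5  # accepted at a loose threshold
counts_at_070 = iou >= 0.7  # rejected at a stricter threshold
```

The same prediction counts as correct at 0.5 but not at 0.7, which is why gains at high thresholds indicate more precise segment boundaries.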
The Cityscapes Dataset for Semantic Urban Scene Understanding
Visual understanding of complex urban street scenes is an enabling factor for
a wide range of applications. Object detection has benefited enormously from
large-scale datasets, especially in the context of deep learning. For semantic
urban scene understanding, however, no current dataset adequately captures the
complexity of real-world urban scenes.
To address this, we introduce Cityscapes, a benchmark suite and large-scale
dataset to train and test approaches for pixel-level and instance-level
semantic labeling. Cityscapes comprises a large, diverse set of stereo
video sequences recorded in streets from 50 different cities. 5000 of these
images have high quality pixel-level annotations; 20000 additional images have
coarse annotations to enable methods that leverage large volumes of
weakly-labeled data. Crucially, our effort exceeds previous attempts in terms
of dataset size, annotation richness, scene variability, and complexity. Our
accompanying empirical study provides an in-depth analysis of the dataset
characteristics, as well as a performance evaluation of several
state-of-the-art approaches based on our benchmark.
Comment: Includes supplemental material
Deep context modeling for semantic segmentation
Deep convolutional neural networks (DCNNs) have been employed in many computer vision tasks with great success due to their robustness in feature learning. One of the advantages of DCNNs is the robustness of their representations to object location, which is useful for object recognition tasks. However, this robustness also discards spatial information, which is needed when dealing with the topological information of an image (e.g. scene parsing, face recognition). Adopting graphical models (GMs) to incorporate spatial and contextual information into DCNNs is expected to improve the performance of DCNN-based computer vision tasks. Recent research has shown that combining DCNNs and Conditional Random Fields (CRFs) can significantly improve scene parsing accuracy. This is achieved either by combining their independent outputs or by applying them as a cascade. In this work, we propose a novel strategy to incorporate CRFs deeper inside DCNNs by modeling a CRF as a DCNN layer that can be plugged into any layer of a DCNN. This implants spatial and contextual information into the DCNN, allowing end-to-end training, better control of the spatial constraints, and improved segmentation accuracy. The new strategy for coupling graphical models with a state-of-the-art fully convolutional neural network has shown promising results on the PASCAL-Context dataset.
Leaf segmentation in plant phenotyping: a collation study
Image-based plant phenotyping is a growing application area of computer vision in agriculture. A key task is the segmentation of all individual leaves in images. Here we focus on the most common rosette model plants, Arabidopsis and young tobacco. Although leaves do share appearance and shape characteristics, the presence of occlusions and variability in leaf shape and pose, as well as imaging conditions, render this problem challenging. The aim of this paper is to compare several leaf segmentation solutions on a unique and first-of-its-kind dataset containing images from typical phenotyping experiments. In particular, we report and discuss methods and findings of a collection of submissions for the first Leaf Segmentation Challenge of the Computer Vision Problems in Plant Phenotyping workshop in 2014. Four methods are presented: three segment leaves by processing the distance transform in an unsupervised fashion, and the fourth uses optimal template selection and Chamfer matching. Overall, we find that although separating plant from background can be accomplished with satisfactory accuracy (>90% Dice score), individual leaf segmentation and counting remain challenging when leaves overlap. Additionally, accuracy is lower for younger leaves. We also find that variability in datasets does affect outcomes. Our findings motivate further investigation and development of specialized algorithms for this particular application, and suggest that challenges of this form are ideally suited for advancing the state of the art. Data are publicly available (online at http://www.plant-phenotyping.org/datasets) to support future challenges beyond segmentation within this application domain.
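The abstract above reports plant-versus-background accuracy as a Dice score. As a minimal sketch, assuming the standard definition 2|A ∩ B| / (|A| + |B|) for two binary masks, the score can be computed as follows. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for this example, not details of the challenge's evaluation code.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two same-sized binary masks."""
    overlap = sum(
        1
        for row_pred, row_truth in zip(pred, truth)
        for p, t in zip(row_pred, row_truth)
        if p and t
    )
    size_pred = sum(1 for row in pred for p in row if p)
    size_truth = sum(1 for row in truth for t in row if t)
    total = size_pred + size_truth
    # Assumed convention: two empty masks agree perfectly.
    return 2.0 * overlap / total if total else 1.0


# Illustrative foreground masks: 4 predicted pixels, 6 true pixels, 4 shared.
pred_plant = [
    [1, 1, 0],
    [1, 1, 0],
]
true_plant = [
    [1, 1, 1],
    [1, 1, 1],
]
dice = dice_score(pred_plant, true_plant)  # 2 * 4 / (4 + 6) = 0.8
```

A score above 0.9, as reported for plant-versus-background separation, therefore means the overlap is large relative to the combined size of the two masks.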