SegICP: Integrated Deep Semantic Segmentation and Pose Estimation
Recent robotic manipulation competitions have highlighted that sophisticated
robots still struggle to achieve fast and reliable perception of task-relevant
objects in complex, realistic scenarios. To improve these systems' perceptive
speed and robustness, we present SegICP, a novel integrated solution to object
recognition and pose estimation. SegICP couples convolutional neural networks
and multi-hypothesis point cloud registration to achieve both robust pixel-wise
semantic segmentation as well as accurate and real-time 6-DOF pose estimation
for relevant objects. Our architecture achieves 1 cm position error and
$<5^\circ$ angle error in real time without an initial seed. We evaluate and
benchmark SegICP against an annotated dataset generated by motion capture.
Comment: IROS camera-read
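The abstract reports position and angle errors for 6-DOF poses but does not say how they are measured. As a rough illustration only, the sketch below assumes the position error is the Euclidean distance between the estimated and ground-truth translations, and the angle error is the angle of the relative rotation between the two rotation matrices. The function names (`position_error`, `rotation_angle_error_deg`, `rot_z`) are hypothetical and are not taken from SegICP.

```python
import math

def position_error(t_est, t_gt):
    """Euclidean distance between two 3-D translations (same units as the inputs)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_est, t_gt)))

def rotation_angle_error_deg(R_est, R_gt):
    """Angle in degrees of the relative rotation R_est^T R_gt.

    Assumed metric, not confirmed by the abstract.
    Uses trace(R_est^T R_gt) = sum over i, j of R_est[i][j] * R_gt[i][j].
    """
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rot_z(deg):
    """Helper for the example: 3x3 rotation about the z axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # A pose that is off by 1 cm (inputs in metres) and 4 degrees about z.
    print(position_error((0.0, 0.0, 0.0), (0.006, 0.008, 0.0)))
    print(rotation_angle_error_deg(rot_z(0.0), rot_z(4.0)))
```

Under these assumed definitions, a pose estimate would meet the figures quoted in the abstract when the first value is about 0.01 m and the second is below 5.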
ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation
We propose a structured prediction architecture, which exploits the local
generic features extracted by Convolutional Neural Networks and the capacity of
Recurrent Neural Networks (RNNs) to retrieve distant dependencies. The proposed
architecture, called ReSeg, is based on the recently introduced ReNet model for
image classification. We modify and extend it to perform the more challenging
task of semantic segmentation. Each ReNet layer is composed of four RNNs that
sweep the image horizontally and vertically in both directions, encoding
patches or activations, and providing relevant global information. Moreover,
ReNet layers are stacked on top of pre-trained convolutional layers, benefiting
from generic local features. Upsampling layers follow ReNet layers to recover
the original image resolution in the final predictions. The proposed ReSeg
architecture is efficient, flexible and suitable for a variety of semantic
segmentation tasks. We evaluate ReSeg on several widely used semantic
segmentation datasets: Weizmann Horse, Oxford Flower, and CamVid, achieving
state-of-the-art performance. Results show that ReSeg can act as a suitable
architecture for semantic segmentation tasks, and may have further applications
in other structured prediction problems. The source code and model
hyperparameters are available on https://github.com/fvisin/reseg.
Comment: In CVPR Deep Vision Workshop, 201
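The four-direction sweep described in the abstract can be sketched in a few lines. This is a toy under stated assumptions, not the ReSeg implementation: the recurrent cell here is a plain tanh unit with a scalar hidden state, the weights are fixed numbers rather than learned parameters, and the names `sweep`, `bidirectional`, and `renet_like_layer` are hypothetical. It also assumes the vertical pass runs first and the horizontal pass consumes its output; the abstract does not state the order.

```python
import math

def sweep(seq, w_x, w_h):
    """Run a toy tanh RNN with a scalar hidden state along a sequence of feature vectors."""
    h, out = 0.0, []
    for x in seq:
        h = math.tanh(sum(w * xi for w, xi in zip(w_x, x)) + w_h * h)
        out.append(h)
    return out

def bidirectional(seq, fwd, bwd):
    """Concatenate a forward and a backward sweep: 2 features per position."""
    f = sweep(seq, *fwd)
    b = sweep(seq[::-1], *bwd)[::-1]
    return [[fi, bi] for fi, bi in zip(f, b)]

def renet_like_layer(image, params):
    """image: H x W grid of feature vectors; params: four (w_x, w_h) pairs (assumed)."""
    H, W = len(image), len(image[0])
    # Vertical pass: sweep every column top-down and bottom-up.
    cols = [bidirectional([image[r][c] for r in range(H)], params[0], params[1])
            for c in range(W)]
    vert = [[cols[c][r] for c in range(W)] for r in range(H)]
    # Horizontal pass: sweep every row of the vertical output in both directions.
    return [bidirectional(row, params[2], params[3]) for row in vert]

if __name__ == "__main__":
    image = [[[1.0], [0.2], [0.4]],
             [[0.3], [0.5], [0.1]]]          # 2 x 3 grid, 1 feature per pixel
    params = [([1.0], 0.5), ([1.0], 0.5),    # vertical RNNs take 1 input feature
              ([1.0, 1.0], 0.5), ([1.0, 1.0], 0.5)]  # horizontal RNNs take 2
    for row in renet_like_layer(image, params):
        print(row)
```

After both passes, every output position depends on every input pixel, which is one way to read the abstract's claim that the layer provides global information.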
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding
Recent trends in image understanding have pushed for holistic scene
understanding models that jointly reason about various tasks such as object
detection, scene recognition, shape analysis, contextual reasoning, and local
appearance based classifiers. In this work, we are interested in understanding
the roles of these different tasks in improved scene understanding, in
particular semantic segmentation, object detection and scene recognition.
Towards this goal, we "plug in" human subjects for each of the various
components in a state-of-the-art conditional random field model. Comparisons
among various hybrid human-machine CRFs give us indications of how much "head
room" there is to improve scene understanding by focusing research efforts on
various individual tasks
- …