SegICP: Integrated Deep Semantic Segmentation and Pose Estimation

Author: Chipalkatty Rahul
Hamilton Lei
Hebert Mitchell
Johnson David M. S.
Kee Vincent
Le Tiffany
Mariottini Gian-Luca
Schneider Abraham
Torralba Antonio
Wagner Syler
Wong Jay M.
Wu Jimmy
Zhou Bolei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/09/2017

Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios. To improve these systems' perceptive speed and robustness, we present SegICP, a novel integrated solution to object recognition and pose estimation. SegICP couples convolutional neural networks and multi-hypothesis point cloud registration to achieve both robust pixel-wise semantic segmentation and accurate, real-time 6-DOF pose estimation for relevant objects. Our architecture achieves 1 cm position error and $<5^\circ$ angle error in real time without an initial seed. We evaluate and benchmark SegICP against an annotated dataset generated by motion capture. Comment: IROS camera-read
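The coupling of segmentation with point cloud registration can be illustrated with a minimal sketch. It is not the authors' implementation: it works in 2D so that the rigid fit has a closed form using only the standard library, it runs a single-hypothesis point-to-point ICP rather than a multi-hypothesis scheme, and every name (`fit_rigid_2d`, `transform`, `icp_2d`, `segmented_icp`) is hypothetical. The per-point labels are assumed to come from some segmentation network.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares angle and translation mapping src[i] onto dst[i]."""
    n = len(src)
    scx = sum(x for x, _ in src) / n
    scy = sum(y for _, y in src) / n
    dcx = sum(x for x, _ in dst) / n
    dcy = sum(y for _, y in dst) / n
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay, bx, by = sx - scx, sy - scy, dx - dcx, dy - dcy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # translation = dst centroid minus rotated src centroid
    return theta, (dcx - (c * scx - s * scy), dcy - (s * scx + c * scy))

def transform(theta, t, pts):
    """Rotate each point by theta about the origin, then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in pts]

def icp_2d(model, target, iters=10):
    """Point-to-point ICP: alternate nearest-neighbour matching and rigid fitting."""
    theta_total, t_total = 0.0, (0.0, 0.0)
    cur = list(model)
    for _ in range(iters):
        # brute-force nearest neighbour in the target for each current point
        matched = [min(target, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in cur]
        theta, t = fit_rigid_2d(cur, matched)
        cur = transform(theta, t, cur)
        # compose the incremental transform with the accumulated one
        t_total = transform(theta, t, [t_total])[0]
        theta_total += theta
    return theta_total, t_total

def segmented_icp(scene_points, scene_labels, model, label):
    """Register the model only against scene points carrying the given label."""
    target = [p for p, l in zip(scene_points, scene_labels) if l == label]
    return icp_2d(model, target)
```

Restricting the target to one label keeps clutter out of the nearest-neighbour search, which is the point of segmenting first. In 3D the closed-form angle is typically replaced by an SVD-based rigid fit, and a multi-hypothesis variant could run the registration from several initial poses and keep the lowest-residual result.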

arXiv.org e-Print Archive

Crossref

ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation

Author: Bengio Yoshua
Cho Kyunghyun
Ciccone Marco
Courville Aaron
Kastner Kyle
Matteucci Matteo
Romero Adriana
Visin Francesco
Publication date: 01/01/2016

We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of Recurrent Neural Networks (RNNs) to retrieve distant dependencies. The proposed architecture, called ReSeg, is based on the recently introduced ReNet model for image classification. We modify and extend it to perform the more challenging task of semantic segmentation. Each ReNet layer is composed of four RNNs that sweep the image horizontally and vertically in both directions, encoding patches or activations, and providing relevant global information. Moreover, ReNet layers are stacked on top of pre-trained convolutional layers, benefiting from generic local features. Upsampling layers follow ReNet layers to recover the original image resolution in the final predictions. The proposed ReSeg architecture is efficient, flexible and suitable for a variety of semantic segmentation tasks. We evaluate ReSeg on several widely used semantic segmentation datasets (Weizmann Horse, Oxford Flower, and CamVid), achieving state-of-the-art performance. Results show that ReSeg can act as a suitable architecture for semantic segmentation tasks, and may have further applications in other structured prediction problems. The source code and model hyperparameters are available at https://github.com/fvisin/reseg. Comment: In CVPR Deep Vision Workshop, 201
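The four-direction sweep of a ReNet layer can be sketched in a few lines. This is a toy illustration under stated assumptions, not the released ReSeg code: each RNN is a plain tanh recurrence with a single hidden unit, the weights are hand-picked rather than learned, the horizontal sweeps are assumed to run before the vertical ones, and the names `rnn_sweep` and `renet_layer` are hypothetical. The pre-trained convolutional layers and the upsampling layers are not modelled.

```python
import math

def rnn_sweep(seq, w_in, w_rec):
    """Plain tanh RNN with one hidden unit; returns the hidden state at every step."""
    h, out = 0.0, []
    for x in seq:
        h = math.tanh(sum(w * v for w, v in zip(w_in, x)) + w_rec * h)
        out.append(h)
    return out

def renet_layer(fmap, p):
    """fmap: H x W grid of feature vectors; p: weights (w_in, w_rec) for four RNNs.

    Returns an H x W grid of 2-vectors [down_state, up_state].
    """
    H, W = len(fmap), len(fmap[0])
    # Two horizontal RNNs sweep every row in opposite directions.
    rows = []
    for row in fmap:
        right = rnn_sweep(row, *p["right"])
        left = rnn_sweep(row[::-1], *p["left"])[::-1]
        rows.append([[r, l] for r, l in zip(right, left)])
    # Two vertical RNNs sweep every column of the horizontal output.
    out = [[None] * W for _ in range(H)]
    for j in range(W):
        col = [rows[i][j] for i in range(H)]
        down = rnn_sweep(col, *p["down"])
        up = rnn_sweep(col[::-1], *p["up"])[::-1]
        for i in range(H):
            out[i][j] = [down[i], up[i]]
    return out

# Hand-picked toy weights (an assumption; a real model learns them).
params = {
    "right": ([1.0], 0.5), "left": ([1.0], 0.5),
    "down": ([1.0, 1.0], 0.5), "up": ([1.0, 1.0], 0.5),
}
```

After both passes, every output position depends on the whole input grid: a change in the top-left pixel reaches the bottom-right output through the rightward sweep followed by the downward sweep. That is the sense in which the recurrent sweeps supply global context to otherwise local features.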

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Visual Chunking: A List Prediction Framework for Region-Based Object Detection

Author: Bagnell J. Andrew
Hebert Martial
Rhinehart Nicholas
Zhou Jiaji
Publication date: 16/03/2015

We consider detecting objects in an image by iteratively selecting from a set of arbitrarily shaped candidate regions. Our generic approach, which we term visual chunking, reasons about the locations of multiple object instances in an image while expressively describing object boundaries. We design an optimization criterion for measuring the performance of a list of such detections as a natural extension to a common per-instance metric. We present an efficient algorithm with provable performance for building a high-quality list of detections from any candidate set of region-based proposals. We also develop a simple class-specific algorithm to generate a candidate region instance in near-linear time in the number of low-level superpixels that outperforms other region generating methods. In order to make predictions on novel images at testing time without access to ground truth, we develop learning approaches to emulate these algorithms' behaviors. We demonstrate that our new approach outperforms sophisticated baselines on benchmark datasets. Comment: to appear at ICRA 201
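Building a list of detections from a candidate pool can be sketched as greedy selection by marginal gain. The abstract does not spell out its criterion, so the objective here is an assumption: the sum, over ground-truth instances, of the best intersection-over-union achieved by any selected region. Regions are modelled as sets of pixel ids, and the names `iou`, `list_score`, and `greedy_list` are hypothetical. Because the sketch reads the ground truth directly, it corresponds to a training-time oracle, not to the learned predictor used at test time.

```python
def iou(a, b):
    """Intersection over union of two regions given as sets of pixel ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def list_score(selected, truths):
    """Sum over ground-truth instances of the best IoU among selected regions."""
    return sum(max((iou(r, g) for r in selected), default=0.0) for g in truths)

def greedy_list(candidates, truths, budget):
    """Pick up to `budget` candidate indices, each maximising the marginal gain."""
    chosen = []
    remaining = list(range(len(candidates)))
    for _ in range(budget):
        if not remaining:
            break
        base = list_score([candidates[i] for i in chosen], truths)
        gains = {
            i: list_score([candidates[j] for j in chosen + [i]], truths) - base
            for i in remaining
        }
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break  # no remaining region improves the list
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

An objective of this max-coverage form is monotone submodular, which is the property that gives greedy selection its classical approximation guarantee; whether the paper's criterion has exactly this form is not stated in the abstract.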

arXiv.org e-Print Archive

CiteSeerX

Crossref

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

Author: Fidler Sanja
Salakhutdinov Ruslan
Urtasun Raquel
Zhu Yukun
Publication date: 01/01/2015

In this paper, we propose an approach that exploits object segmentation in order to improve the accuracy of object detection. We frame the problem as inference in a Markov Random Field, in which each detection hypothesis scores object appearance as well as contextual information using Convolutional Neural Networks, and allows the hypothesis to choose and score a segment out of a large pool of accurate object segmentation proposals. This enables the detector to incorporate additional evidence when it is available and thus results in more accurate detections. Our experiments show an improvement of 4.1% in mAP over the R-CNN baseline on PASCAL VOC 2010, and 3.4% over the current state-of-the-art, demonstrating the power of our approach.

arXiv.org e-Print Archive

CiteSeerX

Crossref