Predicting Out-of-View Feature Points for Model-Based Camera Pose Estimation
In this work we present a novel framework that uses deep learning to predict
object feature points that are out-of-view in the input image. This system was
developed with the application of model-based tracking in mind, particularly in
the case of autonomous inspection robots, where only partial views of the
object are available. Out-of-view prediction is enabled by applying scaling to
the feature point labels during network training. This is combined with a
recurrent neural network architecture designed to provide the final prediction
layers with rich feature information from across the spatial extent of the
input image. To show the versatility of these out-of-view predictions, we
describe how to integrate them in both a particle filter tracker and an
optimisation-based tracker. To evaluate our framework, we compared it against
one that predicts only points inside the image. We show that as the amount of
the object in view decreases, the ability to predict outside the image bounds
adds robustness to the final pose estimation.
Comment: Submitted to IROS 201
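The label-scaling idea above can be sketched concretely: if feature-point targets are expressed in a normalised frame instead of being clipped to the image, a point that falls outside the view simply becomes a target outside the nominal range, and the regressor can still be supervised on it. The exact normalisation below (centring on the image and dividing by width/height) is an illustrative assumption, not the paper's precise scheme:

```python
import numpy as np

def scale_labels(points_px, img_w, img_h):
    """Map pixel coordinates to a normalised frame centred on the image.

    Points inside the image land in [-0.5, 0.5]; out-of-view points
    land outside that range, so the regression target stays well
    defined beyond the image bounds.
    """
    pts = np.asarray(points_px, dtype=float)
    return np.stack([(pts[:, 0] - img_w / 2) / img_w,
                     (pts[:, 1] - img_h / 2) / img_h], axis=1)

def unscale_labels(points_norm, img_w, img_h):
    """Inverse mapping back to pixel coordinates (possibly off-image)."""
    pts = np.asarray(points_norm, dtype=float)
    return np.stack([pts[:, 0] * img_w + img_w / 2,
                     pts[:, 1] * img_h + img_h / 2], axis=1)
```

At inference time, predictions outside [-0.5, 0.5] can be unscaled into off-image pixel coordinates and fed to the particle-filter or optimisation-based tracker like any in-view point.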
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
The goal of this paper is to take a single 2D image of a scene and recover
the 3D structure in terms of a small set of factors: a layout representing the
enclosing surfaces as well as a set of objects represented in terms of shape
and pose. We propose a convolutional neural network-based approach to predict
this representation and benchmark it on a large dataset of indoor scenes. Our
experiments evaluate a number of practical design questions, demonstrate that
we can infer this representation, and quantitatively and qualitatively
demonstrate its merits compared to alternate representations.
Comment: Project URL with code: https://shubhtuls.github.io/factored3
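The factored output described above (an enclosing layout plus a set of per-object shape and pose factors) might be held in a simple container type. The specific fields below (a latent shape code, rotation, translation, per-axis scale, and a depth-map-like layout) are illustrative assumptions, not the paper's exact parameterisation:

```python
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class ObjectFactor:
    """One object, factored into shape and pose (assumed fields)."""
    shape_code: np.ndarray   # latent shape descriptor
    rotation: np.ndarray     # 3x3 rotation matrix
    translation: np.ndarray  # 3-vector position in the scene frame
    scale: np.ndarray        # per-axis scale

@dataclass
class FactoredScene:
    """Layout of enclosing surfaces plus a list of object factors."""
    layout: np.ndarray       # e.g. a coarse disparity map of walls/floor
    objects: List[ObjectFactor] = field(default_factory=list)
```

A CNN predicting this representation would emit the layout map from one head and, per detected object, the shape code and pose parameters from others.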
3D Bounding Box Estimation Using Deep Learning and Geometry
We present a method for 3D object detection and pose estimation from a single
image. In contrast to current techniques that only regress the 3D orientation
of an object, our method first regresses relatively stable 3D object properties
using a deep convolutional neural network and then combines these estimates
with geometric constraints provided by a 2D object bounding box to produce a
complete 3D bounding box. The first network output estimates the 3D object
orientation using a novel hybrid discrete-continuous loss, which significantly
outperforms the L2 loss. The second output regresses the 3D object dimensions,
which have relatively little variance compared to alternatives and can often be
predicted for many object types. These estimates, combined with the geometric
constraints on translation imposed by the 2D bounding box, enable us to recover
a stable and accurate 3D object pose. We evaluate our method on the challenging
KITTI object detection benchmark both on the official metric of 3D orientation
estimation and also on the accuracy of the obtained 3D bounding boxes. Although
conceptually simple, our method outperforms more complex and computationally
expensive approaches that leverage semantic segmentation, instance-level
segmentation, flat ground priors, and sub-category detection. Our
discrete-continuous loss also produces state-of-the-art results for 3D
viewpoint estimation on the Pascal 3D+ dataset.
Comment: To appear in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 201
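The hybrid discrete-continuous orientation loss can be sketched as: split the angle range into overlapping bins, classify which bin the orientation falls in, and regress the angular residual from that bin's centre. The following is a hypothetical NumPy re-implementation of that idea, not the authors' code; the bin count, overlap, and exact loss terms are assumptions:

```python
import numpy as np

def multibin_targets(theta, n_bins=2, overlap=0.1):
    """Encode angle theta (radians) as (active-bin mask, residuals).

    Each bin covers 2*pi/n_bins radians, widened by `overlap` so
    neighbouring bins share boundary angles.
    """
    bin_width = 2 * np.pi / n_bins
    centres = np.arange(n_bins) * bin_width          # bin centre angles
    # angular residual to each centre, wrapped to [-pi, pi)
    resid = (theta - centres + np.pi) % (2 * np.pi) - np.pi
    active = np.abs(resid) < bin_width / 2 + overlap  # overlapping coverage
    return active.astype(float), resid

def multibin_loss(conf_logits, sin_cos_pred, theta, n_bins=2):
    """Bin classification (softmax CE) + residual regression (cosine)."""
    active, resid = multibin_targets(theta, n_bins)
    # softmax cross-entropy on the bin confidences
    p = np.exp(conf_logits - conf_logits.max())
    p /= p.sum()
    cls_loss = -np.log(p[active.argmax()] + 1e-9)
    # regression: predicted residual angle from (sin, cos) outputs,
    # penalised by 1 - cos(error) over the active bins
    pred_angle = np.arctan2(sin_cos_pred[:, 0], sin_cos_pred[:, 1])
    reg_loss = np.mean(1 - np.cos(pred_angle[active > 0] - resid[active > 0]))
    return cls_loss + reg_loss
```

Regressing (sin, cos) rather than a raw angle keeps the residual output continuous across the wrap-around, which is one plausible reason a formulation like this outperforms a plain L2 loss on angles.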