Improving 6D Pose Estimation of Objects in Clutter via Physics-aware Monte Carlo Tree Search
This work proposes a process for efficiently searching over combinations of
individual object 6D pose hypotheses in cluttered scenes, especially in cases
involving occlusions and objects resting on each other. The initial set of
candidate object poses is generated from state-of-the-art object detection and
global point cloud registration techniques. The pose that these techniques
score highest for each object may not be accurate due to overlaps and occlusions.
Nevertheless, experimental indications provided in this work show that object
poses with lower ranks may be closer to the real poses than ones with high
ranks according to registration techniques. This motivates a global
optimization process for improving these poses by taking into account
scene-level physical interactions between objects. It also implies that the
Cartesian product of candidate poses for interacting objects must be searched
so as to identify the best scene-level hypothesis. To perform the search
efficiently, the candidate poses for each object are clustered so as to reduce
their number but still keep a sufficient diversity. Then, searching over the
combinations of candidate object poses is performed through a Monte Carlo Tree
Search (MCTS) process that uses the similarity between the observed depth image
of the scene and a rendering of the scene given the hypothesized pose as a
score that guides the search procedure. MCTS handles in a principled way the
tradeoff between fine-tuning the most promising poses and exploring new ones,
by using the Upper Confidence Bound (UCB) technique. Experimental results
indicate that this process is able to quickly identify in cluttered scenes
physically-consistent object poses that are significantly closer to ground
truth compared to poses found by point cloud registration methods.
Comment: 8 pages, 4 figures
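The exploration/exploitation tradeoff mentioned in the abstract above can be illustrated with the standard UCB1 rule. This is a minimal sketch, not the paper's implementation: the function names, the `(total_reward, visits)` representation of a child node, and the default exploration constant `c` are all assumptions made for illustration. In the paper's setting the reward would come from comparing the rendered scene with the observed depth image, which is not modeled here.

```python
import math

def ucb1_score(mean_reward, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value for one child node (hypothetical helper).

    Unvisited children get +inf so that each is tried at least once.
    """
    if child_visits == 0:
        return math.inf
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + exploration

def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 value.

    children: list of (total_reward, visits) tuples, one per candidate pose.
    """
    scores = [
        ucb1_score(total / visits if visits else 0.0, visits, parent_visits, c)
        for total, visits in children
    ]
    return max(range(len(children)), key=scores.__getitem__)
```

With `c = 0` the rule reduces to greedy selection of the best mean reward; larger values of `c` favor rarely visited candidates.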
Image Segmentation Using Weak Shape Priors
The problem of image segmentation is known to become particularly challenging
in the case of partial occlusion of the object(s) of interest, background
clutter, and the presence of strong noise. To overcome this problem, the
present paper introduces a novel approach to segmentation through the use of
"weak" shape priors. Specifically, in the proposed method, a segmenting active
contour is constrained to converge to a configuration at which its geometric
parameters attain their empirical probability densities closely matching the
corresponding model densities that are learned based on training samples. It is
shown through numerical experiments that the proposed shape modeling can be
regarded as "weak" in the sense that it minimally influences the segmentation,
which is allowed to be dominated by data-related forces. On the other hand, the
priors provide sufficient constraints to regularize the convergence of
segmentation, while requiring substantially smaller training sets to yield less
biased results as compared to the case of PCA-based regularization methods. The
main advantages of the proposed technique over some existing alternatives are
demonstrated in a series of experiments.
Comment: 27 pages, 8 figures
3D Bounding Box Estimation Using Deep Learning and Geometry
We present a method for 3D object detection and pose estimation from a single
image. In contrast to current techniques that only regress the 3D orientation
of an object, our method first regresses relatively stable 3D object properties
using a deep convolutional neural network and then combines these estimates
with geometric constraints provided by a 2D object bounding box to produce a
complete 3D bounding box. The first network output estimates the 3D object
orientation using a novel hybrid discrete-continuous loss, which significantly
outperforms the L2 loss. The second output regresses the 3D object dimensions,
which have relatively little variance compared to alternatives and can often be
predicted for many object types. These estimates, combined with the geometric
constraints on translation imposed by the 2D bounding box, enable us to recover
a stable and accurate 3D object pose. We evaluate our method on the challenging
KITTI object detection benchmark both on the official metric of 3D orientation
estimation and also on the accuracy of the obtained 3D bounding boxes. Although
conceptually simple, our method outperforms more complex and computationally
expensive approaches that leverage semantic segmentation, instance-level
segmentation, flat ground priors, and sub-category detection. Our
discrete-continuous loss also produces state-of-the-art results for 3D
viewpoint estimation on the Pascal 3D+ dataset.
Comment: To appear in IEEE Conference on Computer Vision and Pattern
Recognition (CVPR) 201
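As a rough illustration of why orientation and dimensions pin down most of a 3D bounding box, the sketch below computes the eight box corners from those quantities. It is not the paper's method: the translation is simply passed in as `center` rather than recovered from the 2D bounding box constraints, and the axis convention (rotation by `yaw` about the z axis) and the function name `box_corners` are assumptions chosen for illustration.

```python
import itertools
import math

def box_corners(dims, center, yaw):
    """Return the 8 corners of a 3D box as (x, y, z) tuples.

    dims:   (length, width, height) along the box's own x, y, z axes
    center: (x, y, z) of the box center (assumed known in this sketch)
    yaw:    rotation in radians about the z axis (assumed convention)
    """
    length, width, height = dims
    cx, cy, cz = center
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx, sy, sz in itertools.product((-0.5, 0.5), repeat=3):
        # Corner offset in the box frame.
        x, y, z = sx * length, sy * width, sz * height
        # Rotate about z, then translate to the box center.
        corners.append((cx + cos_y * x - sin_y * y,
                        cy + sin_y * x + cos_y * y,
                        cz + z))
    return corners
```

Projecting these corners into the image and requiring them to fit tightly inside a 2D box is the kind of geometric constraint the abstract refers to; that fitting step is left out here.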
Manipulating Highly Deformable Materials Using a Visual Feedback Dictionary
The complex physical properties of highly deformable materials such as
clothes pose significant challenges for autonomous robotic manipulation
systems. We present a novel visual feedback dictionary-based method for
manipulating deformable objects towards a desired configuration. Our approach is based
on visual servoing and we use an efficient technique to extract key features
from the RGB sensor stream in the form of a histogram of deformable model
features. These histogram features serve as high-level representations of the
state of the deformable material. Next, we collect manipulation data and use a
visual feedback dictionary that maps the velocity in the high-dimensional
feature space to the velocity of the robotic end-effectors for manipulation. We
have evaluated our approach on a set of complex manipulation tasks and
human-robot manipulation tasks on different cloth pieces with varying material
characteristics.
Comment: The video is available at goo.gl/mDSC4