A General Framework for Flexible Multi-Cue Photometric Point Cloud Registration
The ability to build maps is a key functionality for the majority of mobile
robots. A central ingredient to most mapping systems is the registration or
alignment of the recorded sensor data. In this paper, we present a general
methodology for photometric registration that can deal with multiple different
cues. We provide examples for registering RGBD as well as 3D LIDAR data. In
contrast to popular point cloud registration approaches such as ICP, our method
does not rely on explicit data association and exploits multiple modalities
such as raw range and image data streams. Color, depth, and normal information
are handled in a uniform manner, and the registration is obtained by minimizing
the pixel-wise difference between two multi-channel images. We developed a
flexible and general framework and implemented our approach inside that
framework. We also released our implementation as open-source C++ code. The
experiments show that our approach allows for an accurate registration of the
sensor data without requiring an explicit data association or model-specific
adaptations to datasets or sensors. Our approach exploits the different cues in
a natural and consistent way, and the registration can be done at frame rate for a typical range or imaging sensor.
Comment: 8 pages
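The following is a minimal sketch of the photometric objective described above, not the authors' released C++ framework: two multi-channel cue images (e.g. intensity, depth, normal components) are aligned by minimizing their pixel-wise squared difference, here over an assumed translation-only warp with an off-the-shelf optimizer.

```python
# Illustrative only: photometric alignment of two multi-channel images by
# minimizing the pixel-wise squared difference over a 2D translation.
# The 5-channel layout and the translation-only warp are assumptions.
import numpy as np
from scipy.ndimage import shift as nd_shift
from scipy.optimize import minimize

def photometric_error(params, ref, cur):
    """Sum of squared per-pixel differences after shifting `cur`."""
    dy, dx = params
    warped = np.stack([nd_shift(cur[..., c], (dy, dx), order=1)
                       for c in range(cur.shape[-1])], axis=-1)
    return np.sum((ref - warped) ** 2)

rng = np.random.default_rng(0)
ref = rng.random((64, 64, 5))                 # cues stacked per pixel
cur = np.stack([nd_shift(ref[..., c], (1.5, -2.0), order=1)
                for c in range(5)], axis=-1)  # displaced copy to align

result = minimize(photometric_error, x0=[0.0, 0.0], args=(ref, cur),
                  method="Nelder-Mead")
print("estimated shift:", result.x)           # approx (-1.5, 2.0)
```

The actual method optimizes a full 6-DoF sensor pose rather than a 2D image shift; the sketch only conveys how heterogeneous cues reduce to a single least-squares image objective.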
Joint Prediction of Depths, Normals and Surface Curvature from RGB Images using CNNs
Understanding the 3D structure of a scene is of vital importance when it
comes to developing fully autonomous robots. To this end, we present a novel
deep learning based framework that estimates depth, surface normals and surface
curvature using only a single RGB image. To the best of our knowledge, this
is the first work to estimate surface curvature from colour using a machine
learning approach. Additionally, we demonstrate that by tuning the network to
infer well-designed features, such as surface curvature, we can achieve improved performance at estimating depth and normals. This indicates that
network guidance is still a useful aspect of designing and training a neural
network. We run extensive experiments where the network is trained to infer
different tasks while the model capacity is kept constant, resulting in
different feature maps based on the tasks at hand. We outperform the previous state-of-the-art methods that jointly estimate depth and surface normals, while predicting surface curvature in parallel.
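As a rough illustration of the multi-task setup described above (the layer sizes, heads, and losses are assumptions, not the paper's architecture), a shared encoder can feed separate depth, normal, and curvature heads, with a summed loss letting the auxiliary curvature task guide the shared features:

```python
# Minimal multi-task sketch: one shared encoder, three per-pixel heads.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(              # shared features
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.depth_head = nn.Conv2d(64, 1, 1)      # per-pixel depth
        self.normal_head = nn.Conv2d(64, 3, 1)     # per-pixel unit normal
        self.curvature_head = nn.Conv2d(64, 1, 1)  # per-pixel curvature

    def forward(self, rgb):
        f = self.encoder(rgb)
        normals = nn.functional.normalize(self.normal_head(f), dim=1)
        return self.depth_head(f), normals, self.curvature_head(f)

net = MultiTaskNet()
rgb = torch.randn(2, 3, 64, 64)                    # dummy input batch
depth, normals, curvature = net(rgb)
# Summed loss: the curvature term acts as "network guidance" for the
# shared encoder (ground truths here are random placeholders).
loss = (nn.functional.l1_loss(depth, torch.randn_like(depth))
        + nn.functional.l1_loss(normals, torch.randn_like(normals))
        + nn.functional.l1_loss(curvature, torch.randn_like(curvature)))
loss.backward()
print(depth.shape, normals.shape, curvature.shape)
```

Because model capacity stays constant, adding or removing a head changes which features the shared encoder learns, which is the effect the experiments probe.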
Recovering 6D Object Pose: A Review and Multi-modal Analysis
A large number of studies analyse object detection and pose estimation at the visual level in 2D, discussing the effects of challenges such as occlusion, clutter, and texture on the performance of methods that work in the RGB modality. By additionally interpreting depth data, the study in this paper presents a thorough multi-modal analysis. It discusses the above-mentioned challenges for full 6D object pose estimation in RGB-D images, comparing the performance of several 6D detectors in order to answer the following
questions: Where does the computer vision community currently stand on maintaining "automation" in robotic manipulation? What next steps should the community take to improve "autonomy" in robotics while handling objects? Our
findings include: (i) reasonably accurate results are obtained on textured objects at varying viewpoints with cluttered backgrounds; (ii) heavy occlusion and clutter severely affect the detectors, and similar-looking distractors are the biggest challenge in recovering instances' 6D poses; (iii) template-based methods and random-forest-based learning algorithms underlie object detection and 6D pose estimation, while the recent paradigm is to learn deep discriminative feature representations and to adopt CNNs taking RGB images as input; (iv) given the availability of large-scale 6D-annotated depth datasets, feature representations can be learnt on these datasets, and the learnt representations can then be customized for the 6D problem.
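To make finding (iii) concrete, here is a toy, greatly simplified version of the template-matching paradigm the review refers to (all data is synthetic; real detectors such as LINEMOD match gradient and normal features, not raw windows): a stored view template is slid over the scene and scored by normalized cross-correlation, and the best-matching template indicates a coarse viewpoint hypothesis.

```python
# Toy template matching: exhaustive NCC search over a synthetic scene.
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a.ravel() @ b.ravel() / denom) if denom else 0.0

def match_template(scene, template):
    th, tw = template.shape
    scores = np.full((scene.shape[0] - th + 1, scene.shape[1] - tw + 1), -1.0)
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            scores[y, x] = ncc(scene[y:y + th, x:x + tw], template)
    return np.unravel_index(np.argmax(scores), scores.shape), scores.max()

rng = np.random.default_rng(1)
scene = rng.random((48, 48))
template = scene[10:26, 20:36].copy()     # plant the template in the scene
loc, score = match_template(scene, template)
print("best match at", loc, "score", round(score, 3))   # (10, 20), ~1.0
```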
3D Shape Perception from Monocular Vision, Touch, and Shape Priors
Perceiving accurate 3D object shape is important for robots to interact with
the physical world. Current research along this direction has primarily relied on visual observations. Vision, however useful, has inherent
limitations due to occlusions and the 2D-3D ambiguities, especially for
perception with a monocular camera. In contrast, touch gets precise local shape
information, though its efficiency for reconstructing the entire shape could be
low. In this paper, we propose a novel paradigm that efficiently perceives
accurate 3D object shape by incorporating visual and tactile observations, as
well as prior knowledge of common object shapes learned from large-scale shape
repositories. We use vision first, applying neural networks with learned shape
priors to predict an object's 3D shape from a single-view color image. We then
use tactile sensing to refine the shape; the robot actively touches the object
regions where the visual prediction has high uncertainty. Our method
efficiently builds the 3D shape of common objects from a color image and a
small number of tactile explorations (around 10). Our setup is easy to apply and has the potential to help robots better perform grasping or manipulation tasks on real-world objects.
Comment: IROS 2018. The first two authors contributed equally to this work.