Probabilistic RGB-D Odometry based on Points, Lines and Planes Under Depth Uncertainty
This work proposes a robust visual odometry method for structured
environments that combines point features with line and plane segments,
extracted through an RGB-D camera. Noisy depth maps are processed by a
probabilistic depth fusion framework based on Mixtures of Gaussians to denoise
and derive the depth uncertainty, which is then propagated throughout the
visual odometry pipeline. Probabilistic 3D plane and line fitting solutions are
used to model the uncertainties of the feature parameters and pose is estimated
by combining the three types of primitives based on their uncertainties.
Performance evaluation on RGB-D sequences collected in this work and two public
RGB-D datasets (TUM and ICL-NUIM) shows the benefit of using the proposed depth
fusion framework and combining the three feature types, particularly in scenes
with low-textured surfaces, dynamic objects and missing depth measurements.
Comment: Major update: more results, depth filter released as open source, 34 page
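As a rough illustration of how depth uncertainty can be derived by fusing repeated measurements, the sketch below combines independent depth readings of one pixel by inverse-variance weighting. This is a deliberately simplified single-Gaussian stand-in, not the Mixture of Gaussians framework the abstract describes; the function name `fuse_depths` and its parameters are hypothetical.

```python
import math


def fuse_depths(depths, sigmas):
    """Fuse independent Gaussian depth measurements of a single pixel.

    Assumption (not from the paper): each measurement is modelled as one
    Gaussian with standard deviation sigma, and measurements are independent.
    Returns the fused depth and its standard deviation.
    """
    if len(depths) == 0 or len(depths) != len(sigmas):
        raise ValueError("depths and sigmas must be non-empty and equal in length")
    if any(s <= 0 for s in sigmas):
        raise ValueError("standard deviations must be positive")

    # Inverse-variance weights: more certain measurements count for more.
    weights = [1.0 / (s * s) for s in sigmas]
    fused_var = 1.0 / sum(weights)
    fused_mean = fused_var * sum(w * d for w, d in zip(weights, depths))
    return fused_mean, math.sqrt(fused_var)


if __name__ == "__main__":
    depth, sigma = fuse_depths([2.02, 1.98, 2.05], [0.02, 0.02, 0.10])
    print(f"fused depth = {depth:.4f} m, sigma = {sigma:.4f} m")
```

The fused standard deviation is what a downstream stage could propagate into feature fitting and pose estimation, so that noisy depth values contribute less than reliable ones.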
Robust Legged Robot State Estimation Using Factor Graph Optimization
Legged robots, specifically quadrupeds, are becoming increasingly attractive
for industrial applications such as inspection. However, to leave the
laboratory and to become useful to an end user requires reliability in harsh
conditions. From the perspective of state estimation, it is essential to be
able to accurately estimate the robot's state despite challenges such as uneven
or slippery terrain, textureless and reflective scenes, as well as dynamic
camera occlusions. We are motivated to reduce the dependency on foot contact
classifications, which fail when slipping, and to reduce position drift during
dynamic motions such as trotting. To this end, we present a factor graph
optimization method for state estimation which tightly fuses and smooths
inertial navigation, leg odometry and visual odometry. The effectiveness of the
approach is demonstrated using the ANYmal quadruped robot navigating in a
realistic outdoor industrial environment. This experiment included trotting,
walking, crossing obstacles and ascending a staircase. The proposed approach
decreased the relative position error by up to 55% and absolute position error
by 76% compared to kinematic-inertial odometry.
Comment: 8 pages, 12 figures. Accepted to RA-L + IROS 2019, July 201
A General Framework for Flexible Multi-Cue Photometric Point Cloud Registration
The ability to build maps is a key functionality for the majority of mobile
robots. A central ingredient to most mapping systems is the registration or
alignment of the recorded sensor data. In this paper, we present a general
methodology for photometric registration that can deal with multiple different
cues. We provide examples for registering RGB-D as well as 3D LIDAR data. In
contrast to popular point cloud registration approaches such as ICP, our method
does not rely on explicit data association and exploits multiple modalities
such as raw range and image data streams. Color, depth, and normal information
are handled in a uniform manner and the registration is obtained by minimizing
the pixel-wise difference between two multi-channel images. We developed a
flexible and general framework and implemented our approach inside that
framework. We also released our implementation as open source C++ code. The
experiments show that our approach allows for an accurate registration of the
sensor data without requiring an explicit data association or model-specific
adaptations to datasets or sensors. Our approach exploits the different cues in
a natural and consistent way and the registration can be done at framerate for
a typical range or imaging sensor.
Comment: 8 page
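The quantity being minimized, the pixel-wise difference between two multi-channel images, can be sketched as a weighted sum of squared differences. The code below is only an illustration of that cost under simple assumptions: images are nested lists indexed as `[row][column][channel]`, the second image has already been warped into the first one's frame, and the name `multichannel_cost` and the optional `channel_weights` are hypothetical rather than part of the released C++ implementation.

```python
def multichannel_cost(reference, warped, channel_weights=None):
    """Weighted sum of squared pixel-wise differences between two images.

    Each image is a nested list indexed as [row][column][channel], where the
    channels could hold, for example, intensity, depth and normal components.
    Assumption: both images have the same shape and are already aligned
    pixel to pixel by the current pose estimate.
    """
    if len(reference) != len(warped):
        raise ValueError("images must have the same number of rows")

    total = 0.0
    for ref_row, warp_row in zip(reference, warped):
        if len(ref_row) != len(warp_row):
            raise ValueError("images must have the same number of columns")
        for ref_px, warp_px in zip(ref_row, warp_row):
            # Default to equal weighting of all channels.
            weights = channel_weights or [1.0] * len(ref_px)
            for w, a, b in zip(weights, ref_px, warp_px):
                total += w * (a - b) ** 2
    return total


if __name__ == "__main__":
    ref = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
    cur = [[[0.1, 0.0, 0.2], [0.0, 0.3, 0.0]]]
    print(multichannel_cost(ref, cur))
```

A registration method of this kind would search for the pose that makes this cost smallest. Treating every cue as just another image channel is what lets color, depth and normals be handled uniformly.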
Depth sensors in augmented reality solutions. Literature review
The emergence of depth sensors has made it possible to track not only monocular
cues but also the actual depth values of the environment. This is especially
useful in augmented reality solutions, where the position and orientation (pose) of
the observer need to be accurately determined. This allows virtual objects to be
placed in the user's view through, for example, the screen of a tablet or augmented
reality glasses (e.g. Google Glass). Although early 3D sensors were
physically quite large, their size is decreasing, and eventually
a 3D sensor could possibly be embedded in, for example, augmented reality
glasses. The wider subject area considered in this review is 3D SLAM methods,
which take advantage of the 3D information available from modern RGB-D sensors,
such as Microsoft Kinect. A review of SLAM (Simultaneous Localization
and Mapping) and 3D tracking in augmented reality is therefore timely. We also try
to identify the limitations and possibilities of different tracking methods, and how
they should be improved, in order to allow efficient integration of the methods into
the augmented reality solutions of the future.
Transferred from Doria
Visual Perception For Robotic Spatial Understanding
Humans understand the world through vision without much effort. We perceive the structure, objects, and people in the environment and pay little direct attention to most of it, until it becomes useful. Intelligent systems, especially mobile robots, have no such biologically engineered vision mechanism to take for granted. In contrast, we must devise algorithmic methods of taking raw sensor data and converting it to something useful very quickly. Vision is such a necessary part of building a robot or any intelligent system that is meant to interact with the world that it is somewhat surprising we don't have off-the-shelf libraries for this capability.
Why is this? The simple answer is that the problem is extremely difficult. There has been progress, but the current state of the art is impressive and depressing at the same time. We now have neural networks that can recognize many objects in 2D images, in some cases performing better than a human. Some algorithms can also provide bounding boxes or pixel-level masks to localize the object. We have visual odometry and mapping algorithms that can build reasonably detailed maps over long distances with the right hardware and conditions. On the other hand, we have robots with many sensors and no efficient way to compute their relative extrinsic poses for integrating the data in a single frame. The same networks that produce good object segmentations and labels in a controlled benchmark still miss obvious objects in the real world and have no mechanism for learning on the fly while the robot is exploring. Finally, while we can detect pose for very specific objects, we don't yet have a mechanism that detects pose that generalizes well over categories or that can describe new objects efficiently.
We contribute algorithms in four of the areas mentioned above. First, we describe a practical and effective system for calibrating many sensors on a robot with up to 3 different modalities. Second, we present our approach to visual odometry and mapping that exploits the unique capabilities of RGB-D sensors to efficiently build detailed representations of an environment. Third, we describe a 3-D over-segmentation technique that utilizes the models and ego-motion output in the previous step to generate temporally consistent segmentations with camera motion. Finally, we develop a synthesized dataset of chair objects with part labels and investigate the influence of parts on RGB-D based object pose recognition using a novel network architecture we call PartNet
- …