18,298 research outputs found
Robust Intrinsic and Extrinsic Calibration of RGB-D Cameras
Color-depth cameras (RGB-D cameras) have become the primary sensors in most
robotics systems, from service robotics to industrial robotics applications.
Typical consumer-grade RGB-D cameras are provided with a coarse intrinsic and
extrinsic calibration that generally does not meet the accuracy requirements
needed by many robotics applications (e.g., highly accurate 3D environment
reconstruction and mapping, high precision object recognition and localization,
...). In this paper, we propose a human-friendly, reliable and accurate
calibration framework that enables to easily estimate both the intrinsic and
extrinsic parameters of a general color-depth sensor couple. Our approach is
based on a novel two components error model. This model unifies the error
sources of RGB-D pairs based on different technologies, such as
structured-light 3D cameras and time-of-flight cameras. Our method provides
some important advantages compared to other state-of-the-art systems: it is
general (i.e., well suited for different types of sensors), based on an easy
and stable calibration protocol, provides a greater calibration accuracy, and
has been implemented within the ROS robotics framework. We report detailed
experimental validations and performance comparisons to support our statements
Stereo and ToF Data Fusion by Learning from Synthetic Data
Time-of-Flight (ToF) sensors and stereo vision systems are both capable of acquiring depth information but they have complementary characteristics and issues. A more accurate representation of the scene geometry can be obtained by fusing the two depth sources. In this paper we present a novel framework for data fusion where the contribution of the two depth sources is controlled by confidence measures that are jointly estimated using a Convolutional Neural Network. The two depth sources are fused enforcing the local consistency of depth data, taking into account the estimated confidence information. The deep network is trained using a synthetic dataset and we show how the classifier is able to generalize to different data, obtaining reliable estimations not only on synthetic data but also on real world scenes. Experimental results show that the proposed approach increases the accuracy of the depth estimation on both synthetic and real data and that it is able to outperform state-of-the-art methods
Flight Dynamics-based Recovery of a UAV Trajectory using Ground Cameras
We propose a new method to estimate the 6-dof trajectory of a flying object
such as a quadrotor UAV within a 3D airspace monitored using multiple fixed
ground cameras. It is based on a new structure from motion formulation for the
3D reconstruction of a single moving point with known motion dynamics. Our main
contribution is a new bundle adjustment procedure which in addition to
optimizing the camera poses, regularizes the point trajectory using a prior
based on motion dynamics (or specifically flight dynamics). Furthermore, we can
infer the underlying control input sent to the UAV's autopilot that determined
its flight trajectory.
Our method requires neither perfect single-view tracking nor appearance
matching across views. For robustness, we allow the tracker to generate
multiple detections per frame in each video. The true detections and the data
association across videos is estimated using robust multi-view triangulation
and subsequently refined during our bundle adjustment procedure. Quantitative
evaluation on simulated data and experiments on real videos from indoor and
outdoor scenes demonstrates the effectiveness of our method
Real-time on-board obstacle avoidance for UAVs based on embedded stereo vision
In order to improve usability and safety, modern unmanned aerial vehicles
(UAVs) are equipped with sensors to monitor the environment, such as
laser-scanners and cameras. One important aspect in this monitoring process is
to detect obstacles in the flight path in order to avoid collisions. Since a
large number of consumer UAVs suffer from tight weight and power constraints,
our work focuses on obstacle avoidance based on a lightweight stereo camera
setup. We use disparity maps, which are computed from the camera images, to
locate obstacles and to automatically steer the UAV around them. For disparity
map computation we optimize the well-known semi-global matching (SGM) approach
for the deployment on an embedded FPGA. The disparity maps are then converted
into simpler representations, the so called U-/V-Maps, which are used for
obstacle detection. Obstacle avoidance is based on a reactive approach which
finds the shortest path around the obstacles as soon as they have a critical
distance to the UAV. One of the fundamental goals of our work was the reduction
of development costs by closing the gap between application development and
hardware optimization. Hence, we aimed at using high-level synthesis (HLS) for
porting our algorithms, which are written in C/C++, to the embedded FPGA. We
evaluated our implementation of the disparity estimation on the KITTI Stereo
2015 benchmark. The integrity of the overall realtime reactive obstacle
avoidance algorithm has been evaluated by using Hardware-in-the-Loop testing in
conjunction with two flight simulators.Comment: Accepted in the International Archives of the Photogrammetry, Remote
Sensing and Spatial Information Scienc
Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios
Event cameras are bio-inspired vision sensors that output pixel-level
brightness changes instead of standard intensity frames. These cameras do not
suffer from motion blur and have a very high dynamic range, which enables them
to provide reliable visual information during high speed motions or in scenes
characterized by high dynamic range. However, event cameras output only little
information when the amount of motion is limited, such as in the case of almost
still motion. Conversely, standard cameras provide instant and rich information
about the environment most of the time (in low-speed and good lighting
scenarios), but they fail severely in case of fast motions, or difficult
lighting such as high dynamic range or low light scenes. In this paper, we
present the first state estimation pipeline that leverages the complementary
advantages of these two sensors by fusing in a tightly-coupled manner events,
standard frames, and inertial measurements. We show on the publicly available
Event Camera Dataset that our hybrid pipeline leads to an accuracy improvement
of 130% over event-only pipelines, and 85% over standard-frames-only
visual-inertial systems, while still being computationally tractable.
Furthermore, we use our pipeline to demonstrate - to the best of our knowledge
- the first autonomous quadrotor flight using an event camera for state
estimation, unlocking flight scenarios that were not reachable with traditional
visual-inertial odometry, such as low-light environments and high-dynamic range
scenes.Comment: 8 pages, 9 figures, 2 table
Kinect Range Sensing: Structured-Light versus Time-of-Flight Kinect
Recently, the new Kinect One has been issued by Microsoft, providing the next
generation of real-time range sensing devices based on the Time-of-Flight (ToF)
principle. As the first Kinect version was using a structured light approach,
one would expect various differences in the characteristics of the range data
delivered by both devices. This paper presents a detailed and in-depth
comparison between both devices. In order to conduct the comparison, we propose
a framework of seven different experimental setups, which is a generic basis
for evaluating range cameras such as Kinect. The experiments have been designed
with the goal to capture individual effects of the Kinect devices as isolatedly
as possible and in a way, that they can also be adopted, in order to apply them
to any other range sensing device. The overall goal of this paper is to provide
a solid insight into the pros and cons of either device. Thus, scientists that
are interested in using Kinect range sensing cameras in their specific
application scenario can directly assess the expected, specific benefits and
potential problem of either device.Comment: 58 pages, 23 figures. Accepted for publication in Computer Vision and
Image Understanding (CVIU
- …