Globally Consistent Indoor Mapping via a Decoupling Rotation and Translation Algorithm Applied to RGB-D Camera Output
This paper presents a novel RGB-D 3D reconstruction algorithm for indoor environments. The method can produce globally consistent 3D maps for potential GIS applications. Because consumer RGB-D cameras provide noisy depth images, the proposed algorithm decouples rotation and translation for more robust camera pose estimation, which makes full use of the available information while preventing inaccuracies caused by noisy depth measurements. The uncertainty in the image depth depends not only on the camera device but also on the environment; hence, a novel uncertainty model for depth measurements was developed using a Gaussian mixture applied to multi-windows. Plane features in the indoor environment contain valuable information about the global structure, which can guide the convergence of camera pose solutions, and both plane and feature point constraints are incorporated in the proposed optimization framework. The proposed method was validated using publicly available RGB-D benchmarks and obtained good-quality trajectories and 3D models, which are difficult for traditional 3D reconstruction algorithm
Non-iterative RGB-D-inertial Odometry
This paper presents a non-iterative solution to the RGB-D-inertial odometry
system. Traditional odometry methods resort to iterative algorithms, which are
usually computationally expensive or require well-designed initialization. To
overcome this problem, this paper proposes to combine a non-iterative front-end
(odometry) with an iterative back-end (loop closure) for the RGB-D-inertial
SLAM system. The main contribution lies in the novel non-iterative front-end,
which leverages inertial fusion and kernel cross-correlators (KCC) to match
point clouds in the frequency domain. Dominated by the fast Fourier transform
(FFT), our method has a complexity of only O(n log n), where n is
the number of points. Map fusion is conducted by element-wise operations, so
that both time and space complexity are further reduced. Extensive experiments
show that, owing to its lightweight front-end, the framework runs much faster
while retaining accuracy comparable to the state of the art
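To make the frequency-domain matching idea concrete, the following is a minimal sketch of FFT-based circular cross-correlation on a 1D signal. It is not the paper's kernel cross-correlator: the function names `fft` and `circular_shift_estimate` are hypothetical, the signals are toy data, and a plain (non-kernelized) correlation is assumed. It only illustrates why matching by correlation can be done non-iteratively with a cost dominated by the FFT.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def circular_shift_estimate(a, b):
    """Return the shift s that maximizes sum_i a[i] * b[(i + s) % n].

    Uses the correlation theorem for real signals:
    corr = IFFT(conj(FFT(a)) * FFT(b)).
    """
    n = len(a)
    fa, fb = fft(a), fft(b)
    spec = [u.conjugate() * v for u, v in zip(fa, fb)]
    # The inverse transform above is unnormalized, so divide by n here.
    corr = [v.real / n for v in fft(spec, inverse=True)]
    return max(range(n), key=lambda s: corr[s])

# Toy data (assumption): b is a circularly shifted to the right by 3 samples.
a = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
b = a[-3:] + a[:-3]
shift = circular_shift_estimate(a, b)
```

Here `shift` recovers the 3-sample offset in a single pass, with no initial guess and no iteration.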
Autonomous Navigation in Complex Indoor and Outdoor Environments with Micro Aerial Vehicles
Micro aerial vehicles (MAVs) are ideal platforms for surveillance and search and rescue in confined indoor and outdoor environments due to their small size, superior mobility, and hover capability. In such missions, it is essential that the MAV is capable of autonomous flight to minimize operator workload. Despite recent successes in commercialization of GPS-based autonomous MAVs, autonomous navigation in complex and possibly GPS-denied environments gives rise to challenging engineering problems that require an integrated approach to perception, estimation, planning, control, and high level situational awareness. Among these, state estimation is the first and most critical component for autonomous flight, especially because of the inherently fast dynamics of MAVs and the possibly unknown environmental conditions. In this thesis, we present methodologies and system designs, with a focus on state estimation, that enable a light-weight off-the-shelf quadrotor MAV to autonomously navigate complex unknown indoor and outdoor environments using only onboard sensing and computation. We start by developing laser and vision-based state estimation methodologies for indoor autonomous flight. We then investigate fusion from heterogeneous sensors to improve robustness and enable operations in complex indoor and outdoor environments. We further propose estimation algorithms for on-the-fly initialization and online failure recovery. Finally, we present planning, control, and environment coverage strategies for integrated high-level autonomy behaviors. Extensive online experimental results are presented throughout the thesis. We conclude by proposing future research opportunities
Exploiting Structural Regularities and Beyond: Vision-based Localization and Mapping in Man-Made Environments
Image-based estimation of camera motion, known as visual odometry
(VO), plays a very important role in many robotic applications
such as control and navigation of unmanned mobile robots,
especially when no external navigation reference signal is
available. The core problem of VO is the estimation of the
camera’s ego-motion (i.e. tracking) either between successive
frames, namely relative pose estimation, or with respect to a
global map, namely absolute pose estimation. This thesis aims to
develop efficient, accurate and robust VO solutions by taking
advantage of structural regularities in man-made environments,
such as piece-wise planar structures, Manhattan World and more
generally, contours and edges. Furthermore, to handle challenging
scenarios that are beyond the limits of classical sensor based VO
solutions, we investigate a recently emerging sensor, the
event camera, and study event-based mapping, one of the key
problems in event-based VO/SLAM. The main achievements are
summarized as follows.
First, we revisit an old topic on relative pose estimation:
accurately and robustly estimating the fundamental matrix given a
collection of independently estimated homographies. Three
classical methods are reviewed, and we then show that a simple but
nontrivial two-step normalization within the direct linear method
achieves performance similar to the less attractive and more
computationally intensive hallucinated-points-based method.
Second, an efficient 3D rotation estimation algorithm for depth
cameras in piece-wise planar environments is presented. We show
that, using surface normal vectors as input, planar modes in
the corresponding density distribution function can be discovered
and continuously tracked using efficient non-parametric estimation
techniques. The relative rotation can then be estimated by registering
entire bundles of planar modes using robust L1-norm minimization.
Third, an efficient alternative to the iterative closest point
algorithm for real-time tracking of modern depth cameras in
Manhattan Worlds is developed. We exploit the common orthogonal
structure of man-made environments in order to decouple the
estimation of the rotation and the three degrees of freedom of
the translation. The derived camera orientation is absolute and
thus free of long-term drift, which in turn benefits the accuracy
of the translation estimation as well.
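The benefit of decoupling can be illustrated with a minimal sketch: once the rotation R is known, the translation has a closed-form least-squares solution from matched 3D points. This is a generic stand-in, not the thesis's tracking method; the helper names `mat_vec` and `translation_given_rotation` and the assumption of known point correspondences are introduced purely for illustration.

```python
def mat_vec(R, p):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]

def translation_given_rotation(R, src, dst):
    """Least-squares t for dst_i ~ R * src_i + t with R held fixed.

    With R known, the minimizer of sum_i ||dst_i - R src_i - t||^2
    is simply the mean of the residuals dst_i - R src_i.
    """
    n = len(src)
    t = [0.0, 0.0, 0.0]
    for p, q in zip(src, dst):
        rp = mat_vec(R, p)
        for i in range(3):
            t[i] += (q[i] - rp[i]) / n
    return t

# Toy data (assumption): a 90-degree rotation about z and a known translation.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t_true = [1.0, 2.0, 3.0]
src = [[1.0, 0.0, 0.0], [0.0, 2.0, 1.0], [3.0, 1.0, -1.0]]
dst = [[a + b for a, b in zip(mat_vec(R, p), t_true)] for p in src]
t_est = translation_given_rotation(R, src, dst)
```

Because the rotation is fixed, the remaining problem is linear in the three translation unknowns, so any error in the rotation estimate propagates directly into `t_est`. This is why a drift-free orientation helps the translation estimate.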
Fourth, we look into a more general structural
regularity—edges. A real-time VO system that uses Canny edges
is proposed for RGB-D cameras. Two novel alternatives to
classical distance transforms are developed; they significantly
improve on classical Euclidean distance field based methods in
terms of efficiency, accuracy and robustness.
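For context, the classical baseline mentioned above can be sketched as follows: a Euclidean distance field is computed from the edge pixels of a reference frame, and a candidate alignment is scored by the mean field value at the projected edge points. This is a brute-force illustration of the classical approach only, not the novel alternatives proposed in the thesis; `distance_field` and `alignment_cost` are hypothetical names and the grid is toy data.

```python
import math

def distance_field(edges, height, width):
    """Brute-force Euclidean distance from every pixel to its nearest edge pixel."""
    return [
        [min(math.hypot(r - er, c - ec) for er, ec in edges) for c in range(width)]
        for r in range(height)
    ]

def alignment_cost(field, points):
    """Mean distance-field value at the (row, col) positions of projected edge points."""
    return sum(field[r][c] for r, c in points) / len(points)

# Toy data (assumption): two reference edge pixels on a 4x4 grid.
edges = [(1, 1), (1, 2)]
field = distance_field(edges, 4, 4)

cost_aligned = alignment_cost(field, edges)            # points land on the edges
cost_shifted = alignment_cost(field, [(2, 1), (2, 2)])  # points off by one row
```

A pose estimator would search for the camera motion that minimizes this cost; the brute-force field here scales poorly with image size, which is the kind of inefficiency faster distance-transform variants aim to address.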
Finally, to deal with challenging scenarios that go beyond what
standard RGB/RGB-D cameras can handle, we investigate the
recently emerging event camera and focus on the problem of 3D
reconstruction from data captured by a stereo event-camera rig
moving in a static scene, such as in the context of stereo
Simultaneous Localization and Mapping
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild
Estimating the pose of a moving camera from monocular video is a challenging
problem, especially due to the presence of moving objects in dynamic
environments, where the performance of existing camera pose estimation methods
is susceptible to pixels that are not geometrically consistent. To tackle this
challenge, we present a robust dense indirect structure-from-motion method for
videos that is based on dense correspondence initialized from pairwise optical
flow. Our key idea is to optimize long-range video correspondence as dense
point trajectories and use it to learn robust estimation of motion
segmentation. A novel neural network architecture is proposed for processing
irregular point trajectory data. Camera poses are then estimated and optimized
with global bundle adjustment over the portion of long-range point trajectories
that are classified as static. Experiments on the MPI Sintel dataset show that our
system produces significantly more accurate camera trajectories compared to
existing state-of-the-art methods. In addition, our method is able to retain
reasonable accuracy of camera poses on fully static scenes, which consistently
outperforms strong state-of-the-art dense correspondence based methods with
end-to-end deep learning, demonstrating the potential of dense indirect methods
based on optical flow and point trajectories. As the point trajectory
representation is general, we further present results and comparisons on
in-the-wild monocular videos with complex motion of dynamic objects. Code is
available at https://github.com/bytedance/particle-sfm. Comment: ECCV 2022. Project page: http://b1ueber2y.me/projects/ParticleSfM