    Globally Consistent Indoor Mapping via a Decoupling Rotation and Translation Algorithm Applied to RGB-D Camera Output

    This paper presents a novel RGB-D 3D reconstruction algorithm for indoor environments. The method produces globally consistent 3D maps for potential GIS applications. Because consumer RGB-D cameras provide noisy depth images, the proposed algorithm decouples rotation and translation for more robust camera pose estimation, which not only makes full use of the available information but also prevents inaccuracies caused by noisy depth measurements. The uncertainty in image depth is related not only to the camera device but also to the environment; hence, a novel uncertainty model for depth measurements was developed using a Gaussian mixture applied to multiple windows. Plane features in indoor environments carry valuable information about the global structure and can guide the convergence of camera pose solutions, so both plane and feature-point constraints are incorporated into the proposed optimization framework. The proposed method was validated on publicly available RGB-D benchmarks and produced high-quality trajectories and 3D models that are difficult for traditional 3D reconstruction algorithms.
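    The decoupling idea can be illustrated with the classical Kabsch alignment, where the rotation is solved independently of the translation by first removing the point-set centroids. This is a minimal sketch of the general decoupling principle only, not the paper's specific algorithm; the function and variable names are illustrative.

        import numpy as np

        def decoupled_pose(P, Q):
            # Align source points P to target points Q (both Nx3), solving
            # rotation and translation separately (Kabsch method).
            cp, cq = P.mean(axis=0), Q.mean(axis=0)
            H = (P - cp).T @ (Q - cq)                # 3x3 cross-covariance
            U, _, Vt = np.linalg.svd(H)
            d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
            R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # rotation from centered points only
            t = cq - R @ cp                          # translation recovered afterwards
            return R, t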

    Non-iterative RGB-D-inertial Odometry

    This paper presents a non-iterative solution for an RGB-D-inertial odometry system. Traditional odometry methods resort to iterative algorithms, which are usually computationally expensive or require well-designed initialization. To overcome this problem, this paper proposes to combine a non-iterative front-end (odometry) with an iterative back-end (loop closure) for an RGB-D-inertial SLAM system. The main contribution lies in the novel non-iterative front-end, which leverages inertial fusion and kernel cross-correlators (KCC) to match point clouds in the frequency domain. Dominated by the fast Fourier transform (FFT), the method has complexity of only $\mathcal{O}(n\log n)$, where $n$ is the number of points. Map fusion is conducted with element-wise operations, so both time and space complexity are further reduced. Extensive experiments show that, owing to the lightweight front-end, the framework runs at a much faster speed while achieving accuracy comparable to state-of-the-art methods.
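    The core of the frequency-domain matching can be sketched with a toy 1-D example: a circular cross-correlation computed via FFT recovers the shift between two signals in $\mathcal{O}(n\log n)$, since correlation becomes an element-wise product in the frequency domain. This illustrates the principle only, not the paper's kernelized point-cloud formulation.

        import numpy as np

        def fft_cross_correlate(a, b):
            # Circular cross-correlation of two equal-length real signals via
            # FFT: multiply one spectrum by the conjugate of the other.
            A = np.fft.rfft(a)
            B = np.fft.rfft(b)
            return np.fft.irfft(A * np.conj(B), n=len(a))

        # Recover the circular shift between a signal and its shifted copy.
        x = np.random.default_rng(0).standard_normal(1024)
        y = np.roll(x, 37)
        print(np.argmax(fft_cross_correlate(y, x)))  # prints 37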

    Autonomous Navigation in Complex Indoor and Outdoor Environments with Micro Aerial Vehicles

    Micro aerial vehicles (MAVs) are ideal platforms for surveillance and search-and-rescue in confined indoor and outdoor environments due to their small size, superior mobility, and hover capability. In such missions, it is essential that the MAV be capable of autonomous flight to minimize operator workload. Despite recent successes in the commercialization of GPS-based autonomous MAVs, autonomous navigation in complex and possibly GPS-denied environments gives rise to challenging engineering problems that require an integrated approach to perception, estimation, planning, control, and high-level situational awareness. Among these, state estimation is the first and most critical component for autonomous flight, especially because of the inherently fast dynamics of MAVs and possibly unknown environmental conditions. In this thesis, we present methodologies and system designs, with a focus on state estimation, that enable a lightweight off-the-shelf quadrotor MAV to autonomously navigate complex unknown indoor and outdoor environments using only onboard sensing and computation. We start by developing laser- and vision-based state estimation methodologies for indoor autonomous flight. We then investigate fusion of heterogeneous sensors to improve robustness and enable operation in complex indoor and outdoor environments. We further propose estimation algorithms for on-the-fly initialization and online failure recovery. Finally, we present planning, control, and environment coverage strategies for integrated high-level autonomy behaviors. Extensive online experimental results are presented throughout the thesis. We conclude by proposing future research opportunities.
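    As a toy illustration of fusing heterogeneous sensors, a 1-D complementary filter blends a fast but drifting rate sensor with a slow but drift-free absolute measurement. This is only a sketch of the fusion principle; the thesis develops full filter- and optimization-based estimators, not this simplified form.

        import numpy as np

        def complementary_fuse(rates, abs_meas, dt, alpha=0.98):
            # Blend integration of a fast, drifting rate sensor (e.g. a gyro)
            # with a slow, drift-free absolute measurement (e.g. vision).
            # alpha close to 1 trusts the fast sensor over short horizons.
            est = abs_meas[0]
            fused = []
            for rate, meas in zip(rates, abs_meas):
                pred = est + rate * dt                   # propagate with the fast sensor
                est = alpha * pred + (1 - alpha) * meas  # correct with the slow one
                fused.append(est)
            return np.array(fused)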

    Exploiting Structural Regularities and Beyond: Vision-based Localization and Mapping in Man-Made Environments

    Image-based estimation of camera motion, known as visual odometry (VO), plays a very important role in many robotic applications, such as control and navigation of unmanned mobile robots, especially when no external navigation reference signal is available. The core problem of VO is the estimation of the camera's ego-motion (i.e. tracking), either between successive frames, namely relative pose estimation, or with respect to a global map, namely absolute pose estimation. This thesis aims to develop efficient, accurate and robust VO solutions by taking advantage of structural regularities in man-made environments, such as piece-wise planar structures, the Manhattan World and, more generally, contours and edges. Furthermore, to handle challenging scenarios beyond the limits of classical sensor-based VO solutions, we investigate a recently emerging sensor, the event camera, and study event-based mapping, one of the key problems in event-based VO/SLAM. The main achievements are summarized as follows.

    First, we revisit an old topic in relative pose estimation: accurately and robustly estimating the fundamental matrix given a collection of independently estimated homographies. Three classical methods are reviewed, and we then show that a simple but nontrivial two-step normalization within the direct linear method achieves performance similar to the less attractive and more computationally intensive hallucinated-points-based method. Second, an efficient 3D rotation estimation algorithm for depth cameras in piece-wise planar environments is presented. Using surface normal vectors as input, planar modes in the corresponding density distribution function can be discovered and continuously tracked with efficient non-parametric estimation techniques, and the relative rotation can be estimated by registering entire bundles of planar modes via robust L1-norm minimization. Third, an efficient alternative to the iterative closest point algorithm for real-time tracking of modern depth cameras in Manhattan Worlds is developed. We exploit the common orthogonal structure of man-made environments to decouple the estimation of the rotation from the three degrees of freedom of the translation; the derived camera orientation is absolute and thus free of long-term drift, which in turn benefits the accuracy of the translation estimate as well. Fourth, we look into a more general structural regularity: edges. A real-time VO system that uses Canny edges is proposed for RGB-D cameras, together with two novel alternatives to classical distance transforms that significantly improve on classical Euclidean distance field based methods in terms of efficiency, accuracy and robustness. Finally, to deal with challenging scenarios beyond what standard RGB/RGB-D cameras can handle, we investigate the recently emerging event camera and focus on the problem of 3D reconstruction from data captured by a stereo event-camera rig moving in a static scene, as in the context of stereo Simultaneous Localization and Mapping.
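    The distance-field idea underlying the edge-based VO can be sketched as follows: a Euclidean distance transform of a binary Canny edge map turns edge alignment into cheap per-pixel lookups. This sketches the classical Euclidean-distance-field baseline that the thesis improves upon, not the proposed alternatives themselves.

        import numpy as np
        from scipy.ndimage import distance_transform_edt

        def edge_distance_field(edge_map):
            # edge_map: boolean H x W array, True at Canny edge pixels. Each
            # output pixel stores the distance to the nearest edge pixel.
            return distance_transform_edt(~edge_map)

        def edge_alignment_cost(dist_field, pts):
            # pts: N x 2 array of projected (row, col) positions. Nearest-pixel
            # lookup keeps the sketch short; real systems interpolate the field
            # and apply robust losses.
            r = np.clip(np.round(pts[:, 0]).astype(int), 0, dist_field.shape[0] - 1)
            c = np.clip(np.round(pts[:, 1]).astype(int), 0, dist_field.shape[1] - 1)
            return float(dist_field[r, c].sum())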

    ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild

    Estimating the pose of a moving camera from monocular video is a challenging problem, especially in dynamic environments with moving objects, where existing camera pose estimation methods are susceptible to pixels that are not geometrically consistent. To tackle this challenge, we present a robust dense indirect structure-from-motion method for videos based on dense correspondences initialized from pairwise optical flow. Our key idea is to optimize long-range video correspondences as dense point trajectories and use them to learn robust motion segmentation. A novel neural network architecture is proposed for processing irregular point trajectory data. Camera poses are then estimated and optimized with global bundle adjustment over the portion of long-range point trajectories classified as static. Experiments on the MPI Sintel dataset show that our system produces significantly more accurate camera trajectories than existing state-of-the-art methods. In addition, our method retains reasonable camera pose accuracy on fully static scenes and consistently outperforms strong state-of-the-art dense-correspondence methods based on end-to-end deep learning, demonstrating the potential of dense indirect methods built on optical flow and point trajectories. As the point trajectory representation is general, we further present results and comparisons on in-the-wild monocular videos with complex motion of dynamic objects. Code is available at https://github.com/bytedance/particle-sfm. (ECCV 2022. Project page: http://b1ueber2y.me/projects/ParticleSfM)
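    The trajectory-building step can be sketched as chaining pairwise flow fields: each tracked point is advected by the flow sampled at its current position. This is a deliberately simplified, hypothetical rendering of the idea; ParticleSfM's actual trajectory module additionally handles occlusion checks and per-frame re-seeding.

        import numpy as np

        def chain_flow_trajectories(flows, seeds):
            # flows: list of (H, W, 2) arrays; flows[t][y, x] is the (dx, dy)
            # motion of pixel (x, y) from frame t to frame t+1.
            # seeds: (N, 2) float array of (x, y) positions in frame 0.
            H, W, _ = flows[0].shape
            pts = seeds.astype(float).copy()
            traj = [pts.copy()]
            for flow in flows:
                xi = np.clip(np.round(pts[:, 0]).astype(int), 0, W - 1)
                yi = np.clip(np.round(pts[:, 1]).astype(int), 0, H - 1)
                pts = pts + flow[yi, xi]   # advect each point by its sampled flow
                traj.append(pts.copy())
            return np.stack(traj)          # (T+1, N, 2) trajectories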