28,448 research outputs found

    Linear Global Translation Estimation with Feature Tracks

    Full text link
    This paper derives a novel linear position constraint for cameras seeing a common scene point, which leads to a direct linear method for global camera translation estimation. Unlike previous solutions, this method deals with collinear camera motion and weak image association at the same time. The final linear formulation does not involve the coordinates of scene points, which makes it efficient even for large scale data. We solve the linear equation based on L1L_1 norm, which makes our system more robust to outliers in essential matrices and feature correspondences. We experiment this method on both sequentially captured images and unordered Internet images. The experiments demonstrate its strength in robustness, accuracy, and efficiency.Comment: Changes: 1. Adopt BMVC2015 style; 2. Combine sections 3 and 5; 3. Move "Evaluation on synthetic data" out to supplementary file; 4. Divide subsection "Evaluation on general data" to subsections "Experiment on sequential data" and "Experiment on unordered Internet data"; 5. Change Fig. 1 and Fig.8; 6. Move Fig. 6 and Fig. 7 to supplementary file; 7 Change some symbols; 8. Correct some typo

    Learning Pose Estimation for UAV Autonomous Navigation and Landing Using Visual-Inertial Sensor Data

    Get PDF
    In this work, we propose a robust network-in-the-loop control system for autonomous navigation and landing of an Unmanned-Aerial-Vehicle (UAV). To estimate the UAV’s absolute pose, we develop a deep neural network (DNN) architecture for visual-inertial odometry, which provides a robust alternative to traditional methods. We first evaluate the accuracy of the estimation by comparing the prediction of our model to traditional visual-inertial approaches on the publicly available EuRoC MAV dataset. The results indicate a clear improvement in the accuracy of the pose estimation up to 25% over the baseline. Finally, we integrate the data-driven estimator in the closed-loop flight control system of Airsim, a simulator available as a plugin for Unreal Engine, and we provide simulation results for autonomous navigation and landing

    In-loop Feature Tracking for Structure and Motion with Out-of-core Optimization

    Get PDF
    In this paper, a novel and approach for obtaining 3D models from video sequences captured with hand-held cameras is addressed. We define a pipeline that robustly deals with different types of sequences and acquiring devices. Our system follows a divide and conquer approach: after a frame decimation that pre-conditions the input sequence, the video is split into short-length clips. This allows to parallelize the reconstruction step which translates into a reduction in the amount of computational resources required. The short length of the clips allows an intensive search for the best solution at each step of reconstruction which robustifies the system. The process of feature tracking is embedded within the reconstruction loop for each clip as opposed to other approaches. A final registration step, merges all the processed clips to the same coordinate fram

    GSLAM: Initialization-robust Monocular Visual SLAM via Global Structure-from-Motion

    Full text link
    Many monocular visual SLAM algorithms are derived from incremental structure-from-motion (SfM) methods. This work proposes a novel monocular SLAM method which integrates recent advances made in global SfM. In particular, we present two main contributions to visual SLAM. First, we solve the visual odometry problem by a novel rank-1 matrix factorization technique which is more robust to the errors in map initialization. Second, we adopt a recent global SfM method for the pose-graph optimization, which leads to a multi-stage linear formulation and enables L1 optimization for better robustness to false loops. The combination of these two approaches generates more robust reconstruction and is significantly faster (4X) than recent state-of-the-art SLAM systems. We also present a new dataset recorded with ground truth camera motion in a Vicon motion capture room, and compare our method to prior systems on it and established benchmark datasets.Comment: 3DV 2017 Project Page: https://frobelbest.github.io/gsla

    Attention and Anticipation in Fast Visual-Inertial Navigation

    Get PDF
    We study a Visual-Inertial Navigation (VIN) problem in which a robot needs to estimate its state using an on-board camera and an inertial sensor, without any prior knowledge of the external environment. We consider the case in which the robot can allocate limited resources to VIN, due to tight computational constraints. Therefore, we answer the following question: under limited resources, what are the most relevant visual cues to maximize the performance of visual-inertial navigation? Our approach has four key ingredients. First, it is task-driven, in that the selection of the visual cues is guided by a metric quantifying the VIN performance. Second, it exploits the notion of anticipation, since it uses a simplified model for forward-simulation of robot dynamics, predicting the utility of a set of visual cues over a future time horizon. Third, it is efficient and easy to implement, since it leads to a greedy algorithm for the selection of the most relevant visual cues. Fourth, it provides formal performance guarantees: we leverage submodularity to prove that the greedy selection cannot be far from the optimal (combinatorial) selection. Simulations and real experiments on agile drones show that our approach ensures state-of-the-art VIN performance while maintaining a lean processing time. In the easy scenarios, our approach outperforms appearance-based feature selection in terms of localization errors. In the most challenging scenarios, it enables accurate visual-inertial navigation while appearance-based feature selection fails to track robot's motion during aggressive maneuvers.Comment: 20 pages, 7 figures, 2 table

    Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios

    Full text link
    Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. These cameras do not suffer from motion blur and have a very high dynamic range, which enables them to provide reliable visual information during high speed motions or in scenes characterized by high dynamic range. However, event cameras output only little information when the amount of motion is limited, such as in the case of almost still motion. Conversely, standard cameras provide instant and rich information about the environment most of the time (in low-speed and good lighting scenarios), but they fail severely in case of fast motions, or difficult lighting such as high dynamic range or low light scenes. In this paper, we present the first state estimation pipeline that leverages the complementary advantages of these two sensors by fusing in a tightly-coupled manner events, standard frames, and inertial measurements. We show on the publicly available Event Camera Dataset that our hybrid pipeline leads to an accuracy improvement of 130% over event-only pipelines, and 85% over standard-frames-only visual-inertial systems, while still being computationally tractable. Furthermore, we use our pipeline to demonstrate - to the best of our knowledge - the first autonomous quadrotor flight using an event camera for state estimation, unlocking flight scenarios that were not reachable with traditional visual-inertial odometry, such as low-light environments and high-dynamic range scenes.Comment: 8 pages, 9 figures, 2 table
    • …
    corecore