98 research outputs found

    Tight Fusion of Events and Inertial Measurements for Direct Velocity Estimation

    Traditional visual-inertial state estimation targets absolute camera poses and spatial landmark locations while first-order kinematics are typically resolved as an implicitly estimated sub-state. However, this poses a risk in velocity-based control scenarios, as the quality of the estimation of kinematics depends on the stability of absolute camera and landmark coordinates estimation. To address this issue, we propose a novel solution to tight visual-inertial fusion directly at the level of first-order kinematics by employing a dynamic vision sensor instead of a normal camera. More specifically, we leverage trifocal tensor geometry to establish an incidence relation that directly depends on events and camera velocity, and demonstrate how velocity estimates in highly dynamic situations can be obtained over short time intervals. Noise and outliers are dealt with using a nested two-layer RANSAC scheme. Additionally, smooth velocity signals are obtained from a tight fusion with pre-integrated inertial signals using a sliding window optimizer. Experiments on both simulated and real data demonstrate that the proposed tight event-inertial fusion leads to continuous and reliable velocity estimation in highly dynamic scenarios independently of absolute coordinates. Furthermore, in extreme cases, it achieves more stable and more accurate estimation of kinematics than traditional, point-position-based visual-inertial odometry. Comment: Accepted by IEEE Transactions on Robotics (T-RO).
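    The abstract does not spell out the nested two-layer RANSAC scheme. As a rough illustration of the basic single-layer idea only, the sketch below fits a 2D line to points contaminated by outliers. The function name `ransac_line`, the threshold, the iteration count, and the toy data are all assumptions made for illustration; this is not the authors' method.

```python
import math
import random


def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    """Fit a line a*x + b*y + c = 0 (with a^2 + b^2 = 1) by plain RANSAC.

    Hypothetical helper: single-layer RANSAC on a toy problem, not the
    nested two-layer scheme mentioned in the abstract.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        # Minimal sample: two distinct points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0.0:
            continue  # degenerate sample (coincident points)
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Consensus set: points whose distance to the line is small.
        inliers = [p for p in points if abs(a * p[0] + b * p[1] + c) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b, c), inliers
    return best_model, best_inliers


# Toy data (assumed): 20 points on y = 2x + 1 plus 5 gross outliers.
line_points = [(0.5 * i, 2.0 * (0.5 * i) + 1.0) for i in range(20)]
outliers = [(1.0, 9.0), (3.0, -4.0), (5.0, 20.0), (7.0, 2.0), (8.0, 30.0)]
model, inliers = ransac_line(line_points + outliers)
```

    The model with the largest consensus set wins, so gross outliers never influence the returned line parameters. A nested scheme would presumably run a second such loop inside or around this one, but the abstract does not give the details.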

    Structure from Motion with Higher-level Environment Representations

    Computer vision is an important area focused on understanding, extracting, and using information from vision-based sensors. It has many applications, such as vision-based 3D reconstruction, simultaneous localization and mapping (SLAM), and data-driven understanding of the real world. Vision is a fundamental sensing modality in many different fields of application. While traditional structure from motion mostly uses sparse point-based features, this thesis explores the possibility of using higher-order feature representations. It starts with joint work that uses straight lines as the feature representation and performs bundle adjustment with a straight-line parameterization. We then try an even higher-order representation, using Bézier splines for the parameterization. We start with a simple case in which all contours lie on a plane, use Bézier splines to parametrize the curves in the background, and optimize over both camera position and Bézier splines. As an application, we present a complete end-to-end pipeline which produces meaningful dense 3D models from natural data of a 3D object: the target object is placed on a structured but unknown planar background that is modeled with splines. The data is captured using only a hand-held monocular camera. However, this application is limited to a planar scenario, so we go on to push the parameterization into real 3D. Following the potential of this idea, we introduce a more flexible higher-order extension of points that provides a general model for structural edges in the environment, whether straight or curved. Our model relies on linked Bézier curves, whose geometric intuition proves greatly beneficial during parameter initialization and regularization. We present the first fully automatic pipeline that is able to generate spline-based representations without any human supervision. Besides a full graphical formulation of the problem, we introduce both geometric and photometric cues as well as higher-level concepts such as overall curve visibility and viewing angle restrictions to automatically manage the correspondences in the graph. Results prove that curve-based structure from motion with splines is able to outperform state-of-the-art sparse feature-based methods, as well as to model curved edges in the environment.
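    To make the Bézier parameterization mentioned above concrete, here is a minimal sketch of evaluating a single Bézier segment with de Casteljau's algorithm. The function name `de_casteljau` and the control points are assumptions for illustration; the thesis' own parameterization and optimization are not reproduced here.

```python
def de_casteljau(control_points, t):
    """Evaluate a Bézier curve at parameter t in [0, 1] by repeated
    linear interpolation between consecutive control points."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


# Hypothetical cubic segment in 3D (four control points).
segment = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (3.0, 2.0, 1.0), (4.0, 0.0, 1.0)]
start = de_casteljau(segment, 0.0)  # coincides with the first control point
mid = de_casteljau(segment, 0.5)
end = de_casteljau(segment, 1.0)    # coincides with the last control point
```

    Because a segment starts and ends exactly at its first and last control points, consecutive segments can be linked simply by sharing an endpoint, which is one plausible reading of the "linked Bézier curves" in the abstract.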

    Windowed Factorization and Merging

    In this work, an online 3D reconstruction algorithm is proposed which attempts to solve the structure from motion problem for occluded and degenerate data. To deal with occlusion, the temporal consistency of data within a limited window is used to compute local reconstructions. These local reconstructions are transformed and merged to obtain an estimate of the 3D object shape. The algorithm is shown to accurately reconstruct a rotating and translating artificial sphere and a rotating toy dinosaur from a video. The proposed algorithm (WIFAME) provides a versatile framework to deal with missing data in the structure from motion problem.

    Super-resolution of 3-dimensional scenes

    Super-resolution is an image enhancement method that increases the resolution of images and video. Previously this technique could only be applied to 2D scenes. The super-resolution algorithm developed in this thesis creates high-resolution views of 3-dimensional scenes, using low-resolution images captured from varying, unknown positions

    Structureless Camera Motion Estimation of Unordered Omnidirectional Images

    This work aims at providing a novel camera motion estimation pipeline from large collections of unordered omnidirectional images. In order to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, which allows any central camera type to be incorporated. For each camera, an unprojection lookup is generated from the intrinsics; it is called a P2S-map (Pixel-to-Sphere-map) and maps pixels to their corresponding positions on the unit sphere. Consequently, the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections with fewer distortion effects, as known from cartography. Using P2S-maps from camera calibration and world map projection makes it possible to convert omnidirectional camera images to an appropriate world map projection in order to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion), which handles unordered image collections, with the efficiency of PGO (Pose Graph Optimization), which is used as the back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3D feature locations (structure), which becomes computationally expensive for large-scale scenarios. In contrast, PGO solves for camera poses (motion) from measured transformations between cameras, which keeps the optimization manageable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from scaled two-view transformations and solved by PGO to obtain camera motion efficiently even for large image collections.
Obtained results can be used as input data to provide initial pose estimates for further 3D reconstruction purposes, e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors. The entire camera motion estimation pipeline does not need to generate a sparse 3D structure of the captured environment and thus is called SCME (Structureless Camera Motion Estimation).
    Contents:
    1 Introduction: 1.1 Motivation; 1.1.1 Increasing Interest of Image-Based 3D Reconstruction; 1.1.2 Underground Environments as Challenging Scenario; 1.1.3 Improved Mobile Camera Systems for Full Omnidirectional Imaging; 1.2 Issues; 1.2.1 Directional versus Omnidirectional Image Acquisition; 1.2.2 Structure from Motion versus Visual Simultaneous Localization and Mapping; 1.3 Contribution; 1.4 Structure of this Work
    2 Related Work: 2.1 Visual Simultaneous Localization and Mapping; 2.1.1 Visual Odometry; 2.1.2 Pose Graph Optimization; 2.2 Structure from Motion; 2.2.1 Bundle Adjustment; 2.2.2 Structureless Bundle Adjustment; 2.3 Corresponding Issues; 2.4 Proposed Reconstruction Pipeline
    3 Cameras and Pixel-to-Sphere Mappings with P2S-Maps: 3.1 Types; 3.2 Models; 3.2.1 Unified Camera Model; 3.2.2 Polynomial Camera Model; 3.2.3 Spherical Camera Model; 3.3 P2S-Maps - Mapping onto Unit Sphere via Lookup Table; 3.3.1 Lookup Table as Color Image; 3.3.2 Lookup Interpolation; 3.3.3 Depth Data Conversion
    4 Calibration: 4.1 Overview of Proposed Calibration Pipeline; 4.2 Target Detection; 4.3 Intrinsic Calibration; 4.3.1 Selected Examples; 4.4 Extrinsic Calibration; 4.4.1 3D-2D Pose Estimation; 4.4.2 2D-2D Pose Estimation; 4.4.3 Pose Optimization; 4.4.4 Uncertainty Estimation; 4.4.5 PoseGraph Representation; 4.4.6 Bundle Adjustment; 4.4.7 Selected Examples
    5 Full Omnidirectional Image Projections: 5.1 Panoramic Image Stitching; 5.2 World Map Projections; 5.3 World Map Projection Generator for P2S-Maps; 5.4 Conversion between Projections based on P2S-Maps; 5.4.1 Proposed Workflow; 5.4.2 Data Storage Format; 5.4.3 Real World Example
    6 Relations between Two Camera Spheres: 6.1 Forward and Backward Projection; 6.2 Triangulation; 6.2.1 Linear Least Squares Method; 6.2.2 Alternative Midpoint Method; 6.3 Epipolar Geometry; 6.4 Transformation Recovery from Essential Matrix; 6.4.1 Cheirality; 6.4.2 Standard Procedure; 6.4.3 Simplified Procedure; 6.4.4 Improved Procedure; 6.5 Two-View Estimation; 6.5.1 Evaluation Strategy; 6.5.2 Error Metric; 6.5.3 Evaluation of Estimation Algorithms; 6.5.4 Concluding Remarks; 6.6 Two-View Optimization; 6.6.1 Epipolar-Based Error Distances; 6.6.2 Projection-Based Error Distances; 6.6.3 Comparison between Error Distances; 6.7 Two-View Translation Scaling; 6.7.1 Linear Least Squares Estimation; 6.7.2 Non-Linear Least Squares Optimization; 6.7.3 Comparison between Initial and Optimized Scaling Factor; 6.8 Homography to Identify Degeneracies; 6.8.1 Homography for Spherical Cameras; 6.8.2 Homography Estimation; 6.8.3 Homography Optimization; 6.8.4 Homography and Pure Rotation; 6.8.5 Homography in Epipolar Geometry
    7 Relations between Three Camera Spheres: 7.1 Three View Geometry; 7.2 Crossing Epipolar Planes Geometry; 7.3 Trifocal Geometry; 7.4 Relation between Trifocal, Three-View and Crossing Epipolar Planes; 7.5 Translation Ratio between Up-To-Scale Two-View Transformations; 7.5.1 Structureless Determination Approaches; 7.5.2 Structure-Based Determination Approaches; 7.5.3 Comparison between Proposed Approaches
    8 Pose Graphs: 8.1 Optimization Principle; 8.2 Solvers; 8.2.1 Additional Graph Solvers; 8.2.2 False Loop Closure Detection; 8.3 Pose Graph Generation; 8.3.1 Generation of Synthetic Pose Graph Data; 8.3.2 Optimization of Synthetic Pose Graph Data
    9 Structureless Camera Motion Estimation: 9.1 SCME Pipeline; 9.2 Determination of Two-View Translation Scale Factors; 9.3 Integration of Depth Data; 9.4 Integration of Extrinsic Camera Constraints
    10 Camera Motion Estimation Results: 10.1 Directional Camera Images; 10.2 Omnidirectional Camera Images
    11 Conclusion: 11.1 Summary; 11.2 Outlook and Future Work
    Appendices: A.1 Additional Extrinsic Calibration Results; A.2 Linear Least Squares Scaling; A.3 Proof Rank Deficiency; A.4 Alternative Derivation Midpoint Method; A.5 Simplification of Depth Calculation; A.6 Relation between Epipolar and Circumferential Constraint; A.7 Covariance Estimation; A.8 Uncertainty Estimation from Epipolar Geometry; A.9 Two-View Scaling Factor Estimation: Uncertainty Estimation; A.10 Two-View Scaling Factor Optimization: Uncertainty Estimation; A.11 Depth from Adjoining Two-View Geometries; A.12 Alternative Three-View Derivation; A.12.1 Second Derivation Approach; A.12.2 Third Derivation Approach; A.13 Relation between Trifocal Geometry and Alternative Midpoint Method; A.14 Additional Pose Graph Generation Examples; A.15 Pose Graph Solver Settings; A.16 Additional Pose Graph Optimization Examples
    Bibliography
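    The pixel-to-sphere lookup described in this abstract can be sketched for one simple case. The example below builds such a table for an equirectangular world map projection; the function name `equirectangular_p2s_map`, the axis convention, and the pixel-centre sampling are assumptions for illustration and are not taken from the thesis.

```python
import math


def equirectangular_p2s_map(width, height):
    """Build a pixel-to-sphere lookup table for an equirectangular image.

    Assumed convention (not from the thesis): longitude runs from -pi at
    the left edge to +pi at the right edge, latitude from +pi/2 at the
    top edge to -pi/2 at the bottom edge, and the image centre looks
    along +z. Directions are sampled at pixel centres.
    """
    table = []
    for v in range(height):
        lat = math.pi * (0.5 - (v + 0.5) / height)
        row = []
        for u in range(width):
            lon = 2.0 * math.pi * ((u + 0.5) / width - 0.5)
            row.append((
                math.cos(lat) * math.sin(lon),  # x: right
                -math.sin(lat),                 # y: down
                math.cos(lat) * math.cos(lon),  # z: forward
            ))
        table.append(row)
    return table


# Tiny hypothetical image: 9 x 5 pixels, so pixel (u=4, v=2) is the centre.
p2s = equirectangular_p2s_map(9, 5)
centre_ray = p2s[2][4]
```

    Once every camera is reduced to such a table of unit directions, downstream geometry only needs the table, which matches the abstract's point that the camera geometry becomes independent of the projection model.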

    Visual servoing of mobile robots using non-central catadioptric cameras

    This paper presents novel contributions on image-based control of a mobile robot using a general catadioptric camera model. A catadioptric camera usually combines a conventional camera with a curved mirror, resulting in an omnidirectional sensor capable of providing 360° panoramic views of a scene. Modeling such cameras has been the subject of significant research interest in the computer vision community, leading to a deeper understanding of the image properties and also to different models for different types of configurations. Visual servoing applications using catadioptric cameras have essentially been using central cameras and the corresponding unified projection model. So far, more general models have been used in only a few cases. In this paper we address the problem of visual servoing using the so-called radial model. The radial model can be applied to many camera configurations and in particular to non-central catadioptric systems with mirrors that are symmetric around an axis coinciding with the optical axis. In this case, we show that the radial model can be used with a non-central catadioptric camera to allow effective image-based visual servoing (IBVS) of a mobile robot. Using this model, which is valid for a large set of catadioptric cameras (central or non-central), new visual features are proposed to control the degrees of freedom of a mobile robot moving on a plane. In addition to several simulation results, a set of experiments was carried out on a Robot Operating System (ROS)-based platform, validating the applicability, effectiveness and robustness of the proposed method for image-based control of a non-holonomic robot.

    Visual Localization with Lines

    Mobile robots must be able to derive their current location from sensor measurements in order to navigate fully autonomously. Positioning sensors like GPS output a global position but their precision is not sufficient for many applications; and indoors no GPS signal is received at all. Cameras provide information-rich data and are already used in many systems, e.g. for object detection and recognition. Therefore, this thesis investigates the possibility of additionally using cameras for localization. State-of-the-art methods are based on point observations but as man-made environments mostly consist of planar and linear structures which are perceived as lines, the focus in this thesis is on the use of image lines to derive the camera trajectory. To achieve this goal, multiple view geometry algorithms for line-based pose and structure estimation have to be developed. A prerequisite for these algorithms is that correspondences between line observations in multiple images which originate from the same spatial line are established. This thesis proposes a novel line matching algorithm for matching under small baseline motion which is designed with one-to-many matching in mind to tackle the issue of varying line segmentation. In contrast to other line matching solutions, the algorithm proposed leverages optical flow calculation and hence obviates the need for an expensive descriptor calculation. A two-view relative pose estimation algorithm is introduced which extracts the spatial line directions using parallel line clustering on the image lines in order to calculate the relative rotation. In lieu of the "Manhattan world" assumption, which is required by state-of-the-art methods, the approach proposed is less restrictive as it needs only lines of different directions; the angle between the directions is not relevant. In addition, the method proposed is about one order of magnitude faster to compute.
A novel line triangulation method is proposed to derive the scene structure from the images. The method is derived from the spatial transformation of Plücker lines and allows prior knowledge of the spatial line, like the precalculated directions from the parallel line clustering, to be integrated. The problem of degenerate configurations is analyzed, too, and a solution is developed which incorporates the optical flow vectors from the matching step as spatial points into the estimation. Lastly, all components are combined into a visual odometry pipeline for monocular cameras. The pipeline uses image-to-image motion estimation to calculate the camera trajectory. A scale adjustment based on the trifocal tensor is introduced which ensures the consistent scale of the trajectory. To increase the robustness, a sliding-window bundle adjustment is employed. All components and the visual odometry pipeline proposed are evaluated and compared to state-of-the-art methods on real-world data of indoor and outdoor scenes. The evaluation shows that line-based visual localization is suitable to solve the localization task.
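    As background for the Plücker-line transformation this abstract refers to, the sketch below builds the Plücker coordinates (direction d, moment m = p × d) of a line through two points and applies a rigid motion x -> R x + t to them, using the standard identities d' = R d and m' = R m + t × (R d). The helper names and the example values are assumptions for illustration; the thesis' triangulation method itself is not reproduced.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mat_vec(R, v):
    """Multiply a 3x3 matrix (tuple of rows) by a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))


def plucker_from_points(p, q):
    """Plücker coordinates (direction d, moment m) of the line through p and q."""
    d = tuple(qi - pi for pi, qi in zip(p, q))
    m = cross(p, d)
    return d, m


def transform_plucker(R, t, line):
    """Apply the rigid motion x -> R x + t to a Plücker line (d, m)."""
    d, m = line
    d_new = mat_vec(R, d)
    m_rot = mat_vec(R, m)
    t_cross = cross(t, d_new)
    m_new = tuple(a + b for a, b in zip(m_rot, t_cross))
    return d_new, m_new


# Hypothetical example: rotate 90 degrees about z, then translate.
R = ((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
t = (1.0, 2.0, 3.0)
p, q = (1.0, 0.0, 0.0), (1.0, 1.0, 2.0)
line_moved = transform_plucker(R, t, plucker_from_points(p, q))
```

    Transforming the two points first and then building the line gives the same coordinates, which is a convenient sanity check when implementing line-based geometry.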

    Computationally-efficient visual inertial odometry for autonomous vehicle

    This thesis presents the design, implementation, and validation of a novel nonlinear-filtering-based Visual Inertial Odometry (VIO) framework for robotic navigation in GPS-denied environments. The system attempts to track the vehicle’s ego-motion at each time instant while capturing the benefits of both the camera information and the Inertial Measurement Unit (IMU). VIO demands considerable computational resources and processing time, and this makes the hardware implementation quite challenging for micro- and nanorobotic systems. In many cases, the VIO process selects a small subset of tracked features to reduce the computational cost. VIO estimation also suffers from the inevitable accumulation of error. This limitation makes the estimation gradually diverge and even fail to track the vehicle trajectory over long-term operation. Deploying optimization for the entire trajectory helps to minimize the accumulative errors, but increases the computational cost significantly. The VIO hardware implementation can utilize a more powerful processor and specialized hardware computing platforms, such as Field Programmable Gate Arrays, Graphics Processing Units and Application-Specific Integrated Circuits, to accelerate the execution. However, the computation still needs to perform identical computational steps with similar complexity. Processing data at a higher frequency increases energy consumption significantly. The development of advanced hardware systems is also expensive and time-consuming. Consequently, the approach of developing an efficient algorithm will be beneficial with or without hardware acceleration. The research described in this thesis proposes multiple solutions to accelerate the visual inertial odometry computation while maintaining estimation accuracy comparable to state-of-the-art algorithms over long-term operation. This research has resulted in three significant contributions.
First, this research involved the design and validation of a novel nonlinear filtering sensor-fusion algorithm using trifocal tensor geometry and a cubature Kalman filter. The combination has handled the system nonlinearity effectively, while reducing the computational cost and system complexity significantly. Second, this research develops two solutions to address the error accumulation issue. For standalone self-localization projects, the first solution applies a local optimization procedure for the measurement update, which performs multiple corrections on a single measurement to optimize the latest filter state and covariance. For larger navigation projects, the second solution integrates VIO with additional pseudo-ranging measurements between the vehicle and multiple beacons in order to bound the accumulative errors. Third, this research develops a novel parallel-processing VIO algorithm to speed up the execution using a multi-core CPU. This allows the distribution of the filtering computation on each core to process and optimize each feature measurement update independently. The performance of the proposed visual inertial odometry framework is evaluated using publicly-available self-localization datasets, for comparison with some other open-source algorithms. The results illustrate that the proposed VIO framework is able to improve the VIO’s computational efficiency without the installation of specialized hardware computing platforms and advanced software libraries.
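    For readers unfamiliar with the cubature Kalman filter mentioned here, the sketch below generates the 2n equally weighted cubature points of the standard third-degree spherical-radial rule for a Gaussian N(mean, P): mean ± sqrt(n) times each column of a Cholesky factor of P. The function names and the example numbers are assumptions for illustration; the thesis' filter, state vector, and trifocal measurement model are not reproduced.

```python
import math


def cholesky(P):
    """Lower-triangular L with L * L^T = P for a symmetric positive-definite P."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(P[i][i] - s)
            else:
                L[i][j] = (P[i][j] - s) / L[j][j]
    return L


def cubature_points(mean, P):
    """Return the 2n cubature points of N(mean, P); each has weight 1 / (2n)."""
    n = len(mean)
    L = cholesky(P)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for j in range(n):
            # Column j of the Cholesky factor, scaled by sqrt(n).
            points.append([mean[i] + sign * scale * L[i][j] for i in range(n)])
    return points


# Hypothetical 2D state estimate.
mean = [1.0, 2.0]
P = [[4.0, 1.0], [1.0, 2.0]]
points = cubature_points(mean, P)
```

    In a filter, each point would be pushed through the nonlinear process or measurement function and the results averaged with equal weights, which avoids computing Jacobians. The equally weighted points reproduce the original mean and covariance exactly.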

    Practical Euclidean reconstruction of buildings.

    Chou Yun-Sum, Bailey. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 89-92). Abstracts in English and Chinese.
    Contents:
    List of Symbol
    Chapter 1. Introduction: 1.1 The Goal: Euclidean Reconstruction; 1.2 Historical background; 1.3 Scope of the thesis; 1.4 Thesis Outline
    Chapter 2. An introduction to stereo vision and 3D shape reconstruction: 2.1 Homogeneous Coordinates; 2.2 Camera Model; 2.2.1 Pinhole Camera Model; 2.3 Camera Calibration; 2.4 Geometry of Binocular System; 2.5 Stereo Matching; 2.5.1 Accuracy of Corresponding Point; 2.5.2 The Stereo Matching Approach (Intensity-based stereo matching; Feature-based stereo matching); 2.5.3 Matching Constraints; 2.6 3D Reconstruction; 2.7 Recent development on self calibration; 2.8 Summary of the Chapter
    Chapter 3. Camera Calibration: 3.1 Introduction; 3.2 Camera Self-calibration; 3.3 Self-calibration under general camera motion; 3.3.1 The absolute Conic Based Techniques; 3.3.2 A Stratified approach for self-calibration by Pollefeys; 3.3.3 Pollefeys self-calibration with Absolute Quadric; 3.3.4 Newsam's self-calibration with linear algorithm; 3.4 Camera Self-calibration under specially designed motion sequence; 3.4.1 Hartley's self-calibration by pure rotations (Summary of the Algorithm); 3.4.2 Pollefeys self-calibration with variant focal length (Summary of the Algorithm); 3.4.3 Faugeras self-calibration of a 1D Projective Camera; 3.5 Summary of the Chapter
    Chapter 4. Self-calibration under Planar motions: 4.1 Introduction; 4.2 1D Projective Camera Self-calibration; 4.2.1 1-D camera model; 4.2.2 1-D Projective Camera Self-calibration Algorithms; 4.2.3 Planar motion detection; 4.2.4 Self-calibration under horizontal planar motions; 4.2.5 Self-calibration under three different planar motions; 4.2.6 Result analysis on self-calibration Experiments; 4.3 Essential Matrix and Triangulation; 4.4 Merge of Partial 3D models; 4.5 Summary of the Reconstruction Algorithms; 4.6 Experimental Results; 4.6.1 Experiment 1: A Simulated Box; 4.6.2 Experiment 2: A Real Building; 4.6.3 Experiment 3: A Sun Flower; 4.7 Conclusion
    Chapter 5. Building Reconstruction using a linear camera self-calibration technique: 5.1 Introduction; 5.2 Metric Reconstruction from Partially Calibrated image; 5.2.1 Partially Calibrated Camera; 5.2.2 Optimal Computation of Fundamental Matrix (F); 5.2.3 Linearly Recovering Two Focal Lengths from F; 5.2.4 Essential Matrix and Triangulation; 5.3 Experiments and Discussions; 5.4 Conclusion
    Chapter 6. Refine the basic model with detail depth information by a Model-Based Stereo technique: 6.1 Introduction; 6.2 Model Based Epipolar Geometry; 6.2.1 Overview; 6.2.2 Warped offset image preparation; 6.2.3 Epipolar line calculation; 6.2.4 Actual corresponding point finding by stereo matching; 6.2.5 Actual 3D point generated by Triangulation; 6.3 Summary of the Algorithms; 6.4 Experiments and discussions; 6.5 Conclusion
    Chapter 7. Conclusions: 7.1 Summary; 7.2 Future Work
    Bibliography