
    Improved 3D sparse maps for high-performance SFM with low-cost omnidirectional robots

    We consider the use of low-budget omnidirectional platforms for 3D mapping and self-localisation. These robots specifically permit rotational motion in the plane around a central axis, with negligible displacement. In addition, the low-resolution, compressed imagery typical of the platform results in high levels of image noise. We observe highly sparse image feature matches over narrow inter-image baselines. This configuration poses a challenge for epipolar geometry extraction and accurate 3D point triangulation, upon which a standard structure-from-motion formulation is based. We propose a technique for both feature filtering and tracking that solves these problems via a novel approach to the management of feature bundles. Noisy matches are efficiently trimmed, and the scarcity of the remaining image features is adequately overcome, generating densely populated maps of highly accurate and robust 3D image features. The effectiveness of the approach is demonstrated under a variety of scenarios in experiments conducted with low-budget commercial robots.
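
    The kind of strict match filtering this calls for can be illustrated with standard tools. Below is a minimal Python/OpenCV sketch, not the authors' method: a Lowe ratio test trims ambiguous descriptor matches, and a RANSAC essential-matrix fit discards matches inconsistent with any epipolar geometry. The detector choice and thresholds are illustrative assumptions.

        import cv2
        import numpy as np

        def filter_noisy_matches(img1, img2, K, ratio=0.7):
            """Return match coordinates surviving a ratio test and a RANSAC epipolar check."""
            orb = cv2.ORB_create(4000)
            kp1, des1 = orb.detectAndCompute(img1, None)
            kp2, des2 = orb.detectAndCompute(img2, None)
            pairs = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)
            # Lowe ratio test: keep a match only if clearly better than the runner-up.
            good = [m for m, n in (p for p in pairs if len(p) == 2)
                    if m.distance < ratio * n.distance]
            pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
            pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
            # RANSAC essential-matrix fit: reject matches violating epipolar geometry.
            E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                              prob=0.999, threshold=1.0)
            mask = inliers.ravel() == 1
            return pts1[mask], pts2[mask], E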

    Distributed scene reconstruction from multiple mobile platforms

    Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensor embedded in these devices motivates the application of 3D vision techniques for navigation and mapping purposes. In addition, distributed systems of cheap sensors acting as a unitary entity have recently emerged as an efficient alternative to expensive mobile equipment. In this work we present an implementation of a visual reconstruction method, structure from motion (SfM), on a low-budget, omnidirectional mobile platform, and extend this method to distributed 3D scene reconstruction with several instances of such a platform. Our approach overcomes the challenges posed by the platform. The unprecedented levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure suitable feature matching populations for epipolar geometry estimation by means of a strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to deal with the ill-conditioned inter-image configurations caused by the omnidirectional motion. The feature tracking system efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than what is usually assumed for stable 3D reconstruction. The distributed reconstruction from multiple instances of SfM is attained by applying loop-closing techniques. Our multiple-reconstruction system merges individual 3D structures and resolves the global scale problem with minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping stretches of sequences. The performance of this system is demonstrated in the two-session case. The management of noise, the stability against ill-conditioned configurations and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed.
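
    Merging individual 3D structures while resolving the global scale from a minimal overlap suggests a similarity alignment over the few 3D points shared at a loop closure. The sketch below shows one standard way to do that (the Umeyama method), assuming the cross-map point correspondences are already established; it illustrates the building block, not the system's actual merging step.

        import numpy as np

        def similarity_align(src, dst):
            """Estimate s, R, t such that dst ~ s * R @ src + t (Umeyama method).
            src, dst: (N, 3) corresponding 3D points from two individual maps."""
            mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
            S, D = src - mu_s, dst - mu_d
            cov = D.T @ S / len(src)                     # cross-covariance of the point sets
            U, sig, Vt = np.linalg.svd(cov)
            flip = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
            R = U @ flip @ Vt
            var_s = (S ** 2).sum() / len(src)            # variance of the source cloud
            s = np.trace(np.diag(sig) @ flip) / var_s    # this term fixes the global scale
            t = mu_d - s * R @ mu_s
            return s, R, t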

    Structureless Camera Motion Estimation of Unordered Omnidirectional Images

    This work aims at providing a novel camera motion estimation pipeline for large collections of unordered omnidirectional images. In order to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, allowing any central camera type to be incorporated. For each camera, an unprojection lookup is generated from its intrinsics, called a P2S-map (Pixel-to-Sphere-map), which maps pixels to their corresponding positions on the unit sphere. Consequently, the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections known from cartography, which exhibit fewer distortion effects. Using P2S-maps from camera calibration and world map projection makes it possible to convert omnidirectional camera images to an appropriate world map projection in order to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion), which handles unordered image collections, with the efficiency of PGO (Pose Graph Optimization), which is used as the back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3D feature locations (structure), which becomes computationally expensive for large-scale scenarios. In contrast, PGO solves for camera poses (motion) from measured transformations between cameras, keeping the optimization manageable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from the scaled two-view transformations and solved by PGO to obtain camera motion efficiently, even for large image collections. The obtained results can be used as input data to provide initial pose estimates for further 3D reconstruction purposes, e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors.
The entire camera motion estimation pipeline does not need to generate a sparse 3D structure of the captured environment and is thus called SCME (Structureless Camera Motion Estimation). Contents:
1 Introduction 1.1 Motivation 1.1.1 Increasing Interest of Image-Based 3D Reconstruction 1.1.2 Underground Environments as Challenging Scenario 1.1.3 Improved Mobile Camera Systems for Full Omnidirectional Imaging 1.2 Issues 1.2.1 Directional versus Omnidirectional Image Acquisition 1.2.2 Structure from Motion versus Visual Simultaneous Localization and Mapping 1.3 Contribution 1.4 Structure of this Work
2 Related Work 2.1 Visual Simultaneous Localization and Mapping 2.1.1 Visual Odometry 2.1.2 Pose Graph Optimization 2.2 Structure from Motion 2.2.1 Bundle Adjustment 2.2.2 Structureless Bundle Adjustment 2.3 Corresponding Issues 2.4 Proposed Reconstruction Pipeline
3 Cameras and Pixel-to-Sphere Mappings with P2S-Maps 3.1 Types 3.2 Models 3.2.1 Unified Camera Model 3.2.2 Polynomial Camera Model 3.2.3 Spherical Camera Model 3.3 P2S-Maps - Mapping onto Unit Sphere via Lookup Table 3.3.1 Lookup Table as Color Image 3.3.2 Lookup Interpolation 3.3.3 Depth Data Conversion
4 Calibration 4.1 Overview of Proposed Calibration Pipeline 4.2 Target Detection 4.3 Intrinsic Calibration 4.3.1 Selected Examples 4.4 Extrinsic Calibration 4.4.1 3D-2D Pose Estimation 4.4.2 2D-2D Pose Estimation 4.4.3 Pose Optimization 4.4.4 Uncertainty Estimation 4.4.5 Pose Graph Representation 4.4.6 Bundle Adjustment 4.4.7 Selected Examples
5 Full Omnidirectional Image Projections 5.1 Panoramic Image Stitching 5.2 World Map Projections 5.3 World Map Projection Generator for P2S-Maps 5.4 Conversion between Projections based on P2S-Maps 5.4.1 Proposed Workflow 5.4.2 Data Storage Format 5.4.3 Real World Example
6 Relations between Two Camera Spheres 6.1 Forward and Backward Projection 6.2 Triangulation 6.2.1 Linear Least Squares Method 6.2.2 Alternative Midpoint Method 6.3 Epipolar Geometry 6.4 Transformation Recovery from Essential Matrix 6.4.1 Cheirality 6.4.2 Standard Procedure 6.4.3 Simplified Procedure 6.4.4 Improved Procedure 6.5 Two-View Estimation 6.5.1 Evaluation Strategy 6.5.2 Error Metric 6.5.3 Evaluation of Estimation Algorithms 6.5.4 Concluding Remarks 6.6 Two-View Optimization 6.6.1 Epipolar-Based Error Distances 6.6.2 Projection-Based Error Distances 6.6.3 Comparison between Error Distances 6.7 Two-View Translation Scaling 6.7.1 Linear Least Squares Estimation 6.7.2 Non-Linear Least Squares Optimization 6.7.3 Comparison between Initial and Optimized Scaling Factor 6.8 Homography to Identify Degeneracies 6.8.1 Homography for Spherical Cameras 6.8.2 Homography Estimation 6.8.3 Homography Optimization 6.8.4 Homography and Pure Rotation 6.8.5 Homography in Epipolar Geometry
7 Relations between Three Camera Spheres 7.1 Three View Geometry 7.2 Crossing Epipolar Planes Geometry 7.3 Trifocal Geometry 7.4 Relation between Trifocal, Three-View and Crossing Epipolar Planes 7.5 Translation Ratio between Up-To-Scale Two-View Transformations 7.5.1 Structureless Determination Approaches 7.5.2 Structure-Based Determination Approaches 7.5.3 Comparison between Proposed Approaches
8 Pose Graphs 8.1 Optimization Principle 8.2 Solvers 8.2.1 Additional Graph Solvers 8.2.2 False Loop Closure Detection 8.3 Pose Graph Generation 8.3.1 Generation of Synthetic Pose Graph Data 8.3.2 Optimization of Synthetic Pose Graph Data
9 Structureless Camera Motion Estimation 9.1 SCME Pipeline 9.2 Determination of Two-View Translation Scale Factors 9.3 Integration of Depth Data 9.4 Integration of Extrinsic Camera Constraints
10 Camera Motion Estimation Results 10.1 Directional Camera Images 10.2 Omnidirectional Camera Images
11 Conclusion 11.1 Summary 11.2 Outlook and Future Work
Appendices A.1 Additional Extrinsic Calibration Results A.2 Linear Least Squares Scaling A.3 Proof Rank Deficiency A.4 Alternative Derivation Midpoint Method A.5 Simplification of Depth Calculation A.6 Relation between Epipolar and Circumferential Constraint A.7 Covariance Estimation A.8 Uncertainty Estimation from Epipolar Geometry A.9 Two-View Scaling Factor Estimation: Uncertainty Estimation A.10 Two-View Scaling Factor Optimization: Uncertainty Estimation A.11 Depth from Adjoining Two-View Geometries A.12 Alternative Three-View Derivation A.12.1 Second Derivation Approach A.12.2 Third Derivation Approach A.13 Relation between Trifocal Geometry and Alternative Midpoint Method A.14 Additional Pose Graph Generation Examples A.15 Pose Graph Solver Settings A.16 Additional Pose Graph Optimization Examples
Bibliography
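
    The P2S-map is essentially a per-pixel lookup of unit-sphere directions. Below is a minimal sketch for the simplest case, a central pinhole camera; the thesis additionally covers unified, polynomial and spherical models, which would replace the unprojection step, and stores the lookup as a color image.

        import numpy as np

        def pinhole_p2s_map(K, width, height):
            """Build an H x W x 3 lookup mapping each pixel to its unit-sphere direction."""
            u, v = np.meshgrid(np.arange(width), np.arange(height))
            pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N homogeneous pixels
            rays = np.linalg.inv(K) @ pix            # unproject pixels to viewing rays
            rays /= np.linalg.norm(rays, axis=0)     # normalize onto the unit sphere
            return rays.T.reshape(height, width, 3)  # the P2S-map: pixel -> point on the sphere

    With such a lookup in place, feature positions become bearing vectors independent of the projection model, which is what makes the rest of the pipeline camera-agnostic.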

    A multisensor SLAM for dense maps of large scale environments under poor lighting conditions

    This thesis describes the development and implementation of a multisensor large scale autonomous mapping system for surveying tasks in underground mines. The hazardous nature of the underground mining industry has resulted in a push towards autonomous solutions to the most dangerous operations, including surveying tasks. Many existing autonomous mapping techniques rely on approaches to the Simultaneous Localization and Mapping (SLAM) problem which are not suited to the extreme characteristics of active underground mining environments. Our proposed multisensor system has been designed from the outset to address the unique challenges associated with underground SLAM. The robustness, self-containment and portability of the system maximize the potential applications. The multisensor mapping solution proposed as a result of this work is based on a fusion of omnidirectional bearing-only vision-based localization and 3D laser point cloud registration. By combining these two SLAM techniques it is possible to achieve some of the advantages of both approaches – the real-time attributes of vision-based SLAM and the dense, high precision maps obtained through 3D lasers. The result is a viable autonomous mapping solution suitable for application in challenging underground mining environments. A further improvement to the robustness of the proposed multisensor SLAM system is a consequence of incorporating colour information into vision-based localization. Underground mining environments are often dominated by dynamic sources of illumination which can cause inconsistent feature motion during localization. Colour information is utilized to identify and remove features resulting from illumination artefacts and to improve the monochrome-based feature matching between frames. Finally, the proposed multisensor mapping system is implemented and evaluated in both above ground and underground scenarios. The resulting large scale maps contained a maximum offset error of ±30 mm for mapping tasks with lengths over 100 m.
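
    The colour-based rejection of illumination artefacts could, for instance, flag features whose surrounding patch is bright but nearly colourless, as is typical of lamp glare. The sketch below is a plausible minimal criterion under that assumption, not the thesis's actual test; the window size and thresholds are hypothetical.

        import cv2
        import numpy as np

        def glare_feature_mask(img_bgr, keypoints, win=5, sat_max=30, val_min=220):
            """Return a boolean mask: True = keep feature, False = likely illumination artefact."""
            hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
            h, w = hsv.shape[:2]
            keep = []
            for kp in keypoints:
                x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
                patch = hsv[max(y - win, 0):min(y + win + 1, h),
                            max(x - win, 0):min(x + win + 1, w)]
                # Bright, desaturated patches behave like specular glare from lamps.
                is_glare = patch[..., 1].mean() < sat_max and patch[..., 2].mean() > val_min
                keep.append(not is_glare)
            return np.array(keep, dtype=bool)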

    Automatic Dense 3D Scene Mapping from Non-overlapping Passive Visual Sensors for Future Autonomous Systems

    The ever increasing demand for higher levels of autonomy for robots and vehicles means there is an ever greater need for such systems to be aware of their surroundings. Whilst solutions already exist for creating 3D scene maps, many are based on active scanning devices such as laser scanners and depth cameras that are either expensive, unwieldy, or do not function well under certain environmental conditions. As a result passive cameras are a favoured sensor due to their low cost, small size, and ability to work in a range of lighting conditions. In this work we address some of the remaining research challenges within the problem of 3D mapping around a moving platform. We utilise prior work in dense stereo imaging and Stereo Visual Odometry (SVO), and extend Structure from Motion (SfM) to create a pipeline optimised for on-vehicle sensing. Using forward-facing stereo cameras, we use state-of-the-art SVO and dense stereo techniques to map the scene in front of the vehicle. Given the significant amount of prior research in dense stereo, we addressed the issue of selecting an appropriate method by creating a novel evaluation technique. Visual 3D mapping of dynamic scenes from a moving platform results in duplicated scene objects. We extend the prior work on mapping by introducing a generalized dynamic object removal process. Unlike other approaches that rely on computationally expensive segmentation or detection, our method utilises existing data from the mapping stage and the findings from our dense stereo evaluation. We introduce a new SfM approach that exploits our platform motion to create a novel dense mapping process that exceeds the 3D data generation rate of state-of-the-art alternatives. Finally, we combine dense stereo, SVO, and our SfM approach to automatically align point clouds from non-overlapping views, creating a rotation- and scale-consistent global 3D model.
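
    The dense-stereo stage of such a pipeline turns disparity maps into scene points via Z = f·B/d. Below is a minimal sketch using OpenCV's semi-global matcher as a stand-in for whichever method the evaluation would select; the focal length f, principal point (cx, cy) and baseline B are assumed inputs from a rectified stereo calibration.

        import cv2
        import numpy as np

        def stereo_points(left_gray, right_gray, f, baseline, cx, cy):
            """Dense disparity from rectified stereo, back-projected to 3D points (N, 3)."""
            sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
            disp = sgbm.compute(left_gray, right_gray).astype(np.float32) / 16.0  # SGBM is fixed-point
            valid = disp > 0
            v, u = np.nonzero(valid)
            z = f * baseline / disp[valid]   # depth from disparity: Z = f * B / d
            x = (u - cx) * z / f
            y = (v - cy) * z / f
            return np.stack([x, y, z], axis=1)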

    Unsupervised Odometry and Depth Learning for Endoscopic Capsule Robots

    In the last decade, many medical companies and research groups have tried to convert passive capsule endoscopes, an emerging and minimally invasive diagnostic technology, into actively steerable endoscopic capsule robots that will provide more intuitive disease detection, targeted drug delivery and biopsy-like operations in the gastrointestinal (GI) tract. In this study, we introduce a fully unsupervised, real-time odometry and depth learner for monocular endoscopic capsule robots. We establish the supervision by warping view sequences and assigning the re-projection minimization to the loss function, which we adopt in the multi-view pose estimation and single-view depth estimation networks. Detailed quantitative and qualitative analyses of the proposed framework, performed on non-rigidly deformable ex-vivo porcine stomach datasets, prove the effectiveness of the method in terms of motion estimation and depth recovery. Comment: submitted to IROS 201
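
    The self-supervision described here amounts to synthesizing the target view from a neighbouring frame using the predicted depth and pose, and penalizing the photometric residual. Below is a minimal PyTorch sketch of such a view-synthesis loss; the intrinsics K and the predicted target-to-source pose T are assumed inputs, and the paper's actual loss likely adds masking and regularization terms.

        import torch
        import torch.nn.functional as F

        def view_synthesis_loss(tgt, src, depth, K, T):
            """L1 photometric loss after warping src into the target view.
            tgt, src: (B,3,H,W) images; depth: (B,1,H,W) predicted target depth;
            K: (B,3,3) intrinsics; T: (B,4,4) predicted target-to-source pose."""
            B, _, H, W = tgt.shape
            v, u = torch.meshgrid(torch.arange(H, device=tgt.device),
                                  torch.arange(W, device=tgt.device), indexing="ij")
            pix = torch.stack([u, v, torch.ones_like(u)], 0).float().view(1, 3, -1).expand(B, -1, -1)
            cam = torch.linalg.inv(K) @ pix * depth.view(B, 1, -1)      # back-project with depth
            cam = torch.cat([cam, torch.ones_like(cam[:, :1])], dim=1)  # homogeneous coordinates
            proj = K @ (T @ cam)[:, :3]                                 # move into source frame, project
            uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)
            grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,             # normalize for grid_sample
                                2 * uv[:, 1] / (H - 1) - 1], dim=-1).view(B, H, W, 2)
            warped = F.grid_sample(src, grid, align_corners=True)
            return (warped - tgt).abs().mean()                          # re-projection (photometric) error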

    Vision-based Assistive Indoor Localization

    An indoor localization system is of significant importance to the visually impaired in their daily lives, helping them localize themselves and navigate an indoor environment. In this thesis, a vision-based indoor localization solution is proposed and studied, with algorithms and their implementations that maximize the use of the visual information surrounding the user for optimal localization across multiple stages. The contributions of the work include the following: (1) Novel combinations of a daily-used smart phone with a low-cost lens (GoPano) are used to provide an economic, portable, and robust indoor localization service for visually impaired people. (2) New omnidirectional features (omni-features) extracted from 360-degree field-of-view images are proposed to represent visual landmarks of indoor positions, and then used as on-line query keys when a user asks for localization services. (3) A scalable and light-weight computation and storage solution is implemented by transferring big database storage and the computationally heavy querying procedure to the cloud. (4) Real-time query performance of 14 fps is achieved with a Wi-Fi connection by identifying and implementing both data and task parallelism using many-core NVIDIA GPUs. (5) Refined localization via 2D-to-3D and 3D-to-3D geometric matching, and automatic path planning for efficient environmental modeling utilizing architectural AutoCAD floor plans, are developed. This dissertation first provides a description of the assistive indoor localization problem with its detailed connotations as well as the overall methodology. Then related work in indoor localization and automatic path planning for environmental modeling is surveyed. After that, the framework of omnidirectional-vision-based indoor assistive localization is introduced. This is followed by multiple refined localization strategies such as the 2D-to-3D and 3D-to-3D geometric matching approaches. Finally, conclusions and a few promising future research directions are provided.
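
    The 2D-to-3D geometric matching stage boils down to estimating the camera pose from correspondences between query-image features and 3D model points. Below is a minimal sketch with OpenCV's RANSAC PnP solver, assuming the 2D-3D correspondences have already been retrieved from the database; it illustrates the standard building block, not the dissertation's full strategy.

        import cv2
        import numpy as np

        def localize_from_model(pts3d, pts2d, K):
            """Camera position from N matched (3D model point, 2D image feature) pairs."""
            ok, rvec, tvec, inliers = cv2.solvePnPRansac(
                pts3d.astype(np.float64), pts2d.astype(np.float64), K, None,
                reprojectionError=3.0, confidence=0.999)
            if not ok:
                return None
            R, _ = cv2.Rodrigues(rvec)      # rotation vector -> matrix
            return (-R.T @ tvec).ravel()    # camera centre in model coordinates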

    Real-Time Multi-Fisheye Camera Self-Localization and Egomotion Estimation in Complex Indoor Environments

    In this work, a real-time-capable multi-fisheye camera self-localization and egomotion estimation framework is developed. The thesis covers all aspects, ranging from omnidirectional camera calibration to the development of a complete multi-fisheye camera SLAM system based on a generic multi-camera bundle adjustment method.

    Visual Odometry and Sparse Scene Reconstruction for UAVs with a Multi-Fisheye Camera System

    Autonomously operating UAVs demand fast localization for navigation, to actively explore unknown areas and to create maps. For pose estimation, many UAV systems make use of a combination of GPS receivers and inertial measurement units (IMUs). However, GPS signal coverage may drop occasionally, especially in the close vicinity of objects, and precise IMUs are too heavy to be carried by lightweight UAVs. This, and the high cost of high-quality IMUs, motivates the use of inexpensive vision-based sensors for localization using visual odometry or visual SLAM (simultaneous localization and mapping) techniques. The first contribution of this thesis is a more general approach to bundle adjustment with an extended version of the projective coplanarity equation, which enables us to make use of omnidirectional multi-camera systems that may consist of fisheye cameras capturing a large field of view with one shot. We use ray directions as observations instead of image points, which is why our approach does not rely on a specific projection model, assuming only a central projection. In addition, our approach allows the integration and estimation of points at infinity, which classical bundle adjustments are not capable of. We show that the integration of far or infinitely far points stabilizes the estimation of the rotation angles of the camera poses. In its second contribution, the thesis employs this approach to bundle adjustment in a highly integrated system for incremental pose estimation and mapping on lightweight UAVs. Based on the image sequences of a multi-camera system, our system uses tracked feature points to incrementally build a sparse map and incrementally refines this map using the iSAM2 algorithm. Our system is able to optionally integrate GPS information on the level of carrier-phase observations, even in underconstrained situations, e.g. if only two satellites are visible, for georeferenced pose estimation. This way, we are able to use all available information in underconstrained GPS situations to keep the mapped 3D model accurate and georeferenced. In its third contribution, we present an approach for re-using existing methods for dense stereo matching with fisheye cameras, which has the advantage that highly optimized existing methods can be applied as a black box without modification, even with cameras that have a field of view of more than 180 degrees. We provide a detailed accuracy analysis of the obtained dense stereo results. The accuracy analysis shows the growing uncertainty of observed image points of fisheye cameras due to increasing blur towards the image border. The core of the contribution is a rigorous variance component estimation, which allows estimating the variance of the observed disparities at an image point as a function of the distance of that point to the principal point. We show that this improved stochastic model provides a more realistic prediction of the uncertainty of the triangulated 3D points.
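
    The two key ideas of this extended bundle adjustment, observing ray directions instead of image points and admitting points at infinity, can be sketched with homogeneous scene points X = (x, w), where w = 0 encodes a direction at infinity. The angular residual below is an illustrative formulation under those assumptions, not the thesis's exact parameterization.

        import numpy as np

        def ray_residual(X_h, R, t, ray_obs):
            """Angular error between an observed unit ray and the predicted direction.
            X_h: homogeneous scene point (4,); X_h[3] = 0 encodes a point at infinity.
            R, t: world-to-camera rotation (3,3) and translation (3,).
            ray_obs: observed unit ray direction in the camera frame (3,)."""
            d = R @ X_h[:3] + X_h[3] * t          # translation drops out for points at infinity
            d /= np.linalg.norm(d)
            return np.arccos(np.clip(d @ ray_obs, -1.0, 1.0))

    Because the residual depends only on directions, it applies unchanged to fisheye rays beyond a 180-degree field of view, and infinitely far points constrain only the rotation, matching the stabilization effect described above.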