
    3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection

    Cameras are crucial exteroceptive sensors for self-driving cars: they are low-cost and small, provide appearance information about the environment, and work in various weather conditions. They can be used for multiple purposes such as visual navigation and obstacle detection. A surround multi-camera system can cover the full 360-degree field of view around the car, avoiding blind spots that can otherwise lead to accidents. To minimize the number of cameras needed for surround perception, we utilize fisheye cameras. Consequently, standard vision pipelines for 3D mapping, visual localization, obstacle detection, etc. need to be adapted to take full advantage of multiple cameras rather than treating each camera individually, and processing of fisheye images has to be supported. In this paper, we describe the camera calibration and subsequent processing pipeline for multi-fisheye-camera systems developed as part of the V-Charge project, which seeks to enable automated valet parking for self-driving cars. Our pipeline precisely calibrates multi-camera systems, builds sparse 3D maps for visual navigation, visually localizes the car with respect to these maps, generates accurate dense maps, and detects obstacles based on real-time depth map extraction.
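
    The abstract leaves the camera model unspecified; as a minimal sketch of the kind of geometry such a pipeline rests on, the following assumes an equidistant fisheye model and known rig extrinsics per camera. All parameter values are illustrative, not the V-Charge calibration.

```python
import numpy as np

def project_fisheye_equidistant(X_vehicle, R_cam, t_cam, f, cx, cy):
    """Project a 3D point from the vehicle frame into a fisheye image.

    Minimal sketch using the equidistant model r = f * theta, where theta
    is the angle between the ray and the optical axis. A multi-camera
    calibration recovers R_cam, t_cam (rig extrinsics) and the intrinsics
    (f, cx, cy) per camera; the exact model used in the paper may differ.
    """
    # Transform the point into the camera frame: X_cam = R * X_vehicle + t.
    X_cam = R_cam @ X_vehicle + t_cam
    x, y, z = X_cam
    theta = np.arctan2(np.hypot(x, y), z)   # angle off the optical axis
    phi = np.arctan2(y, x)                  # azimuth around the axis
    r = f * theta                           # equidistant mapping
    return np.array([cx + r * np.cos(phi), cy + r * np.sin(phi)])

# Example: a point 5 m ahead and 1 m to the left of a forward-facing camera
# (illustrative intrinsics).
u, v = project_fisheye_equidistant(
    np.array([-1.0, 0.0, 5.0]), np.eye(3), np.zeros(3),
    f=330.0, cx=640.0, cy=400.0)
print(u, v)
```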

    Vehicle localization with enhanced robustness for urban automated driving


    Dense and Globally Consistent Multi-View Stereo

    Multi-View Stereo (MVS) aims at reconstructing dense scene geometry from a set of overlapping images captured at different viewing angles. This thesis addresses the MVS problem by estimating depth maps, since 2D-space operations are trivially parallelizable in contrast to 3D volumetric techniques. The typical setup of depth-map-based MVS approaches consists of per-view calculation followed by multi-view merging. Most solutions aim primarily at the most precise and complete surfaces for individual views while relaxing global geometric consistency. The inconsistent estimates therefore cause a heavy processing workload in the merging stage and degrade the final reconstruction. Another issue is textureless areas, where the photo-consistency constraint cannot discriminate between different depths. These matching ambiguities are normally handled by incorporating plane features or a smoothness assumption, which may produce segmentation artifacts or depend on the accuracy and completeness of the computed object edges. This thesis deals with two kinds of input data, photo collections and high-frame-rate videos, by developing distinct MVS algorithms suited to their characteristics. For sparsely sampled photos, we propose an advanced PatchMatch system that alternates between patch-based correlation maximization and pixel-based optimization of cross-view consistency, yielding a good trade-off between the photometric and geometric constraints. Moreover, our method achieves high efficiency by combining local pixel traversal with a hierarchical framework for fast depth propagation. For densely sampled videos, we focus mainly on recovering homogeneous surfaces, because the redundant scene information enables ray-level correlation, which can generate sharp depth discontinuities. Our approach infers smooth surfaces for the enclosed areas using perspective depth interpolation, and subsequently tackles the occlusion errors connecting the fore- and background edges. In addition, our edge depth estimation is made more robust by accounting for unstructured camera trajectories. Exhaustively calculating depth maps is infeasible when modeling large scenes from videos, so this thesis further improves reconstruction scalability using an incremental scheme based on content-aware view selection and clustering. Our goal is to gradually eliminate visibility conflicts and increase surface coverage by processing a minimal subset of views. Constructing view clusters allows us to store merged and locally consistent points at the highest resolution, thus reducing memory requirements. None of the approaches presented in this thesis rely on high-level techniques, so they can be easily parallelized. Evaluations on various datasets and comparisons with existing algorithms demonstrate the superiority of our methods.
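
    As a rough illustration of the PatchMatch scheme the thesis builds on, the sketch below shows the core alternation of spatial propagation and random refinement over a depth map. The toy cost function stands in for patch-based photo-consistency (e.g. NCC across views), and the cross-view geometric optimization described in the thesis is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchmatch_depths(cost, H, W, d_min, d_max, iters=3):
    """Core PatchMatch loop for depth-map estimation (schematic).

    cost(y, x, d) scores a depth hypothesis d at pixel (y, x); in a real
    system it would be patch-based photo-consistency across views
    (lower is better).
    """
    depth = rng.uniform(d_min, d_max, size=(H, W))  # random initialization
    for it in range(iters):
        # Alternate scan direction so good hypotheses spread both ways.
        xs = range(1, W) if it % 2 == 0 else range(W - 2, -1, -1)
        step = -1 if it % 2 == 0 else 1
        for y in range(H):
            for x in xs:
                # Spatial propagation: try the already-visited neighbor's depth.
                cand = depth[y, x + step]
                if cost(y, x, cand) < cost(y, x, depth[y, x]):
                    depth[y, x] = cand
                # Random refinement: perturb with a shrinking search radius.
                cand = depth[y, x] + rng.normal(0, (d_max - d_min) * 0.5 ** (it + 1))
                if d_min <= cand <= d_max and cost(y, x, cand) < cost(y, x, depth[y, x]):
                    depth[y, x] = cand
    return depth

# Toy cost: the true depth is a slanted plane; cost is distance to it.
true = lambda y, x: 2.0 + 0.01 * x + 0.005 * y
depth = patchmatch_depths(lambda y, x, d: abs(d - true(y, x)), 20, 30, 1.0, 5.0)
print(np.abs(depth - np.fromfunction(true, (20, 30))).mean())  # small residual
```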

    Reliable Navigation for SUAS in Complex Indoor Environments

    Indoor environments pose a particular challenge for Unmanned Aerial Vehicles (UAVs). Effective navigation through these GPS-denied environments requires alternative localization systems, as well as methods of sensing and avoiding obstacles while remaining on-task. Additionally, the relatively small clearances and human presence characteristic of indoor spaces necessitate a higher level of precision and adaptability than is common in traditional UAV flight planning and execution. This research blends the optimization of individual technologies, such as state estimation and environmental sensing, with system integration and high-level operational planning. The combination of AprilTag visual markers, multi-camera visual odometry, and IMU data can be used to create a robust state estimator that describes the position, velocity, and rotation of a multicopter within an indoor environment. However, these data sources have unique, nonlinear characteristics that should be understood to plan effectively for their use in an automated environment. The research described herein begins by analyzing the unique characteristics of these data streams in order to create a highly accurate, fault-tolerant state estimator. Upon this foundation, the system built, tested, and described herein uses visual markers as navigation anchors, visual odometry for motion estimation and control, and depth sensors to maintain an up-to-date map of the UAV's immediate surroundings. It develops and continually refines navigable routes through a novel combination of pre-defined and sensory environmental data. Emphasis is placed on real-world development and testing of the system, with discussion of computational resource management and risk reduction.
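
    To make the fusion structure concrete, here is a deliberately simplified one-dimensional Kalman filter in which the IMU drives prediction, visual odometry corrects velocity, and AprilTag detections supply drift-free position fixes. The actual estimator described above is nonlinear and fault-tolerant; all noise values here are illustrative.

```python
import numpy as np

class SimpleFusionKF:
    """Toy constant-velocity Kalman filter fusing three indoor data sources:
    IMU acceleration (prediction), visual odometry (velocity updates), and
    AprilTag detections (absolute position updates)."""

    def __init__(self, q=0.1):
        self.x = np.zeros(2)            # state: [position, velocity]
        self.P = np.eye(2)
        self.q = q                      # process-noise scale

    def predict(self, accel, dt):       # IMU-driven propagation
        F = np.array([[1, dt], [0, 1]])
        self.x = F @ self.x + np.array([0.5 * dt**2, dt]) * accel
        self.P = F @ self.P @ F.T + self.q * np.eye(2)

    def update(self, z, H, r):          # generic scalar measurement update
        H = np.atleast_2d(H)
        S = H @ self.P @ H.T + r        # innovation covariance
        K = (self.P @ H.T) / S          # Kalman gain
        self.x = self.x + (K * (z - H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ H) @ self.P

    def update_vo_velocity(self, v, r=0.05):        # visual odometry fix
        self.update(v, [0.0, 1.0], r)

    def update_apriltag_position(self, p, r=0.01):  # drift-free tag fix
        self.update(p, [1.0, 0.0], r)

kf = SimpleFusionKF()
for _ in range(50):
    kf.predict(accel=0.0, dt=0.02)
    kf.update_vo_velocity(1.0)
kf.update_apriltag_position(1.05)
print(kf.x)   # position pulled toward the tag fix, velocity near 1.0
```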

    Perception of the Geometry of the Environment for Autonomous Navigation

    The goal of mobile robotics research is to give robots the capability to accomplish missions in an environment that is not perfectly known. A mission consists of the execution of a number of elementary actions (movement, manipulation of objects, ...) and requires accurate localization as well as the construction of a good geometric model of the environment, built by exploiting the robot's own sensors, external sensors, information coming from other robots, and existing models, for example from a Geographic Information System. The information common to all these sources is the geometry of the environment. The first part of the thesis covers the different methods for extracting geometric information. The second part presents the construction of a geometric model using a graph, along with a method for retrieving information from the graph to allow the robot to localize itself in the environment.
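
    As a minimal sketch of the graph-based model described above, the following stores geometric landmarks and traversable connections in a graph and localizes by descriptor matching. The node contents, descriptors, and matching rule are placeholders, not the thesis's actual representation.

```python
import numpy as np

class GeometricGraph:
    """Minimal environment model stored as a graph.

    Nodes hold local geometric information (here: a landmark position and
    a feature descriptor); edges connect places the robot can travel
    between. Localization queries the graph for the node whose descriptor
    best matches the current observation.
    """
    def __init__(self):
        self.nodes = {}                 # id -> (position, descriptor)
        self.edges = set()              # undirected (id, id) pairs

    def add_node(self, nid, position, descriptor):
        self.nodes[nid] = (np.asarray(position), np.asarray(descriptor))

    def add_edge(self, a, b):
        self.edges.add((min(a, b), max(a, b)))

    def localize(self, observed_descriptor):
        """Return the node id whose descriptor is closest to the query."""
        obs = np.asarray(observed_descriptor)
        return min(self.nodes,
                   key=lambda n: np.linalg.norm(self.nodes[n][1] - obs))

g = GeometricGraph()
g.add_node("door", [0, 0], [0.9, 0.1])
g.add_node("corridor", [5, 0], [0.2, 0.8])
g.add_edge("door", "corridor")
print(g.localize([0.25, 0.75]))   # -> 'corridor'
```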

    Three-dimensional localization and mapping of static environments by means of mobile perception

    Model-based task planning is one of the main capabilities of autonomous mobile robots. For model-based localization and path planning in particular, a large-scale description of the operating environment is required. Cognitive communication between humans and machines could likewise be based on a common, three-dimensional understanding of the environment. In the case of a personal service robot, the operating environment may comprise both indoor and outdoor spaces. In this thesis, a method is presented for generating a three-dimensional geometric model of large-scale, structured and natural environments. The environment mapping method, which uses range images as measurement data, consists of three main phases: first, geometric features are extracted from each range image; second, the relative coordinate transformations (i.e., registrations) between the sensor viewpoints where the range data were measured are computed; and finally, an integrated map is formed by transforming the sub-map data into a common frame of reference. Two types of geometric features are extracted from the range images: cylinder segments (or, more generally, truncated cone segments) and straight-line segments. Cylinder segments model tree trunks and other elongated cylindrical objects, whereas the straight-line segments correspond to the upper corners of vertical walls. The features are used as natural landmarks for computing the registrations. The presented method is tested by mapping three test sites representing structured, semi-structured and natural environments: part of the premises of an office building, the surroundings of a parking lot, and a small forest area, with dimensions of about 50, 120 and 40 meters square, respectively. A simple incremental approach is used to build an integrated model of the parking lot and office corridor environments. For the principal mapping experiment, concerning the small forest area, a statistically more sound, optimal approach is applied. With respect to the feature extraction methods and the computation of the relative coordinate transformations between the viewpoints, robustness to outlier data and failure modes of the methods are discussed in more detail.
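
    The registration step can be illustrated with a standard least-squares rigid alignment (the Kabsch/SVD solution) between matched landmark positions from two viewpoints; the thesis's own registration additionally handles outlier data, which this sketch omits.

```python
import numpy as np

def register_landmarks(P, Q):
    """Least-squares rigid registration between two viewpoints (Kabsch).

    P and Q are Nx3 arrays of matched landmark positions (e.g. tree-trunk
    axes or wall corners extracted from two range images). Returns R, t
    such that R @ P[i] + t ~= Q[i]; this is the kind of relative
    transformation used to merge sub-maps into a common frame.
    """
    p0, q0 = P.mean(axis=0), Q.mean(axis=0)          # centroids
    H = (P - p0).T @ (Q - q0)                        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    return R, q0 - R @ p0

# Example: recover a known rotation + translation from 4 matched landmarks.
rng = np.random.default_rng(1)
P = rng.uniform(-10, 10, (4, 3))
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
Q = P @ R_true.T + np.array([2.0, -1.0, 0.5])
R, t = register_landmarks(P, Q)
print(np.allclose(R, R_true), np.round(t, 3))   # -> True [ 2. -1.  0.5]
```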

    Cybergis-enabled remote sensing data analytics for deep learning of landscape patterns and dynamics

    Mapping landscape patterns and dynamics is essential to various scientific domains and many practical applications. The availability of large-scale and high-resolution light detection and ranging (LiDAR) remote sensing data provides tremendous opportunities to unveil complex landscape patterns and better understand landscape dynamics from a 3D perspective. LiDAR data have been applied to diverse remote sensing applications, among which large-scale landscape mapping is one of the most important topics. While researchers have used LiDAR to understand landscape patterns and dynamics in many fields, fully reaping the benefits and potential of LiDAR increasingly depends on advanced cyberGIS and deep learning approaches. In this context, the central goal of this dissertation is to develop a suite of innovative cyberGIS-enabled deep-learning frameworks for combining LiDAR and optical remote sensing data to analyze landscape patterns and dynamics, organized as four interrelated studies. The first study demonstrates a high-accuracy land-cover mapping method that integrates 3D information from LiDAR with multi-temporal remote sensing data using a 3D deep-learning model. The second study combines a point-based classification algorithm and an object-oriented change detection strategy for urban building change detection using deep learning. The third study develops a deep learning model for accurate hydrological streamline detection using LiDAR, paving a new way of harnessing LiDAR data to map landscape patterns and dynamics at unprecedented computational and spatiotemporal scales. The fourth study resolves computational challenges in handling remote sensing big data and deep learning of landscape feature extraction and classification through a cutting-edge cyberGIS approach.
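
    As a minimal sketch of the data-fusion idea behind the first study, the following stacks a rasterized LiDAR channel with optical bands as input channels of a small per-pixel classification network (PyTorch). Band counts, class count, and the architecture are illustrative assumptions, not the dissertation's models.

```python
import torch
import torch.nn as nn

N_OPTICAL, N_LIDAR, N_CLASSES = 4, 1, 6   # e.g. R,G,B,NIR + height; 6 classes

class FusionSegNet(nn.Module):
    """Tiny per-pixel land-cover classifier fusing optical and LiDAR rasters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(N_OPTICAL + N_LIDAR, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, N_CLASSES, 1),    # per-pixel class logits
        )

    def forward(self, optical, lidar):
        # Fuse by stacking along the channel dimension.
        return self.net(torch.cat([optical, lidar], dim=1))

model = FusionSegNet()
optical = torch.rand(2, N_OPTICAL, 64, 64)  # multispectral bands
lidar = torch.rand(2, N_LIDAR, 64, 64)      # rasterized LiDAR height
print(model(optical, lidar).shape)          # -> torch.Size([2, 6, 64, 64])
```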

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]


    A collaborative monocular visual simultaneous localization and mapping solution to generate a semi-dense 3D map

    The utilization and generation of indoor maps are critical for accurate indoor tracking. Simultaneous Localization and Mapping (SLAM) is one of the main techniques used for such map generation. In SLAM, an agent generates a map of an unknown environment while approximating its own location within it. The prevalence and affordability of cameras encourage the use of monocular visual SLAM, in which a camera is the only sensing device for the SLAM process. In modern applications, multiple mobile agents may be involved in the generation of indoor maps, requiring a distributed computational framework. Each agent generates its own local map, which can then be combined with those of other agents into a map covering a larger area. In doing so, the agents cover a given environment faster than a single agent could. Furthermore, they can interact with each other in the same environment, making this framework more practical, especially for collaborative applications such as augmented reality. One of the main challenges of collaborative SLAM is identifying overlapping maps, especially when the relative starting positions of the agents are unknown. We propose a system comprising multiple monocular agents with unknown relative starting positions that generates a semi-dense global map of the environment.
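
    Because monocular maps are defined only up to scale, merging two agents' maps requires estimating a similarity (Sim(3)) transform once overlap is identified. Below is a minimal sketch using the closed-form Umeyama alignment on matched landmarks from the overlapping region; how the proposed system actually detects overlap and rejects outliers is not shown.

```python
import numpy as np

def align_sim3(P, Q):
    """Umeyama alignment with scale: find s, R, t with s*R@P[i]+t ~= Q[i].

    P, Q are Nx3 matched landmark positions from the overlapping region of
    two monocular maps (found e.g. via place recognition). A similarity
    rather than a rigid transform is needed because each monocular map has
    its own arbitrary scale.
    """
    p0, q0 = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - p0, Q - q0
    H = Pc.T @ Qc / len(P)                           # cross-covariance
    U, S, Vt = np.linalg.svd(H)
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    s = np.trace(np.diag(S) @ D) / (Pc ** 2).sum() * len(P)  # scale factor
    return s, R, q0 - s * R @ p0

# Example: agent B's map is agent A's map rotated, scaled by 2, and shifted.
rng = np.random.default_rng(2)
P = rng.normal(size=(6, 3))
c, si = np.cos(0.5), np.sin(0.5)
R_true = np.array([[c, -si, 0], [si, c, 0], [0, 0, 1]])
Q = 2.0 * P @ R_true.T + np.array([1.0, 2.0, 3.0])
s, R, t = align_sim3(P, Q)
print(round(s, 3), np.allclose(R, R_true))   # -> 2.0 True
```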