155 research outputs found

    MOZARD: Multi-Modal Localization for Autonomous Vehicles in Urban Outdoor Environments

    Visually poor scenarios are one of the main sources of failure in visual localization systems in outdoor environments. To address this challenge, we present MOZARD, a multi-modal localization system for urban outdoor environments using vision and LiDAR. By extending our preexisting key-point based visual multi-session local localization approach with the use of semantic data, an improved localization recall can be achieved across vastly different appearance conditions. In particular, we focus on the use of curbstone information because of its broad distribution and reliability within urban environments. We present thorough experimental evaluations on several driving kilometers in challenging urban outdoor environments, analyze the recall and accuracy of our localization system and demonstrate in a case study possible failure cases of each subsystem. We demonstrate that MOZARD is able to bridge scenarios where our previous work VIZARD fails, hence yielding an increased recall performance, while a similar localization accuracy of 0.2 m is achieved.

    V2HDM-Mono: A Framework of Building a Marking-Level HD Map with One or More Monocular Cameras

    Marking-level high-definition maps (HD maps) are of great significance for autonomous vehicles, especially in large-scale, appearance-changing scenarios where autonomous vehicles rely on markings for localization and lanes for safe driving. In this paper, we propose a highly feasible framework for automatically building a marking-level HD map using a simple sensor setup (one or more monocular cameras). We optimize the position of the marking corners to fit the result of marking segmentation and simultaneously optimize the inverse perspective mapping (IPM) matrix of the corresponding camera to obtain an accurate transformation from the front view image to the bird's-eye view (BEV). In the quantitative evaluation, the built HD map almost attains centimeter-level accuracy. The accuracy of the optimized IPM matrix is similar to that of manual calibration. The method can also be generalized to build HD maps in a broader sense by increasing the types of recognizable markings.

    Vehicle Distance Detection Using Monocular Vision and Machine Learning

    With the development of new cutting-edge technology, autonomous vehicles (AVs) have become the main topic in most of the automotive industry. For an AV to be used safely on public roads, it needs to perceive its surrounding environment and make decisions in real time. A perfect AV for general public use does not yet exist, but advanced driver assistance systems (ADAS) have already been integrated into everyday vehicles. It is predicted that these systems will evolve to work together and form the fully autonomous vehicle of the future. The main focus of this thesis is the combination of ADAS with artificial intelligence (AI) models. Since neural networks (NNs) can be unpredictable on many occasions, the main aspect of this thesis is researching which neural network architecture is most accurate in perceiving the distance between vehicles. Hence, the integration of ADAS with AI, and the question of whether AI can safely be used as a central processor for an AV, need to be studied. The ADAS created in this thesis mainly relies on monocular vision and machine learning. A dataset of 200,000 images was used to train a neural network (NN) model, which detects whether an image is a license plate or not with 96.75% accuracy. A sliding window checks whether a sub-section of an image is a license plate; if it is, the algorithm stores that sub-section image. The sub-images are run through a heatmap threshold to help minimize false detections. Upon detecting the license plate, the final algorithm determines the distance of the vehicle to which the detected license plate belongs and outputs the data to the user. This process achieves results with up to a 1-meter distance accuracy. This ADAS is intended to be usable by the public and easily integrated into future AV systems.

    Robust ego-localization using monocular visual odometry

    Near-field Perception for Low-Speed Vehicle Automation using Surround-view Fisheye Cameras

    Cameras are the primary sensor in automated driving systems. They provide high information density and are optimal for detecting road infrastructure cues laid out for human vision. Surround-view camera systems typically comprise four fisheye cameras with a 190°+ field of view, covering the entire 360° around the vehicle and focused on near-field sensing. They are the principal sensors for low-speed, high-accuracy, and close-range sensing applications, such as automated parking, traffic jam assistance, and low-speed emergency braking. In this work, we provide a detailed survey of such vision systems, setting up the survey in the context of an architecture that can be decomposed into four modular components, namely Recognition, Reconstruction, Relocalization, and Reorganization. We jointly call this the 4R Architecture. We discuss how each component accomplishes a specific aspect and provide a positional argument that they can be synergized to form a complete perception system for low-speed automation. We support this argument by presenting results from previous works and by presenting architecture proposals for such a system. Qualitative results are presented in the video at https://youtu.be/ae8bCOF77uY. Comment: Accepted for publication at IEEE Transactions on Intelligent Transportation Systems

    Simultaneous Localization and Mapping (SLAM) for Autonomous Driving: Concept and Analysis

    The Simultaneous Localization and Mapping (SLAM) technique has achieved astonishing progress over the last few decades and has generated considerable interest in the autonomous driving community. With its conceptual roots in navigation and mapping, SLAM outperforms some traditional positioning and localization techniques since it can support more reliable and robust localization, planning, and controlling to meet some key criteria for autonomous driving. In this study the authors first give an overview of the different SLAM implementation approaches and then discuss the applications of SLAM for autonomous driving with respect to different driving scenarios, vehicle system components and the characteristics of the SLAM approaches. The authors then discuss some challenging issues and current solutions when applying SLAM for autonomous driving. Some quantitative quality analysis methods for evaluating the characteristics and performance of SLAM systems and for monitoring the risk in SLAM estimation are reviewed. In addition, this study describes a real-world road test to demonstrate a multi-sensor-based modernized SLAM procedure for autonomous driving. The numerical results show that a high-precision 3D point cloud map can be generated by the SLAM procedure with the integration of Lidar and GNSS/INS. An online localization solution with four to five cm accuracy can be achieved based on this pre-generated map and online Lidar scan matching with a tightly fused inertial system.

    BEV-Locator: An End-to-end Visual Semantic Localization Network Using Multi-View Images

    Accurate localization ability is fundamental in autonomous driving. Traditional visual localization frameworks approach the semantic map-matching problem with geometric models, which rely on complex parameter tuning and thus hinder large-scale deployment. In this paper, we propose BEV-Locator: an end-to-end visual semantic localization neural network using multi-view camera images. Specifically, a visual BEV (Birds-Eye-View) encoder extracts and flattens the multi-view images into BEV space, while the semantic map features are structurally embedded as a sequence of map queries. Then a cross-model transformer associates the BEV features and semantic map queries. The localization information of the ego-car is recursively queried out by cross-attention modules. Finally, the ego pose can be inferred by decoding the transformer outputs. We evaluate the proposed method on the large-scale nuScenes and Qcraft datasets. The experimental results show that BEV-Locator is capable of estimating the vehicle poses under versatile scenarios and effectively associates the cross-model information from multi-view images and global semantic maps. The experiments report satisfactory accuracy, with mean absolute errors of 0.052 m, 0.135 m and 0.251° in lateral translation, longitudinal translation and heading angle, respectively.