695 research outputs found

    Exploiting Sparse Semantic HD Maps for Self-Driving Vehicle Localization

    In this paper we propose a novel semantic localization algorithm that exploits multiple sensors and has precision on the order of a few centimeters. Our approach does not require detailed knowledge about the appearance of the world, and our maps require orders of magnitude less storage than maps utilized by traditional geometry- and LiDAR intensity-based localizers. This is important as self-driving cars need to operate in large environments. Towards this goal, we formulate the problem in a Bayesian filtering framework, and exploit lanes, traffic signs, as well as vehicle dynamics to localize robustly with respect to a sparse semantic map. We validate the effectiveness of our method on a new highway dataset consisting of 312 km of roads. Our experiments show that the proposed approach is able to achieve 0.05 m lateral accuracy and 1.12 m longitudinal accuracy on average while taking up only 0.3% of the storage required by previous LiDAR intensity-based approaches. Comment: 8 pages, 4 figures, 4 tables, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019).
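The Bayesian filtering formulation over a sparse semantic map can be illustrated with a minimal histogram-filter sketch over candidate (lateral, longitudinal) offsets. The grid resolution, the Gaussian motion-noise model, and the `observation_likelihood` callback below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

class HistogramLocalizer:
    """Minimal Bayes filter over a grid of (lateral, longitudinal) offsets
    relative to a sparse semantic map (illustrative sketch only)."""

    def __init__(self, lat_range=2.0, lon_range=10.0, res=0.05):
        self.lat = np.arange(-lat_range, lat_range, res)  # candidate lateral offsets [m]
        self.lon = np.arange(-lon_range, lon_range, res)  # candidate longitudinal offsets [m]
        self.res = res
        # Uniform prior over all candidate offsets.
        self.belief = np.ones((len(self.lat), len(self.lon)))
        self.belief /= self.belief.sum()

    def predict(self, d_lat, d_lon, sigma=0.1):
        """Motion update from vehicle dynamics: shift the belief, then blur by process noise."""
        shift = (int(round(d_lat / self.res)), int(round(d_lon / self.res)))
        self.belief = np.roll(self.belief, shift, axis=(0, 1))
        self.belief = gaussian_filter(self.belief, sigma=sigma / self.res)
        self.belief /= self.belief.sum()

    def update(self, observation_likelihood):
        """Measurement update: observation_likelihood(lat, lon) scores how well
        detected lanes / traffic signs align with the map at each candidate offset."""
        like = np.array([[observation_likelihood(a, o) for o in self.lon] for a in self.lat])
        self.belief *= like
        self.belief /= self.belief.sum()

    def estimate(self):
        """Return the most likely (lateral, longitudinal) offset."""
        i, j = np.unravel_index(np.argmax(self.belief), self.belief.shape)
        return self.lat[i], self.lon[j]
```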

    EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps

    Accurate and reliable ego-localization is critical for autonomous driving. In this paper, we present EgoVM, an end-to-end localization network that achieves comparable localization accuracy to prior state-of-the-art methods, but uses lightweight vectorized maps instead of heavy point-based maps. To begin with, we extract BEV features from online multi-view images and a LiDAR point cloud. Then, we employ a set of learnable semantic embeddings to encode the semantic types of map elements and supervise them with semantic segmentation, to make their feature representation consistent with the BEV features. After that, we feed map queries, composed of the learnable semantic embeddings and the coordinates of map elements, into a transformer decoder to perform cross-modality matching with the BEV features. Finally, we adopt a robust histogram-based pose solver to estimate the optimal pose by searching exhaustively over candidate poses. We comprehensively validate the effectiveness of our method using both the nuScenes dataset and a newly collected dataset. The experimental results show that our method achieves centimeter-level localization accuracy and outperforms existing methods using vectorized maps by a large margin. Furthermore, our model has been extensively tested in a large fleet of autonomous vehicles under various challenging urban scenes. Comment: 8 pages.
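The histogram-based pose solver amounts to scoring every candidate pose offset exhaustively and reading out the best one. Below is a hedged sketch of that idea; the `matching_score` callback, the search ranges, and the soft-argmax readout are assumptions for illustration rather than the paper's exact formulation.

```python
import numpy as np

def solve_pose_histogram(matching_score, dx_range, dy_range, dyaw_range):
    """Exhaustive search over candidate pose offsets (dx, dy, dyaw).

    matching_score(dx, dy, dyaw) -> float measures how well map queries align
    with BEV features when the map is shifted/rotated by that offset
    (an illustrative stand-in for the learned cross-modality matching).
    """
    # Build a 3D score volume ("histogram") over the candidate poses.
    scores = np.empty((len(dx_range), len(dy_range), len(dyaw_range)))
    for i, dx in enumerate(dx_range):
        for j, dy in enumerate(dy_range):
            for k, dyaw in enumerate(dyaw_range):
                scores[i, j, k] = matching_score(dx, dy, dyaw)

    # Soft-argmax over the normalized volume gives a sub-cell pose estimate.
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    dx_hat = float(np.sum(probs.sum(axis=(1, 2)) * dx_range))
    dy_hat = float(np.sum(probs.sum(axis=(0, 2)) * dy_range))
    dyaw_hat = float(np.sum(probs.sum(axis=(0, 1)) * dyaw_range))
    return dx_hat, dy_hat, dyaw_hat

# Example search grid (hypothetical resolutions):
# solve_pose_histogram(score_fn,
#                      np.arange(-1.0, 1.05, 0.05),
#                      np.arange(-1.0, 1.05, 0.05),
#                      np.deg2rad(np.arange(-2.0, 2.1, 0.1)))
```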

    Semantic 3D Grid Maps for Autonomous Driving

    Maps play a key role in the rapidly developing area of autonomous driving. We survey the literature for different map representations and find that, while the world is three-dimensional, it is common to rely on 2D map representations in order to meet real-time constraints. We believe that high levels of situation awareness require a 3D representation as well as the inclusion of semantic information. We demonstrate that our recently presented hierarchical 3D grid mapping framework UFOMap meets the real-time constraints. Furthermore, we show how it can be used to efficiently support more complex functions such as calculating the occluded parts of space and accumulating the output from a semantic segmentation network. Comment: Submitted, accepted, and presented at the 25th IEEE International Conference on Intelligent Transportation Systems (IEEE ITSC 2022).
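Accumulating the output of a semantic segmentation network into a 3D grid can be sketched roughly as follows; the flat voxel grid, voxel size, and majority-vote label fusion are simplifying assumptions and do not reproduce UFOMap's hierarchical (octree-like) structure.

```python
import numpy as np
from collections import defaultdict

class SemanticVoxelGrid:
    """Flat 3D grid that fuses per-point semantic labels per voxel by vote
    counting (a simplified stand-in for a hierarchical map such as UFOMap)."""

    def __init__(self, voxel_size=0.2, num_classes=20):
        self.voxel_size = voxel_size
        self.counts = defaultdict(lambda: np.zeros(num_classes, dtype=np.int64))

    def integrate(self, points, labels):
        """points: (N, 3) array in the map frame; labels: (N,) class ids
        produced by a semantic segmentation network."""
        keys = np.floor(points / self.voxel_size).astype(np.int64)
        for key, label in zip(map(tuple, keys), labels):
            self.counts[key][label] += 1

    def semantic_class(self, point):
        """Most frequently observed class at the voxel containing `point`, or None."""
        key = tuple(np.floor(np.asarray(point) / self.voxel_size).astype(np.int64))
        votes = self.counts.get(key)
        return None if votes is None else int(np.argmax(votes))
```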

    BEV-Locator: An End-to-end Visual Semantic Localization Network Using Multi-View Images

    Accurate localization ability is fundamental in autonomous driving. Traditional visual localization frameworks approach the semantic map-matching problem with geometric models, which rely on complex parameter tuning and thus hinder large-scale deployment. In this paper, we propose BEV-Locator: an end-to-end visual semantic localization neural network using multi-view camera images. Specifically, a visual BEV (Bird's-Eye-View) encoder extracts and flattens the multi-view images into the BEV space, while the semantic map features are structurally embedded as a sequence of map queries. A cross-modal transformer then associates the BEV features and the semantic map queries, and the localization information of the ego-car is recursively queried out by cross-attention modules. Finally, the ego pose can be inferred by decoding the transformer outputs. We evaluate the proposed method on the large-scale nuScenes and Qcraft datasets. The experimental results show that BEV-Locator is capable of estimating the vehicle pose under versatile scenarios, effectively associating the cross-modal information from multi-view images and global semantic maps. The experiments report satisfactory accuracy, with mean absolute errors of 0.052 m, 0.135 m, and 0.251° in lateral translation, longitudinal translation, and heading angle, respectively.
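The cross-attention between map queries and BEV features, followed by pose decoding, can be outlined with a minimal PyTorch sketch; the layer sizes, the number of decoder layers, the plain learnable queries, and the pooled pose head are assumptions for illustration, not BEV-Locator's published architecture.

```python
import torch
import torch.nn as nn

class PoseQueryDecoder(nn.Module):
    """Minimal sketch: map queries attend to BEV features via a transformer
    decoder, and a small head regresses the ego-pose offset (dx, dy, dyaw)."""

    def __init__(self, d_model=256, n_heads=8, n_queries=100):
        super().__init__()
        self.query_embed = nn.Embedding(n_queries, d_model)  # learnable map queries (assumption)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.pose_head = nn.Linear(d_model, 3)  # -> (dx, dy, dyaw)

    def forward(self, bev_features):
        """bev_features: (B, H*W, d_model) flattened BEV feature map."""
        b = bev_features.size(0)
        queries = self.query_embed.weight.unsqueeze(0).expand(b, -1, -1)
        decoded = self.decoder(tgt=queries, memory=bev_features)  # cross-attention to BEV
        return self.pose_head(decoded.mean(dim=1))                # pooled pose estimate

# Usage with a dummy BEV feature map:
# model = PoseQueryDecoder()
# pose = model(torch.randn(2, 50 * 50, 256))  # -> tensor of shape (2, 3)
```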