Exploiting Sparse Semantic HD Maps for Self-Driving Vehicle Localization
In this paper we propose a novel semantic localization algorithm that
exploits multiple sensors and has precision on the order of a few centimeters.
Our approach does not require detailed knowledge about the appearance of the
world, and our maps require orders of magnitude less storage than maps utilized
by traditional geometry- and LiDAR intensity-based localizers. This is
important as self-driving cars need to operate in large environments. Towards
this goal, we formulate the problem in a Bayesian filtering framework, and
exploit lanes, traffic signs, as well as vehicle dynamics to localize robustly
with respect to a sparse semantic map. We validate the effectiveness of our
method on a new highway dataset consisting of 312 km of roads. Our experiments
show that the proposed approach is able to achieve 0.05 m lateral accuracy and
1.12 m longitudinal accuracy on average while taking up only 0.3% of the storage
required by previous LiDAR intensity-based approaches.
Comment: 8 pages, 4 figures, 4 tables, 2019 IEEE/RSJ International Conference
on Intelligent Robots and Systems (IROS 2019).
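The Bayesian filtering formulation described above can be sketched as a 1-D histogram filter over the vehicle's lateral offset. This is an illustrative toy, not the paper's implementation: the grid resolution, Gaussian motion noise, and Gaussian lane-observation likelihood are all assumptions made for the sketch.

```python
import numpy as np

def motion_update(belief, offsets, shift, noise_std):
    """Prediction step: shift the belief by odometry, then blur with motion noise."""
    # Evaluate the old belief at the back-shifted grid (moves the peak by +shift).
    pred = np.interp(offsets - shift, offsets, belief, left=0.0, right=0.0)
    # Convolve with a zero-mean Gaussian kernel to model vehicle-dynamics noise.
    kernel = np.exp(-0.5 * (offsets - offsets.mean()) ** 2 / noise_std ** 2)
    pred = np.convolve(pred, kernel / kernel.sum(), mode="same")
    return pred / pred.sum()

def measurement_update(belief, offsets, observed_offset, meas_std):
    """Correction step: weight each candidate offset by a lane-observation likelihood."""
    likelihood = np.exp(-0.5 * (offsets - observed_offset) ** 2 / meas_std ** 2)
    posterior = belief * likelihood
    return posterior / posterior.sum()
```

In the paper's full formulation the measurement model would also score traffic-sign detections against the sparse semantic map; the same multiply-and-normalize update applies per observation type.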
EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps
Accurate and reliable ego-localization is critical for autonomous driving. In
this paper, we present EgoVM, an end-to-end localization network that achieves
comparable localization accuracy to prior state-of-the-art methods, but uses
lightweight vectorized maps instead of heavy point-based maps. To begin with,
we extract BEV features from online multi-view images and a LiDAR point cloud.
Then, we employ a set of learnable semantic embeddings to encode the semantic
types of map elements and supervise them with semantic segmentation, to make
their feature representation consistent with BEV features. After that, we feed
map queries, composed of learnable semantic embeddings and coordinates of map
elements, into a transformer decoder to perform cross-modality matching with
BEV features. Finally, we adopt a robust histogram-based pose solver to
estimate the optimal pose by searching exhaustively over candidate poses. We
comprehensively validate the effectiveness of our method using both the
nuScenes dataset and a newly collected dataset. The experimental results show
that our method achieves centimeter-level localization accuracy, and
outperforms existing methods using vectorized maps by a large margin.
Furthermore, our model has been extensively tested in a large fleet of
autonomous vehicles under various challenging urban scenes.
Comment: 8 pages.
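The histogram-based pose solver above estimates the optimal pose by exhaustive search over candidate poses. The following is a minimal 2-D sketch under assumed simplifications: it scores each candidate translation/yaw with a hypothetical Chamfer-style point-matching cost, whereas EgoVM scores matches between map queries and BEV features.

```python
import numpy as np
from itertools import product

def solve_pose_exhaustive(map_pts, obs_pts, dxs, dys, dyaws):
    """Score every candidate (dx, dy, dyaw) on a grid and return the best pose."""
    best_cost, best_pose = np.inf, None
    for dx, dy, dyaw in product(dxs, dys, dyaws):
        c, s = np.cos(dyaw), np.sin(dyaw)
        R = np.array([[c, -s], [s, c]])
        moved = obs_pts @ R.T + np.array([dx, dy])
        # Chamfer-style cost: each observation to its nearest map point.
        dists = np.linalg.norm(moved[:, None, :] - map_pts[None, :, :], axis=-1)
        cost = dists.min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_pose = cost, (dx, dy, dyaw)
    return best_pose
```

Because every candidate is scored, the search is robust to local minima that would trap a gradient-based solver, at the cost of evaluating the full candidate grid.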
Semantic 3D Grid Maps for Autonomous Driving
Maps play a key role in the rapidly developing area of autonomous driving. We
survey the literature for different map representations and find that while the
world is three-dimensional, it is common to rely on 2D map representations in
order to meet real-time constraints. We believe that high levels of situation
awareness require a 3D representation as well as the inclusion of semantic
information. We demonstrate that our recently presented hierarchical 3D grid
mapping framework UFOMap meets the real-time constraints. Furthermore, we show
how it can be used to efficiently support more complex functions such as
calculating the occluded parts of space and accumulating the output from a
semantic segmentation network.
Comment: Submitted, accepted and presented at the 25th IEEE International
Conference on Intelligent Transportation Systems (IEEE ITSC 2022).
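Accumulating semantic segmentation output in a 3D grid, as described above, can be sketched with a toy dict-backed voxel map that keeps per-voxel class counts. This is an illustrative stand-in, not UFOMap's actual hierarchical octree implementation or API.

```python
import numpy as np
from collections import defaultdict

def voxel_key(point, resolution):
    """Map a 3D point to the integer index of its containing voxel."""
    return tuple(np.floor(np.asarray(point) / resolution).astype(int))

class SemanticVoxelGrid:
    """Accumulate per-voxel class counts from repeated semantic observations."""

    def __init__(self, resolution, num_classes):
        self.resolution = resolution
        self.counts = defaultdict(lambda: np.zeros(num_classes))

    def insert(self, points, labels):
        # Each labeled point casts one vote for its class in its voxel.
        for point, label in zip(points, labels):
            self.counts[voxel_key(point, self.resolution)][label] += 1

    def label(self, point):
        # Majority vote over accumulated observations; None if voxel is unseen.
        key = voxel_key(point, self.resolution)
        return int(np.argmax(self.counts[key])) if key in self.counts else None
```

A hierarchical structure like UFOMap replaces the flat dict with an octree so that free, occupied, and unknown space can be queried at multiple resolutions in real time.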
BEV-Locator: An End-to-end Visual Semantic Localization Network Using Multi-View Images
Accurate localization ability is fundamental in autonomous driving.
Traditional visual localization frameworks approach the semantic map-matching
problem with geometric models, which rely on complex parameter tuning and thus
hinder large-scale deployment. In this paper, we propose BEV-Locator: an
end-to-end visual semantic localization neural network using multi-view camera
images. Specifically, a visual BEV (Birds-Eye-View) encoder extracts and
flattens the multi-view images into BEV space, while the semantic map features
are structurally embedded as a sequence of map queries. A cross-modal
transformer then associates the BEV features with the semantic map queries. The
localization information of ego-car is recursively queried out by
cross-attention modules. Finally, the ego pose can be inferred by decoding the
transformer outputs. We evaluate the proposed method in large-scale nuScenes
and Qcraft datasets. The experimental results show that BEV-Locator is
capable of estimating vehicle poses under versatile scenarios, effectively
associating cross-modal information from multi-view images and global
semantic maps. The experiments report satisfactory accuracy, with mean
absolute errors of 0.052 m and 0.135 m in lateral and longitudinal
translation, respectively, and 0.251° in heading angle.
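The cross-attention step above, in which map queries attend over BEV features, can be sketched as single-head attention. This is a simplified illustration with assumed shapes and no learned projection matrices, multi-head splitting, or positional encodings, all of which a real transformer decoder would include.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Each map query forms a weighted sum over (flattened) BEV feature cells."""
    d = queries.shape[-1]
    attn = softmax(queries @ keys.T / np.sqrt(d))  # (num_queries, num_cells)
    return attn @ values                           # (num_queries, d)
```

In BEV-Locator the queries carry semantic map elements, the keys and values come from the flattened BEV feature map, and the attended outputs are decoded into the ego pose.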