Overview of Environment Perception for Intelligent Vehicles
This paper presents a comprehensive literature review of environment perception for intelligent vehicles. The state-of-the-art algorithms and modeling methods for intelligent vehicles are surveyed, with a summary of their pros and cons. Special attention is paid to methods for lane and road detection, traffic sign recognition, vehicle tracking, behavior analysis, and scene understanding. In addition, we provide information about datasets, common performance analyses, and perspectives on future research directions in this area.
Perception and intelligent localization for autonomous driving
Mestrado em Engenharia de Computadores e Telemática. Computer vision and sensor fusion are relatively recent subjects, yet they are widely adopted in the development of autonomous robots that require
adaptability to their surrounding environment. This thesis approaches both topics in order to achieve perception in the context of autonomous driving.
The use of cameras to achieve this goal is a rather complex process. Unlike classic sensors, which always provide the same type of precise information obtained deterministically, the successive images acquired by a camera are filled with highly varied information, all of it ambiguous and extremely difficult to extract. The use of cameras as a robotic sensing modality is the closest we have come to the system of greatest importance in human perception, the visual system. Computer vision is a scientific discipline that encompasses areas such as signal processing, artificial intelligence, mathematics, control theory, neurobiology, and physics.
The platform supporting the study developed in this thesis is ROTA (RObô Triciclo Autónomo), together with all the elements that make up its environment. In this context, we describe the approaches introduced in the platform to solve the challenges the robot faces in its environment: detection of road lane markings and the resulting lane perception, and detection of obstacles, traffic lights, the crosswalk zone, and the roadworks zone. We also describe a calibration system and the implementation of image-perspective removal, developed to map the perceived elements to real-world distances. Building on the perception system, we further address self-localization integrated into a distributed architecture that allows navigation with long-term planning.
All the work developed in the course of this thesis is essentially centered on robotic perception in the context of autonomous driving.
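The perspective-removal step mentioned above can be illustrated with a plane-to-plane homography. The sketch below is not the thesis's actual calibration code; it assumes at least four known image-to-ground point correspondences and uses the standard direct linear transform (DLT) estimate:

```python
import numpy as np

def homography_from_points(img_pts, ground_pts):
    """Estimate the 3x3 homography H mapping image pixels (u, v) to
    ground-plane coordinates (x, y) from >= 4 correspondences via DLT."""
    A = []
    for (u, v), (x, y) in zip(img_pts, ground_pts):
        A.append([-u, -v, -1, 0, 0, 0, u * x, v * x, x])
        A.append([0, 0, 0, -u, -v, -1, u * y, v * y, y])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def image_to_ground(H, u, v):
    """Project a single pixel onto the ground plane (perspective removal)."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

In practice the correspondences would come from a calibration pattern laid on the road surface in front of the robot, measured in metres.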
Online Monocular Lane Mapping Using Catmull-Rom Spline
In this study, we introduce an online monocular lane mapping approach that relies solely on a single camera and odometry for generating spline-based maps. Our proposed technique models the lane association process as an assignment problem on a bipartite graph, and assigns weights to the edges by incorporating the Chamfer distance, pose uncertainty, and lateral sequence consistency. Furthermore, we meticulously design control point initialization, spline parameterization, and optimization to progressively create, expand, and refine splines. In contrast to prior research that assessed performance using self-constructed datasets, our experiments are conducted on the openly accessible OpenLane dataset. The experimental results show that our proposed approach improves lane association and odometry precision, as well as overall lane map quality. We have open-sourced our code for this project.
Comment: Accepted by IROS202
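As a sketch of the spline model this abstract refers to (a generic uniform Catmull-Rom evaluation, not the authors' released code), a lane segment between two control points can be evaluated and densely sampled as follows:

```python
import numpy as np

def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate the uniform Catmull-Rom segment between p1 and p2 at
    t in [0, 1]; p0 and p3 shape the end tangents. The curve
    interpolates its control points: C(0) = p1 and C(1) = p2."""
    p0, p1, p2, p3 = map(np.asarray, (p0, p1, p2, p3))
    return 0.5 * ((2 * p1)
                  + (-p0 + p2) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3)

def sample_polyline(ctrl, samples_per_seg=10):
    """Densely sample a lane curve through the interior control points
    of a control polyline (e.g. for map refinement or visualization)."""
    pts = []
    for i in range(1, len(ctrl) - 2):
        for t in np.linspace(0.0, 1.0, samples_per_seg, endpoint=False):
            pts.append(catmull_rom(ctrl[i - 1], ctrl[i], ctrl[i + 1], ctrl[i + 2], t))
    pts.append(np.asarray(ctrl[-2], dtype=float))
    return np.array(pts)
```

Because Catmull-Rom splines interpolate their control points, inserting or adjusting a single control point changes the curve only locally, which is convenient when a map is extended incrementally.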
EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps
Accurate and reliable ego-localization is critical for autonomous driving. In
this paper, we present EgoVM, an end-to-end localization network that achieves
comparable localization accuracy to prior state-of-the-art methods, but uses
lightweight vectorized maps instead of heavy point-based maps. To begin with,
we extract BEV features from online multi-view images and LiDAR point cloud.
Then, we employ a set of learnable semantic embeddings to encode the semantic
types of map elements and supervise them with semantic segmentation, to make
their feature representation consistent with BEV features. After that, we feed
map queries, composed of learnable semantic embeddings and coordinates of map
elements, into a transformer decoder to perform cross-modality matching with
BEV features. Finally, we adopt a robust histogram-based pose solver to
estimate the optimal pose by searching exhaustively over candidate poses. We
comprehensively validate the effectiveness of our method using both the
nuScenes dataset and a newly collected dataset. The experimental results show
that our method achieves centimeter-level localization accuracy, and
outperforms existing methods using vectorized maps by a large margin.
Furthermore, our model has been extensively tested in a large fleet of
autonomous vehicles under various challenging urban scenes.
Comment: 8 page
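The exhaustive search over candidate poses can be illustrated with a toy grid-scoring version. This is a simplification of the paper's histogram-based solver, not its implementation; the score grid, resolution, and offset ranges below are invented for illustration:

```python
import numpy as np

def pose_search(score_grid, map_pts, dx_range, dy_range, dyaw_range, res=0.1):
    """Exhaustively score candidate (dx, dy, dyaw) offsets by summing a
    BEV matching-score grid at the transformed map-element coordinates,
    and return the best-scoring offset and its score."""
    h, w = score_grid.shape
    best, best_score = None, -np.inf
    for dyaw in dyaw_range:
        c, s = np.cos(dyaw), np.sin(dyaw)
        rot = map_pts @ np.array([[c, -s], [s, c]]).T  # rotate all points
        for dx in dx_range:
            for dy in dy_range:
                # Grid indices of each transformed map point.
                cols = np.round((rot[:, 0] + dx) / res).astype(int)
                rows = np.round((rot[:, 1] + dy) / res).astype(int)
                ok = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
                score = score_grid[rows[ok], cols[ok]].sum()
                if score > best_score:
                    best, best_score = (dx, dy, dyaw), score
    return best, best_score
```

The decoupling in the real method, where per-candidate scores form a histogram from which a smoothed optimum and its uncertainty can be read off, follows the same brute-force structure.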