139 research outputs found
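Urban environment perception and navigation using robotic vision: design and implementation applied to an autonomous vehicle (original title: Percepção do ambiente urbano e navegação usando visão robótica: concepção e implementação aplicado à veículo autônomo)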
Advisors: Janito Vaqueiro Ferreira, Alessandro Corrêa Victorino. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Mecânica.
Abstract: The development of autonomous vehicles capable of getting around on urban roads can provide important benefits: fewer accidents, greater comfort, and cost savings. Intelligent vehicles often base their decisions on observations obtained from various sensors such as LIDAR, GPS, and cameras. Camera sensors currently receive a great deal of attention because they are cheap, easy to deploy, and provide rich data. Inner-city environments represent an interesting but also very challenging scenario in this context: the road layout may be very complex, objects such as trees, bicycles, and cars may cause partial observations, and these observations are often noisy or even missing due to heavy occlusions. The perception process therefore needs to be able to deal with uncertainty in the knowledge of the world around the car. While highway navigation and autonomous driving using prior knowledge of the environment have been demonstrated successfully, understanding and navigating general inner-city scenarios with little prior knowledge remains an unsolved problem. In this thesis, this perception problem is analyzed for driving in inner-city environments, together with the capacity to move safely based on the decision-making process in autonomous navigation. A perception system is designed that allows robotic cars to drive autonomously on roads without adapting the infrastructure, without requiring previous knowledge of the environment, and in the presence of dynamic objects such as cars. A novel method based on machine learning is proposed to extract the semantic context from a pair of stereo images; this context is merged into an evidential occupancy grid that models the uncertainties of an unknown urban environment by applying the Dempster-Shafer theory. For decision-making in path planning, the virtual tentacle approach is applied to generate possible paths starting from the vehicle's reference center, and on this basis two new strategies are proposed. The first is a new strategy for selecting the correct path to better avoid obstacles and follow the local task in the context of hybrid navigation. The second is a new closed-loop control, based on visual odometry and the virtual tentacle, modeled for path-following execution. Finally, a complete automotive system integrating the perception, path-planning, and control modules is implemented and experimentally validated in real situations using an experimental autonomous car; the results show that the developed approach successfully performs safe local navigation based on camera sensors.
Doctorate. Solid Mechanics and Mechanical Design. Doctor of Mechanical Engineering.
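The abstract above names the Dempster-Shafer theory as the basis of its evidential occupancy grid. As a minimal sketch of what fusing two pieces of evidence for a single grid cell can look like, the following applies Dempster's rule of combination over the two-state frame {free, occupied}. The dictionary keys `"F"`, `"O"`, and `"FO"` (ignorance) and the function name `combine_dempster` are illustrative assumptions, not taken from the thesis, which may use a richer frame or a different combination rule.

```python
def combine_dempster(m1, m2):
    """Fuse two mass functions for one grid cell with Dempster's rule.

    Each mass function is a dict with keys "F" (free), "O" (occupied) and
    "FO" (ignorance: free or occupied) whose values sum to 1.
    The key names are hypothetical, chosen for this sketch only.
    """
    # Conflict: one source supports "free" while the other supports "occupied".
    conflict = m1["F"] * m2["O"] + m1["O"] * m2["F"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict

    # Products whose set intersection is {free}, {occupied}, or the full frame.
    free = (m1["F"] * m2["F"] + m1["F"] * m2["FO"] + m1["FO"] * m2["F"]) / norm
    occupied = (m1["O"] * m2["O"] + m1["O"] * m2["FO"] + m1["FO"] * m2["O"]) / norm
    unknown = (m1["FO"] * m2["FO"]) / norm
    return {"F": free, "O": occupied, "FO": unknown}


# Example: one source leans towards "free", the other towards "occupied".
camera = {"F": 0.6, "O": 0.0, "FO": 0.4}
other = {"F": 0.0, "O": 0.5, "FO": 0.5}
fused = combine_dempster(camera, other)
```

The mass left on `"FO"` is what distinguishes this representation from a plain occupancy probability: a cell nobody has observed keeps all its mass on ignorance instead of being reported as 50% occupied.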
Semantic evidential grid mapping using monocular and stereo cameras
Accurately estimating the current state of local traffic scenes is one of the key problems in the development of software components for automated vehicles. In addition to details on free space and drivability, static and dynamic traffic participants and information on the semantics may also be included in the desired representation. Multi-layer grid maps allow the inclusion of all of this information in a common representation. However, most existing grid mapping approaches only process range sensor measurements such as Lidar and Radar and solely model occupancy without semantic states. In order to add sensor redundancy and diversity, it is desirable to include vision-based sensor setups in a common grid map representation. In this work, we present a semantic evidential grid mapping pipeline, including estimates for eight semantic classes, that is designed for straightforward fusion with range sensor data. Unlike other publications, our representation explicitly models uncertainties in the evidential model. We present results of our grid mapping pipeline based on a monocular vision setup and a stereo vision setup. Our mapping results are accurate and dense due to the incorporation of a disparity- or depth-based ground surface estimation in the inverse perspective mapping. We conclude this paper by providing a detailed quantitative evaluation based on real traffic scenarios in the KITTI odometry benchmark dataset and demonstrating the advantages compared to other semantic grid mapping approaches.
Exploiting map data for the perception of intelligent vehicles (original title: Exploitation des données cartographiques pour la perception de véhicules intelligents)
Most software controlling intelligent vehicles deals with scene understanding. Many methods currently exist for perceiving obstacles automatically, and the majority of them rely on exteroceptive sensors such as cameras or lidars. This thesis is situated in the domains of robotics and data fusion, and concerns geographic information systems. We study the utility of adding digital maps, which model the urban environment in which the vehicle operates, as a virtual sensor that improves perception results. Indeed, maps contain a very large quantity of information about the environment: its geometry, its topology, and additional contextual information. In this work, we extract road surface geometry and building models in order to deduce the context and the characteristics of each detected object. Our method is based on an extension of occupancy grids: evidential perception grids. It makes it possible to model explicitly the uncertainty related to map and sensor data. The approach also has the advantage of representing data from various sources (lidar, camera, or maps) homogeneously. Maps are handled on equal terms with the physical sensors. This allows us to add geographic information without giving it undue importance, which is essential in the presence of errors. In our approach, the information fusion result, stored in a perception grid, is used to predict the state of the environment at the next time step. Estimating the characteristics of dynamic elements means that the static-world hypothesis no longer holds. It is therefore necessary to adjust the level of certainty attributed to these pieces of information. We do so by applying temporal discounting. Because existing methods are not well suited to this application, we propose a family of discount operators that take into account the type of information handled. The studied algorithms have been validated through tests on real data. To this end, we developed prototypes in Matlab and C++ software based on the Pacpus framework. Thanks to them, we present the results of experiments performed in real conditions.
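The abstract above describes temporal discounting as a way to lower the certainty of older evidence. A minimal sketch of the classical (Shafer) discounting operator on a single cell follows. It is the uniform textbook operator, shown only as a baseline; it is not the family of type-aware operators the thesis proposes. The key names and the parameter `alpha` are assumptions made for this illustration.

```python
def discount(mass, alpha):
    """Classical discounting of a mass function by reliability factor alpha.

    Every focal element is scaled by alpha (0 <= alpha <= 1) and the removed
    mass is transferred to ignorance, stored here under the hypothetical
    key "FO". alpha = 1 keeps the evidence, alpha = 0 forgets it entirely.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    discounted = {key: alpha * value for key, value in mass.items() if key != "FO"}
    discounted["FO"] = 1.0 - alpha + alpha * mass["FO"]
    return discounted


# Example: evidence from the previous time step, trusted at 50%.
previous = {"F": 0.1, "O": 0.7, "FO": 0.2}
aged = discount(previous, alpha=0.5)
```

Applying the operator once per time step makes old evidence decay geometrically towards full ignorance, so a cell that is no longer observed gradually returns to the unknown state.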
Traffic Scene Perception for Automated Driving with Top-View Grid Maps
An automated vehicle must make safe, sensible, and fast decisions based on its environment.
This requires an accurate and computationally efficient model of the traffic environment.
This environment model is meant to fuse and filter measurements from different sensors and provide them to downstream subsystems as compact yet informative data.
This work addresses modeling the traffic scene on the basis of top-view grid maps.
Compared with other environment models, they enable early fusion of range measurements from different sources at low computational cost, as well as explicit modeling of free space.
After presenting a method for ground surface estimation, which forms the basis of the top-view modeling, the work covers methods for occupancy and elevation mapping for grid maps based on multiple noisy, partially contradictory, or missing range measurements.
On the resulting sensor-independent representation, models for detecting traffic participants and for estimating scene flow, odometry, and tracking features are then investigated.
Evaluations on publicly available datasets and on a real vehicle show that top-view grid maps can be estimated with on-board LiDAR sensors and that safety-critical environment information such as observability and drivability can be reliably derived from them.
Finally, traffic participants are determined as oriented bounding boxes with semantic classes, velocities, and tracking features from a joint model for object detection and flow estimation based on the top-view grid maps.
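The abstract above mentions occupancy mapping from multiple noisy, partially contradictory, or missing range measurements. One common way to handle this for a single grid cell is the standard Bayesian log-odds update, sketched below. This is a generic textbook technique and an assumption of this example; the abstract does not state which update rule the work uses. The function names and the convention that `None` marks a missing measurement are hypothetical.

```python
import math


def logit(p):
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def fuse_cell(measurements, prior=0.5):
    """Fuse per-measurement occupancy probabilities for one grid cell.

    Each entry is the occupancy probability implied by one range measurement
    (an inverse sensor model output), or None when the cell was not observed.
    Contradictory measurements cancel out in log-odds space.
    """
    log_odds = logit(prior)
    for p in measurements:
        if p is None:
            continue  # missing measurement: the cell keeps its current belief
        # Add the evidence, subtracting the prior so it is not counted twice.
        log_odds += logit(p) - logit(prior)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Two hits, one contradicting miss, one missing measurement.
occupancy = fuse_cell([0.7, 0.7, 0.3, None])
```

Because the update is a sum, measurements from different sensors can be fused in any order, which fits the early-fusion setting the abstract describes.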
Scene Informer: Anchor-based Occlusion Inference and Trajectory Prediction in Partially Observable Environments
Navigating complex and dynamic environments requires autonomous vehicles
(AVs) to reason about both visible and occluded regions. This involves
predicting the future motion of observed agents, inferring occluded ones, and
modeling their interactions based on vectorized scene representations of the
partially observable environment. However, prior work on occlusion inference
and trajectory prediction has developed in isolation, with the former based on
simplified rasterized methods and the latter assuming full environment
observability. We introduce the Scene Informer, a unified approach for
predicting both observed agent trajectories and inferring occlusions in a
partially observable setting. It uses a transformer to aggregate various input
modalities and facilitate selective queries on occlusions that might intersect
with the AV's planned path. The framework estimates occupancy probabilities and
likely trajectories for occlusions, as well as forecast motion for observed
agents. We explore common observability assumptions in both domains and their
performance impact. Our approach outperforms existing methods in both occupancy
prediction and trajectory prediction in a partially observable setting on the
Waymo Open Motion Dataset.
- …