139 research outputs found

    Percepção do ambiente urbano e navegação usando visão robótica : concepção e implementação aplicado à veículo autônomo

    Advisors: Janito Vaqueiro Ferreira, Alessandro Corrêa Victorino. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Mecânica.
    Abstract: The development of autonomous vehicles able to drive on urban roads can bring important benefits: fewer accidents, better quality of life, and lower costs. Intelligent vehicles often base their decisions on observations from several sensors, such as LIDAR, GPS, and cameras. Camera sensors currently receive a great deal of attention because they are inexpensive, easy to use, and provide information-rich data. Inner-city environments are an interesting but very challenging scenario in this context: the road layout may be very complex, objects such as trees, bicycles, and cars may cause partial observations, and these observations are often noisy or missing altogether because of heavy occlusion. The perception process therefore has to cope with uncertainty in the knowledge of the world around the car. While highway navigation and autonomous driving with prior knowledge of the environment have been demonstrated successfully, understanding and navigating general inner-city scenarios with little prior knowledge remains an unsolved problem. This thesis analyzes the perception problem for driving in inner-city environments, together with the ability to move safely based on the decision-making process in autonomous navigation. A perception system is designed that allows robotic cars to drive autonomously on roads without adapting the infrastructure, without prior knowledge of the environment, and in the presence of dynamic objects such as cars. A novel machine-learning method is proposed to extract the semantic context from a pair of stereo images; this context is merged into an evidential occupancy grid that models the uncertainties of an unknown urban environment by applying Dempster-Shafer theory. For decision-making in path planning, the virtual tentacle approach is applied to generate candidate paths starting from the vehicle's reference center, and two new strategies are proposed on that basis. The first is a new strategy for selecting the path that best avoids obstacles and follows the local task in the context of hybrid navigation. The second is a new closed-loop control, based on visual odometry and the virtual tentacle, modeled for path-following execution. Finally, a complete automotive system integrating the perception, path-planning, and control modules is implemented and experimentally validated in real conditions on an experimental autonomous car; the results show that the developed approach successfully performs safe local navigation based on camera sensors.
    Degree: Doctorate, Solid Mechanics and Mechanical Design; Doctor in Mechanical Engineering.
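The abstract above fuses evidence about an unknown environment with Dempster-Shafer theory. As a minimal sketch of what that fusion looks like for a single grid cell, the block below applies Dempster's rule of combination on a two-element frame (free, occupied). The source names `stereo` and `semantic`, their mass values, and the function name `combine_dempster` are illustrative assumptions; they are not taken from the thesis, which may use a richer frame and a different combination scheme.

```python
from itertools import product

# Frame of discernment for one grid cell: Free (F) or Occupied (O).
# Focal sets are frozensets; the full frame carries the "unknown" mass.
FREE = frozenset("F")
OCC = frozenset("O")
UNKNOWN = frozenset("FO")


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule (normalized conjunctive rule)."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Contradictory evidence (free vs. occupied) counts as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # Renormalize so the remaining masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical evidence for one cell (values chosen for illustration only).
prior = {UNKNOWN: 1.0}                   # nothing known yet
stereo = {OCC: 0.6, UNKNOWN: 0.4}        # assumed geometric evidence
semantic = {FREE: 0.3, UNKNOWN: 0.7}     # assumed semantic evidence

fused = combine_dempster(combine_dempster(prior, stereo), semantic)
for focal_set, mass in fused.items():
    print(sorted(focal_set), round(mass, 3))
```

Keeping an explicit mass on the full frame is what separates this from a Bayesian occupancy grid: a cell that has never been observed stays "unknown" instead of being forced to a probability of 0.5.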

    Semantic evidential grid mapping using monocular and stereo cameras

    Accurately estimating the current state of local traffic scenes is one of the key problems in the development of software components for automated vehicles. In addition to details on free space and drivability, the desired representation may also include static and dynamic traffic participants and semantic information. Multi-layer grid maps allow all of this information to be included in a common representation. However, most existing grid mapping approaches only process range sensor measurements, such as lidar and radar, and model occupancy alone, without semantic states. To add sensor redundancy and diversity, it is desirable to include vision-based sensor setups in a common grid map representation. In this work, we present a semantic evidential grid mapping pipeline, including estimates for eight semantic classes, that is designed for straightforward fusion with range sensor data. Unlike other publications, our representation explicitly models uncertainties in the evidential model. We present results of our grid mapping pipeline based on a monocular vision setup and a stereo vision setup. Our mapping results are accurate and dense because a disparity- or depth-based ground surface estimation is incorporated into the inverse perspective mapping. We conclude this paper with a detailed quantitative evaluation based on real traffic scenarios in the KITTI odometry benchmark dataset and demonstrate the advantages over other semantic grid mapping approaches.

    Exploitation des données cartographiques pour la perception de véhicules intelligents

    Most software controlling intelligent vehicles deals with scene understanding. Many methods currently exist for perceiving obstacles automatically, and most of them rely on exteroceptive sensors such as cameras or lidars. This thesis is situated in the domains of robotics and data fusion and concerns geographic information systems. We study the utility of adding digital maps, which model the urban environment in which the vehicle operates, as a virtual sensor that improves perception results. The maps contain a phenomenal quantity of information about the environment: its geometry, its topology, and additional contextual information. In this work, we extract road surface geometry and building models in order to deduce the context and characteristics of each detected object. Our method is based on an extension of occupancy grids: evidential perception grids. It makes it possible to model explicitly the uncertainty related to map and sensor data. The approach also has the advantage of representing data from various sources (lidar, camera, or maps) homogeneously. The maps are handled on equal terms with the physical sensors. This allows us to add geographic information without giving it undue importance, which is essential in the presence of errors. In our approach, the result of the information fusion, stored in a perception grid, is used to predict the state of the environment at the next instant. Estimating the characteristics of dynamic elements means that the static-world hypothesis no longer holds. It is therefore necessary to adjust the level of certainty attributed to these pieces of information, which we do by applying temporal discounting. Because existing methods are not well suited to this application, we propose a family of discount operators that take into account the type of information handled. The studied algorithms have been validated through tests on real data. To that end, we developed prototypes in Matlab and C++ software based on the Pacpus framework, and we use them to present the results of experiments performed in real conditions.
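The abstract above lowers the certainty of stale grid information through temporal discounting. The sketch below shows only the classical form of discounting from Dempster-Shafer theory, in which a fraction of every mass is moved to the full frame (the "unknown" state). It is not the family of operators proposed in the thesis. The exponential link between elapsed time and the discount rate, the time constant `tau`, and all names and values are assumptions made for illustration.

```python
import math

# Two-element frame for one grid cell: Free (F) or Occupied (O).
FREE = frozenset("F")
OCC = frozenset("O")
UNKNOWN = frozenset("FO")


def discount(mass, alpha, frame=UNKNOWN):
    """Classical discounting: shift a fraction alpha of all mass to the full frame."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Every focal set other than the frame keeps (1 - alpha) of its mass.
    out = {s: (1.0 - alpha) * w for s, w in mass.items() if s != frame}
    # The frame keeps (1 - alpha) of its own mass and receives the removed share.
    out[frame] = (1.0 - alpha) * mass.get(frame, 0.0) + alpha
    return out


def decay_rate(dt, tau):
    """Assumed mapping from elapsed time dt to a discount rate (not from the thesis)."""
    return 1.0 - math.exp(-dt / tau)


# Hypothetical cell believed occupied with mass 0.8, discounted at rate 0.5.
cell = {OCC: 0.8, UNKNOWN: 0.2}
aged = discount(cell, 0.5)

# The same cell aged by 0.1 s with an assumed time constant of 2 s.
aged_by_time = discount(cell, decay_rate(dt=0.1, tau=2.0))
```

One way to make such an operator depend on the type of information, as the abstract describes, would be a different `tau` per class, for example a short one for moving objects and a long one for buildings; that choice is also an assumption here.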

    Traffic Scene Perception for Automated Driving with Top-View Grid Maps

    An automated vehicle must make safe, sensible, and fast decisions based on its environment. This requires an accurate and computationally efficient model of the traffic environment. This environment model is meant to fuse and filter measurements from different sensors and provide them to downstream subsystems as compact yet informative data. This work addresses modeling the traffic scene with top-view grid maps. Compared with other environment models, they allow early fusion of range measurements from different sources at low computational cost, as well as explicit modeling of free space. After presenting a method for ground surface estimation, which forms the basis of top-view modeling, the work covers methods for occupancy and elevation mapping in grid maps from multiple noisy, partially contradictory, or missing range measurements. On the resulting sensor-independent representation, models are then investigated for detecting traffic participants and for estimating scene flow, odometry, and tracking features. Evaluations on publicly available datasets and on a real vehicle show that top-view grid maps can be estimated with on-board LiDAR sensors and that safety-critical environment information such as observability and drivability can be derived reliably. Finally, traffic participants are determined as oriented bounding boxes with semantic classes, velocities, and tracking features from a joint model for object detection and flow estimation based on the top-view grid maps.

    Scene Informer: Anchor-based Occlusion Inference and Trajectory Prediction in Partially Observable Environments

    Navigating complex and dynamic environments requires autonomous vehicles (AVs) to reason about both visible and occluded regions. This involves predicting the future motion of observed agents, inferring occluded ones, and modeling their interactions based on vectorized scene representations of the partially observable environment. However, prior work on occlusion inference and trajectory prediction has developed in isolation, with the former based on simplified rasterized methods and the latter assuming full environment observability. We introduce the Scene Informer, a unified approach for predicting observed agent trajectories and inferring occlusions in a partially observable setting. It uses a transformer to aggregate various input modalities and to facilitate selective queries on occlusions that might intersect with the AV's planned path. The framework estimates occupancy probabilities and likely trajectories for occlusions, and forecasts motion for observed agents. We explore common observability assumptions in both domains and their performance impact. Our approach outperforms existing methods in both occupancy prediction and trajectory prediction in a partially observable setting on the Waymo Open Motion Dataset.