
    RobustStateNet: Robust ego vehicle state estimation for Autonomous Driving

    Control of an ego vehicle for Autonomous Driving (AD) requires an accurate definition of its state. Implementations of various model-based Kalman Filtering (KF) techniques for state estimation are prevalent in the literature. These algorithms use measurements from an IMU and input signals from steering and wheel encoders for motion prediction with physics-based models, and a Global Navigation Satellite System (GNSS) for global localization. Such methods are widely investigated and focus mainly on increasing the accuracy of the estimation. Ego motion prediction in these approaches does not model sensor failure modes and assumes completely known dynamics with motion and measurement model noises. In this work, we propose a novel Recurrent Neural Network (RNN) based motion predictor that models the sensor measurement dynamics in parallel and selectively fuses the features to increase the robustness of prediction, in particular in scenarios where we witness sensor failures. This motion predictor is integrated into a KF-like framework, RobustStateNet, that takes a global position from the GNSS sensor and updates the predicted state. We demonstrate that the proposed state estimation routine outperforms the model-based KF and the KalmanNet architecture in terms of estimation accuracy and robustness. The proposed algorithms are validated on a modified NuScenes CAN bus dataset designed to simulate various types of sensor failures.
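
    As a rough sketch of the kind of pipeline described above, the code below couples a learned motion predictor with a Kalman-style position update from a GNSS-like measurement. The GRU architecture, feature dimensions, and noise covariances are illustrative assumptions, not the RobustStateNet design from the paper.

```python
# A rough sketch, not the paper's architecture: a GRU predicts the state
# increment from recent sensor features, and a standard KF-style update
# fuses a GNSS-like position measurement.
import numpy as np
import torch
import torch.nn as nn

class MotionPredictorRNN(nn.Module):
    """Predicts a 2D state increment from a short window of features."""
    def __init__(self, feat_dim=6, hidden_dim=32, state_dim=2):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, state_dim)

    def forward(self, feats):            # feats: (1, T, feat_dim)
        out, _ = self.gru(feats)
        return self.head(out[:, -1])     # (1, state_dim)

def kf_update(x_pred, P_pred, z, R):
    """KF measurement update with a direct position measurement (H = I)."""
    S = P_pred + R
    K = P_pred @ np.linalg.inv(S)        # Kalman gain
    x = x_pred + K @ (z - x_pred)
    P = (np.eye(len(x)) - K) @ P_pred
    return x, P

predictor = MotionPredictorRNN()
x, P = np.zeros(2), np.eye(2)             # state [x, y] and covariance
Q, R = 0.05 * np.eye(2), 0.5 * np.eye(2)  # assumed noise covariances

for _ in range(10):                       # toy filtering loop
    feats = torch.randn(1, 5, 6)          # stand-in IMU/encoder features
    with torch.no_grad():
        dx = predictor(feats).numpy().ravel()
    x_pred, P_pred = x + dx, P + Q        # learned motion prediction step
    z = x_pred + 0.5 * np.random.randn(2) # stand-in GNSS position fix
    x, P = kf_update(x_pred, P_pred, z, R)
```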

    A Survey on Imitation Learning Techniques for End-to-End Autonomous Vehicles

    Funding agencies: Jaguar Land Rover (10.13039/100016335) and the U.K. Engineering and Physical Sciences Research Council (EPSRC, 10.13039/501100000266; Grant Number: EP/N01300X/1), under the jointly funded Towards Autonomy: Smart and Connected Control (TASCC) Program. Peer reviewed. Postprint.

    On the path integration system of insects: there and back again

    Navigation is an essential capability of animate organisms and robots. Among animate organisms, insects are of particular interest because they are capable of a variety of navigation competencies, solving challenging problems with limited resources and thereby providing inspiration for robot navigation. Ants, bees and other insects are able to return to their nest using a navigation strategy known as path integration. During path integration, the animal maintains a running estimate of the distance and direction to its nest as it travels. This estimate, known as the 'home vector', enables the animal to return to its nest. Path integration was also the technique sea navigators used to cross the open seas in the past. To perform path integration, both sailors and insects need access to two pieces of information: their direction and their speed of motion over time. Neurons encoding heading and speed have been found to converge on a highly conserved region of the insect brain, the central complex. It is, therefore, believed that the central complex is key to the computations pertaining to path integration. However, several questions remain about the exact structure of the neuronal circuit that tracks the animal's heading, how it differs between insect species, and how speed and direction are integrated into a home vector and maintained in memory. In this thesis, I have combined behavioural, anatomical, and physiological data with computational modelling and agent simulations to tackle these questions. Analysis of the internal compass circuit of two insect species with highly divergent ecologies, the fruit fly Drosophila melanogaster and the desert locust Schistocerca gregaria, revealed that despite 400 million years of evolutionary divergence, both species share a fundamentally common internal compass circuit that keeps track of the animal's heading. However, subtle differences in the neuronal morphologies result in distinct circuit dynamics adapted to the ecology of each species, thereby providing insights into how neural circuits evolved to accommodate species-specific behaviours. Fast-moving insects need to update their home vector memory continuously as they move, yet they can remember it for several hours. This conjunction of fast updating and long persistence of the home vector does not map directly onto current short-, mid-, and long-term memory accounts. An extensive literature review revealed a lack of available memory models that could support the home vector memory requirements. A comparison of existing behavioural data with the homing behaviour of simulated robot agents illustrated that the prevalent hypothesis, which posits that the neural substrate of the path integration memory is a bump attractor network, is contradicted by behavioural evidence. An investigation of the type of memory utilised during path integration revealed that cold-induced anaesthesia disrupts the ability of ants to return to their nest, but does not eliminate their ability to move in the correct homing direction. Using computational modelling and simulated agents, I argue that the best explanation for this phenomenon is not two separate memories differently affected by temperature but a shared memory that encodes both direction and distance. The results presented in this thesis shed some more light on the labyrinth that researchers of animal navigation have been exploring in their attempts to unravel a few more rounds of Ariadne's thread back to its origin. The findings provide valuable insights into the path integration system of insects and inspiration for future memory research, advancing path integration techniques in robotics, and developing novel neuromorphic solutions to computational problems.
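
    As a minimal illustration of the path integration computation described above, the sketch below integrates heading and speed samples into an accumulated displacement whose negation is the home vector. The discrete-time formulation and variable names are assumptions made for illustration; this is not a model of the insect circuitry.

```python
import numpy as np

def home_vector(headings, speeds, dt=1.0):
    """Integrate heading (radians) and speed samples over time; the home
    vector is the negation of the accumulated displacement."""
    displacement = np.zeros(2)
    for theta, v in zip(headings, speeds):
        displacement += v * dt * np.array([np.cos(theta), np.sin(theta)])
    return -displacement  # points from the agent back to the nest

# Toy outbound route: two steps east, then one step north, at 1 unit/s.
hv = home_vector(headings=[0.0, 0.0, np.pi / 2], speeds=[1.0, 1.0, 1.0])
print(hv)  # approximately [-2, -1]: two units west, one unit south
```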

    A Benchmark Comparison of Visual Place Recognition Techniques for Resource-Constrained Embedded Platforms

    Autonomous navigation has become a widely researched area over the past few years, gaining a massive following due to its necessity in creating a fully autonomous robotic system. Autonomous navigation is an exceedingly difficult task to accomplish in and of itself. Successful navigation relies heavily on the ability to self-localise within a given environment. Without this awareness of one’s own location, it is impossible to navigate successfully in an autonomous manner. Since its inception, Simultaneous Localization and Mapping (SLAM) has become one of the most widely researched areas of autonomous navigation. SLAM focuses on self-localisation within a mapped or unmapped environment while constructing or updating a map of one’s surroundings. Visual Place Recognition (VPR) is an essential part of any SLAM system. VPR relies on visual cues to determine one’s location within a mapped environment. This thesis presents two main topics within the field of VPR. First, it presents a benchmark analysis of several popular embedded platforms when performing VPR. The benchmark analyses six VPR techniques across three datasets, investigating accuracy, CPU usage, memory usage, processing time and power consumption. It demonstrates a clear relationship between platform architecture and the measured metrics, with platforms of the same architecture achieving comparable accuracy and algorithm efficiency. Additionally, the Raspberry Pi platform stood out in terms of algorithm efficiency and power consumption. Second, this thesis proposes an evaluation framework intended to provide information about a VPR technique’s usability within a real-time application. The approach uses the incoming frame rate of an image stream and the VPR frame rate, the rate at which the technique can perform VPR, to determine how efficient VPR techniques would be in a real-time environment. This evaluation framework determined that CoHOG would be the most effective algorithm to deploy in a real-time environment, as it had the best ratio between computation time and accuracy.
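
    A minimal sketch of the real-time evaluation idea described above, assuming the framework reduces to comparing a technique's achievable VPR frame rate with the incoming camera frame rate; the function name and exact ratio are illustrative assumptions, not the thesis's formulation.

```python
def realtime_usability(incoming_fps: float, vpr_time_per_frame_s: float) -> float:
    """Ratio of the rate at which a VPR technique can process frames to
    the rate at which frames arrive. Values >= 1 suggest the technique
    keeps up with the image stream; values < 1 suggest frames must be
    dropped or queued."""
    vpr_fps = 1.0 / vpr_time_per_frame_s
    return vpr_fps / incoming_fps

# Example: a 30 fps camera and a technique needing 50 ms per frame.
print(realtime_usability(incoming_fps=30.0, vpr_time_per_frame_s=0.05))  # ~0.67
```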

    CAROM Air -- Vehicle Localization and Traffic Scene Reconstruction from Aerial Videos

    Road traffic scene reconstruction from videos is desirable to road safety regulators, city planners, researchers, and autonomous driving technology developers. However, it is expensive and unnecessary to cover every mile of road with cameras mounted on the road infrastructure. This paper presents a method that processes aerial videos into vehicle trajectory data so that a traffic scene can be automatically reconstructed and accurately re-simulated on computers. On average, the vehicle localization error is about 0.1 m to 0.3 m using a consumer-grade drone flying at 120 meters. The project also compiles a dataset of 50 reconstructed road traffic scenes from about 100 hours of aerial videos to enable various downstream traffic analysis applications and facilitate further road-traffic-related research. The dataset is available at https://github.com/duolu/CAROM.
    Comment: Accepted to IEEE ICRA 202

    AI-Generated Incentive Mechanism and Full-Duplex Semantic Communications for Information Sharing

    The next generation of Internet services, such as the Metaverse, relies on mixed reality (MR) technology to provide immersive user experiences. However, the limited computation power of MR head-mounted devices (HMDs) hinders the deployment of such services. Therefore, we propose an efficient information sharing scheme based on full-duplex device-to-device (D2D) semantic communications to address this issue. Our approach enables users to avoid heavy and repetitive computational tasks, such as artificial intelligence-generated content (AIGC) in the view images of all MR users. Specifically, a user can transmit the generated content and semantic information extracted from their view image to nearby users, who can then use this information to obtain the spatial matching of computation results under their own view images. We analyze the performance of full-duplex D2D communications, including the achievable rate and bit error probability, using generalized small-scale fading models. To facilitate semantic information sharing among users, we design a contract-theoretic AI-generated incentive mechanism. The proposed diffusion model generates the optimal contract design, outperforming two deep reinforcement learning algorithms, i.e., proximal policy optimization and soft actor-critic. Our numerical experiments demonstrate the effectiveness of the proposed methods. The code for this paper is available at https://github.com/HongyangDu/SemSharing
    Comment: Accepted by IEEE JSA
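
    For intuition about the achievable-rate analysis mentioned above, the following sketch computes a Shannon-style rate for one direction of a full-duplex link, treating residual self-interference as additional noise. The formula's use here and the parameter values are illustrative assumptions; the paper's generalized small-scale fading analysis is abstracted away.

```python
import numpy as np

def dbm_to_watts(dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** ((dbm - 30) / 10)

def fd_achievable_rate(bandwidth_hz, rx_power_w, residual_si_w, noise_w):
    """Shannon-style achievable rate (bit/s) for one direction of a
    full-duplex D2D link, treating residual self-interference as noise."""
    sinr = rx_power_w / (residual_si_w + noise_w)
    return bandwidth_hz * np.log2(1.0 + sinr)

# Example: 10 MHz bandwidth, -70 dBm received signal, -100 dBm residual
# self-interference after cancellation, -95 dBm noise floor.
rate = fd_achievable_rate(
    10e6, dbm_to_watts(-70), dbm_to_watts(-100), dbm_to_watts(-95))
print(f"{rate / 1e6:.1f} Mbit/s")  # roughly 79 Mbit/s
```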

    Challenges for Monocular 6D Object Pose Estimation in Robotics

    Object pose estimation is a core perception task that enables, for example, object grasping and scene understanding. The widely available, inexpensive, high-resolution RGB sensors, and the CNNs that allow for fast inference on this modality, make monocular approaches especially well suited for robotics applications. We observe that previous surveys on object pose estimation establish the state of the art for varying modalities, single- and multi-view settings, and datasets and metrics that consider a multitude of applications. We argue, however, that the broad scope of those works hinders the identification of open challenges that are specific to monocular approaches and the derivation of promising future directions for their application in robotics. By providing a unified view of recent publications from both robotics and computer vision, we find that occlusion handling, novel pose representations, and formalizing and improving category-level pose estimation are still fundamental challenges that are highly relevant for robotics. Moreover, to further improve robotic performance, large object sets, novel objects, refractive materials, and uncertainty estimates are central, largely unsolved open challenges. To address them, ontological reasoning, deformability handling, scene-level reasoning, realistic datasets, and the ecological footprint of algorithms need to be improved.
    Comment: arXiv admin note: substantial text overlap with arXiv:2302.1182

    Motion-Bias-Free Feature-Based SLAM

    For SLAM to be safely deployed in unstructured real-world environments, it must possess several key properties that are not encompassed by conventional benchmarks. In this paper we show that SLAM commutativity, that is, consistency of trajectory estimates on forward and reverse traverses of the same route, is a significant issue for the state of the art. Current pipelines show a significant bias between forward and reverse directions of travel, and it is moreover inconsistent which direction of travel exhibits better performance. We propose several contributions to feature-based SLAM pipelines that remedy the motion bias problem. In a comprehensive evaluation across four datasets, we show that our contributions, implemented in ORB-SLAM2, substantially reduce the bias between forward and backward motion and additionally improve the aggregated trajectory error. Removing the SLAM motion bias has significant relevance for the wide range of robotics and computer vision applications where performance consistency is important.
    Comment: BMVC 202
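
    One plausible way to quantify the forward/reverse bias described above is to compare the absolute trajectory error (ATE) between the two directions of travel. The sketch below is an illustrative formulation under that assumption, not the paper's exact metric, and trajectory alignment is omitted.

```python
import numpy as np

def ate_rmse(estimated, ground_truth):
    """Root-mean-square absolute trajectory error over pre-aligned
    trajectories given as (N, 3) arrays of positions."""
    return np.sqrt(np.mean(np.sum((estimated - ground_truth) ** 2, axis=1)))

def motion_bias(est_fwd, gt_fwd, est_rev, gt_rev):
    """Signed ATE difference between forward and reverse traverses of the
    same route; a commutative pipeline would score near zero."""
    return ate_rmse(est_fwd, gt_fwd) - ate_rmse(est_rev, gt_rev)

# Toy example: the reverse traverse drifts more than the forward one.
gt = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0]], dtype=float)
est_f = gt + np.array([0.05, 0.0, 0.0])
est_r = gt + np.array([0.20, 0.0, 0.0])
print(motion_bias(est_f, gt, est_r, gt))  # negative: forward is better
```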

    Contributions to improve the technologies supporting unmanned aircraft operations

    International Mention in the doctoral degree. Unmanned Aerial Vehicles (UAVs), in their smaller versions known as drones, are becoming increasingly important in today's societies. The systems that make them up present a multitude of challenges, of which error can be considered the common denominator. The perception of the environment is measured by sensors that have errors, and the models that interpret the information and/or define behaviors are approximations of the world and therefore also have errors. Explaining error allows extending the limits of deterministic models to address real-world problems. The performance of the technologies embedded in drones depends on our ability to understand, model, and control the error of the systems that integrate them, as well as of new technologies that may emerge. Flight controllers integrate various subsystems that are generally dependent on other systems. One example is the guidance system. Guidance systems provide the engine's propulsion controller with the information necessary to accomplish a desired mission. For this purpose, the flight controller comprises a guidance control law that reacts to the information perceived by the perception and navigation systems. The error of any of the subsystems propagates through the ecosystem of the controller, so the study of each of them is essential. Among the strategies for error control are state-space estimators, where the Kalman filter has been a great ally of engineers since its appearance in the 1960s. Kalman filters are at the heart of information fusion systems, minimizing the error covariance of the system and allowing the measured states to be filtered, and estimated in the absence of observations. State Space Models (SSMs) are developed based on a set of hypotheses for modeling the world. Among the assumptions are that models of the world must be linear and Markovian, and that the error of these models must be Gaussian. In general, systems are not linear, so linearizations are performed on models that are already approximations of the world. In other cases, the noise to be controlled is not Gaussian, but it is approximated by that distribution in order to deal with it. Moreover, many systems are not Markovian, i.e., their states do not depend only on the previous state; there are other dependencies that state-space models cannot handle. This thesis presents a collection of studies in which error is formulated and reduced. First, the error in a computer-vision-based precision landing system is studied; then, estimation and filtering problems are addressed from the deep learning approach; finally, classification with deep learning over trajectories is studied. The first case of the collection studies the consequences of error propagation in a machine-vision-based precision landing system and proposes a set of strategies to reduce the impact on the guidance system, and ultimately reduce the error. The next two studies approach the estimation and filtering problem from the deep learning perspective, where error is a function to be minimized by learning. The last case of the collection deals with a trajectory classification problem with real data. This work completes the two main fields in deep learning, regression and classification, where the error is considered as a probability function of class membership.
    I would like to thank the Ministry of Science and Innovation for granting me the funding with reference PRE2018-086793, associated with the project TEC2017-88048-C2-2-R, which provided me the opportunity to carry out all my PhD activities, including completing an international research internship.
    Doctoral Program in Computer Science and Technology, Universidad Carlos III de Madrid. Committee Chair: Antonio Berlanga de Jesús; Secretary: Daniel Arias Medina; Member: Alejandro Martínez Cav
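
    As a reminder of the state-space machinery discussed above, the sketch below implements one predict/update cycle of the textbook linear-Gaussian Kalman filter, under exactly the linearity, Markov, and Gaussian-noise assumptions the abstract describes. The matrix names follow common convention and are not specific to the thesis.

```python
import numpy as np

def kalman_step(x, P, u, z, A, B, H, Q, R):
    """One predict/update cycle of the linear-Gaussian Kalman filter."""
    # Predict: propagate the state and covariance through the linear,
    # Markovian motion model with Gaussian process noise Q.
    x_pred = A @ x + B @ u
    P_pred = A @ P @ A.T + Q
    # Update: fuse the measurement z, minimizing the error covariance.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy constant-velocity model: state [position, velocity], position measured.
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition
B = np.zeros((2, 1))                     # no control input
H = np.array([[1.0, 0.0]])               # measure position only
Q, R = 0.01 * np.eye(2), np.array([[0.5]])
x, P = np.zeros(2), np.eye(2)
x, P = kalman_step(x, P, u=np.zeros(1), z=np.array([1.2]),
                   A=A, B=B, H=H, Q=Q, R=R)
```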

    Towards Object-Centric Scene Understanding

    Visual perception for autonomous agents continues to attract community attention due to the disruptive technologies involved and the wide applicability of such solutions. Autonomous Driving (AD), a major application in this domain, promises to revolutionize our approach to mobility while bringing critical advantages in limiting accident fatalities. Fueled by recent advances in Deep Learning (DL), more computer vision tasks are being addressed using a learning paradigm. Deep Neural Networks (DNNs) have consistently pushed performance to unprecedented levels, demonstrating the ability of such approaches to generalize to an increasing number of difficult problems, such as 3D vision tasks. In this thesis, we address two main challenges arising from current approaches: the computational complexity of multi-task pipelines, and the increasing need for manual annotations. On the one hand, AD systems need to perceive the surrounding environment at different levels of detail and subsequently take timely actions. This multitasking further limits the time available for each perception task. On the other hand, the need for such systems to generalize universally to massively diverse situations requires large-scale datasets covering long-tailed cases. This requirement renders traditional supervised approaches, despite the data readily available in the AD domain, unsustainable in terms of annotation costs, especially for 3D tasks. Driven by the nature of the AD environment, whose complexity (unlike indoor scenes) is dominated by the presence of other scene elements (mainly cars and pedestrians), we focus on the above-mentioned challenges in object-centric tasks. We then situate our contributions appropriately in the fast-paced literature, supporting our claims with extensive experimental analysis leveraging up-to-date state-of-the-art results and community-adopted benchmarks.