4,738 research outputs found
Assessing Transferability from Simulation to Reality for Reinforcement Learning
Learning robot control policies from physics simulations is of great interest
to the robotics community as it may render the learning process faster,
cheaper, and safer by alleviating the need for expensive real-world
experiments. However, the direct transfer of learned behavior from simulation
to reality is a major challenge. Optimizing a policy on a slightly faulty
simulator can easily lead to the maximization of the "Simulation Optimization
Bias" (SOB). In this case, the optimizer exploits modeling errors of the
simulator such that the resulting behavior can potentially damage the robot. We
tackle this challenge by applying domain randomization, i.e., randomizing the
parameters of the physics simulations during learning. We propose an algorithm
called Simulation-based Policy Optimization with Transferability Assessment
(SPOTA) which uses an estimator of the SOB to formulate a stopping criterion
for training. The introduced estimator quantifies the overfitting to the set
of domains experienced during training. Our experimental results on two
different second-order nonlinear systems show that the new simulation-based
policy search algorithm is able to learn a control policy exclusively from a
randomized simulator, which can be applied directly to real systems without any
additional training.
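As a rough sketch of the two ingredients above, the snippet below draws randomized physics parameters around nominal values and estimates an overfitting gap in the spirit of the SOB. The parameter names, the uniform perturbation range, and the train-versus-held-out gap are illustrative assumptions, not the paper's exact estimator (SPOTA constructs an upper bound on the SOB).

```python
import random

# Illustrative physics parameters; the real set depends on the simulator.
NOMINAL = {"mass": 1.0, "friction": 0.5, "motor_gain": 2.0}

def sample_domain(nominal=NOMINAL, rel_range=0.2, rng=random):
    """One randomized physics-parameter set: each parameter is perturbed
    uniformly within +/- rel_range of its nominal value."""
    return {k: v * (1.0 + rng.uniform(-rel_range, rel_range))
            for k, v in nominal.items()}

def sob_estimate(avg_return, train_domains, test_domains):
    """Gap between the policy's average return on the domains seen during
    training and on freshly sampled ones: a crude stand-in for the
    Simulation Optimization Bias the paper estimates."""
    r_train = sum(map(avg_return, train_domains)) / len(train_domains)
    r_test = sum(map(avg_return, test_domains)) / len(test_domains)
    return r_train - r_test
```

A positive gap indicates the optimizer has exploited quirks of the training domains, which is exactly the signal a stopping criterion can act on.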
Teacher-Student Reinforcement Learning for Mapless Navigation using a Planetary Space Rover
We address the challenge of enhancing navigation autonomy for planetary space
rovers using reinforcement learning (RL). The ambition of future space missions
necessitates advanced autonomous navigation capabilities for rovers to meet
mission objectives. RL's potential in robotic autonomy is evident, but its
reliance on simulations poses a challenge. Transferring policies to real-world
scenarios often encounters the "reality gap", disrupting the transition from
virtual to physical environments. The reality gap is exacerbated in the context
of mapless navigation on Mars and Moon-like terrains, where unpredictable
terrains and environmental factors play a significant role. Effective
navigation requires a method attuned to these complexities and real-world data
noise. We introduce a novel two-stage RL approach using offline noisy data. Our
approach employs a teacher-student policy learning paradigm, inspired by the
"learning by cheating" method. The teacher policy is trained in simulation.
Subsequently, the student policy is trained on noisy data, aiming to mimic the
teacher's behaviors while being more robust to real-world uncertainties. Our
policies are transferred to a custom-designed rover for real-world testing.
Comparative analyses between the teacher and student policies reveal that our
approach offers improved behavioral performance, heightened noise resilience,
and more effective sim-to-real transfer.
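A minimal sketch of the teacher-student stage described above, assuming Gaussian sensor noise and a generic supervised update; the noise model and all function names here are hypothetical, not the paper's implementation.

```python
import random

def add_sensor_noise(obs, sigma=0.05, rng=random):
    """Corrupt a clean simulated observation with Gaussian noise to mimic
    real-world sensing (the noise model is an assumption)."""
    return [x + rng.gauss(0.0, sigma) for x in obs]

def distill(teacher, student_update, observations, epochs=10):
    """Behavior cloning of a privileged teacher: the student is fed noisy
    observations but supervised with the teacher's action on the clean
    ones, in the spirit of "learning by cheating"."""
    for _ in range(epochs):
        for obs in observations:
            target = teacher(obs)                 # privileged, clean input
            student_update(add_sensor_noise(obs), target)
```

The key design choice is that supervision targets come from the clean observation while inputs are noisy, so the student learns to be robust without ever needing privileged information at deployment.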
AutoVRL: A High Fidelity Autonomous Ground Vehicle Simulator for Sim-to-Real Deep Reinforcement Learning
Deep Reinforcement Learning (DRL) enables cognitive Autonomous Ground Vehicle
(AGV) navigation using raw sensor data without a priori maps or GPS, a
necessity in hazardous, information-poor environments such as regions struck
by natural disasters and extraterrestrial planets. The substantial training
time required to learn an optimal DRL policy, which can be days or weeks for
complex tasks, is a major hurdle to real-world implementation in AGV
applications. Training entails repeated collisions with the surrounding
environment over an extended period, dependent on the complexity of the task,
to reinforce positive, application-specific exploratory behavior; such trial
and error is expensive and time-consuming in the real world. Effectively
bridging the simulation-to-real-world gap is therefore a requisite for
successful implementation of DRL in complex AGV applications, enabling the
learning of cost-effective policies.
We present AutoVRL, an open-source high fidelity simulator built upon the
Bullet physics engine utilizing OpenAI Gym and Stable Baselines3 in PyTorch to
train AGV DRL agents for sim-to-real policy transfer. AutoVRL is equipped with
sensor implementations of GPS, IMU, LiDAR and camera, actuators for AGV
control, and realistic environments, with extensibility for new environments
and AGV models. The simulator provides access to state-of-the-art DRL
algorithms, utilizing a Python interface for simple algorithm and environment
customization, and simulation execution.
Comment: © 2023 the authors. This work has been accepted to IFAC
for publication under a Creative Commons License CC-BY-NC-N
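AutoVRL exposes its environments through the OpenAI Gym interface. Purely to illustrate that reset/step contract, the toy environment below uses a 1-D stand-in for vehicle dynamics; the class name, dynamics, and reward are invented for illustration and are not part of AutoVRL.

```python
class GoalSeekEnv:
    """Minimal Gym-style environment skeleton (reset/step API) of the kind
    a simulator like AutoVRL exposes; the physics here is a toy stand-in."""

    def __init__(self, goal=5.0, max_steps=50):
        self.goal, self.max_steps = goal, max_steps

    def reset(self):
        self.pos, self.t = 0.0, 0
        return [self.pos]

    def step(self, action):
        self.pos += max(-1.0, min(1.0, action))   # clipped throttle command
        self.t += 1
        dist = abs(self.goal - self.pos)
        done = dist < 0.5 or self.t >= self.max_steps
        reward = -dist                            # dense distance-shaping reward
        return [self.pos], reward, done, {}
```

Because the interface is standard, any Gym-compatible DRL library (Stable Baselines3 among them) can train against such an environment without modification.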
Sim-to-real transfer and reality gap modeling in model predictive control for autonomous driving
The main challenge for the adoption of autonomous driving is to ensure an adequate level of safety. Considering the almost infinite variability of scenarios that autonomous vehicles would have to face, the use of autonomous driving simulators is becoming of utmost importance. Simulation suites allow the use of automated validation techniques in a wide variety of scenarios, and enable the development of closed-loop validation methods, such as machine learning and reinforcement learning approaches. However, simulation tools suffer from a standing flaw: there is a noticeable gap between simulation conditions and real-world scenarios. Although simulators power most of the research around autonomous driving, across all of its subdomains, they carry an inherent source of error given the stochastic nature of real-world conditions, which cannot be fully replicated in computer environments. This paper proposes a new approach to assess the real-to-sim gap for path-tracking systems. The aim is to narrow down the sources of error between simulation results and real-world conditions, and to evaluate the performance of the simulation suite in the design process by employing the information extracted from gap analysis, which adds a new dimension of development compared with other approaches for autonomous driving. A real-time model predictive controller (MPC) based on adaptive potential fields was developed and validated using the CARLA simulator. Both the path planning and vehicle control systems were tested in real traffic conditions. The error between the simulator and the real data acquisition was evaluated using the Pearson correlation coefficient (PCC) and the max normalized cross-correlation (MNCC). The controller was further evaluated in a sim-to-real transfer process, and was finally tested both in simulation and real traffic conditions.
A comparison against an optimal-control iLQR-based model predictive controller was carried out to further showcase the validity of this approach.
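The two trajectory-similarity scores named above can be sketched in plain Python. The PCC is standard; the exact normalization and lag handling of the paper's MNCC are not given in this abstract, so the version below (best Pearson score over integer shifts) is an assumption.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mncc(x, y, max_lag=10):
    """Best Pearson score after shifting y relative to x by up to max_lag
    samples, tolerating timing offsets between simulated and recorded
    trajectories; constant (zero-variance) slices are skipped."""
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        xs = x[max(0, lag):len(x) + min(0, lag)]
        ys = y[max(0, -lag):len(y) + min(0, -lag)]
        if len(xs) > 1 and len(set(xs)) > 1 and len(set(ys)) > 1:
            best = max(best, pearson(xs, ys))
    return best
```

Allowing a lag in the cross-correlation separates shape mismatch from mere timing mismatch, which matters when simulator latency differs from the real control loop.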
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research over the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies has been adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate the model, (2) lack of model transferability, which prevents a model trained on one data-rich building from being used in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that might not be reliable and robust for the stated goals, as a method might work for some buildings but not generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.
On quantifying the value of simulation for training and evaluating robotic agents
A common problem in robotics is reproducing results and claims made by researchers. The experiments done in robotics laboratories typically yield results that are specific to a complex setup and difficult or costly to reproduce and validate in other contexts. For this reason, it is arduous to compare the performance and robustness of various robotic controllers. Low-cost reproductions of physical environments are popular but induce a performance reduction when transferred to the target domain. This thesis presents the results of our work toward improving benchmarking in robotics, specifically for autonomous driving.
We build a new platform, the Duckietown Autolabs, which allows researchers to evaluate autonomous driving algorithms in a standardized framework on low-cost hardware. The platform offers a simulated environment for easy access to annotated data and parallel evaluation of driving solutions in customizable environments. We use the platform to analyze the discrepancy between simulation and reality in terms of predictivity and the quality of the data generated. We supply two metrics to quantify the usefulness of a simulation and demonstrate how they can be used to optimize the value of a proxy environment.
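The thesis's two metrics are not spelled out in this abstract. One plausible instantiation of a predictivity metric, sketched below with all names illustrative, is a Spearman rank correlation between controllers' scores in the proxy environment and in the target: a simulator is useful to the extent that it ranks controllers the way reality does.

```python
def ranks(values):
    """Rank position of each value (0 = highest score), ties broken by order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def predictivity(sim_scores, real_scores):
    """Spearman rank correlation between controllers' scores in the proxy
    environment and in the target: 1.0 means the simulator orders the
    controllers exactly as reality does, -1.0 means it inverts them."""
    n = len(sim_scores)
    rs, rr = ranks(sim_scores), ranks(real_scores)
    d2 = sum((a - b) ** 2 for a, b in zip(rs, rr))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))
```

Such a score can then serve as an optimization target when tuning the proxy environment itself.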
Fuzzy Ensembles of Reinforcement Learning Policies for Robotic Systems with Varied Parameters
Reinforcement Learning (RL) is an emerging approach to control many dynamical
systems for which classical control approaches are not applicable or
insufficient. However, the resultant policies may not generalize to variations
in the parameters that the system may exhibit. This paper presents a powerful
yet simple algorithm in which collaboration is facilitated between RL agents
that are trained independently to perform the same task but with different
system parameters. The independence among agents allows the exploitation of
multi-core processing to perform parallel training. Two examples are provided
to demonstrate the effectiveness of the proposed technique. The main
demonstration is performed on a quadrotor slung-load tracking problem in a
real-time experimental setup. It is shown that the developed ensemble
algorithm outperforms the individual policies by reducing the RMSE tracking
error. The robustness of the ensemble is also verified against wind disturbance.
Comment: arXiv admin note: text overlap with arXiv:2311.0501
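One simple way to realize such a fuzzy ensemble, sketched below under the assumptions of a scalar varied parameter and triangular membership functions (the paper's actual fuzzification may differ), is to weight each expert policy by how close the current parameter estimate is to the value that expert was trained on.

```python
def triangular_membership(x, center, width):
    """Triangular fuzzy membership: 1 at the center, falling linearly
    to 0 at center +/- width."""
    return max(0.0, 1.0 - abs(x - center) / width)

def fuzzy_ensemble_action(obs, param_estimate, experts, width=1.0):
    """Blend the actions of independently trained policies.

    `experts` is a list of (trained_parameter_value, policy) pairs; each
    policy's action is weighted by the membership of the current parameter
    estimate in that expert's fuzzy set."""
    weights = [triangular_membership(param_estimate, c, width)
               for c, _ in experts]
    total = sum(weights)
    if total == 0.0:  # estimate outside every support: use the nearest expert
        _, policy = min(experts, key=lambda e: abs(e[0] - param_estimate))
        return policy(obs)
    return sum(w * p(obs) for w, (_, p) in zip(weights, experts)) / total
```

Because each expert is trained independently, the training runs can be distributed across cores, and the blending step above is the only coordination needed at inference time.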