Bound Controller for a Quadruped Robot using Pre-Fitting Deep Reinforcement Learning
The bound gait is an important gait in quadruped robot locomotion. It can be
used to cross obstacles and often serves as a transition mode between trot and
gallop. However, because of the complexity of the underlying models, bound
gaits built by conventional control methods are often unnatural and slow to
compute. In the present work, we introduce a method to achieve the bound gait
based on model-free pre-fitting deep reinforcement learning (PF-DRL). We first
constructed a network with the same structure as the actor network in PPO2 and
pre-fit it using data collected from a robot driven by a conventional
model-based controller. Next, the trained weights were transferred into PPO2
and optimized further. Moreover, targeting the symmetric and periodic
characteristics of bounding, we designed a reward function based on contact
points. We also used feature engineering to improve the input features of the
DRL model, improving performance on flat ground. Finally, we trained the bound
controller in simulation and successfully deployed it on the Jueying Mini
robot. In our experiments it outperforms the conventional method, with higher
computational efficiency and a more stable center-of-mass height.
Comment: 7 pages
Flightmare: A Flexible Quadrotor Simulator
Currently available quadrotor simulators have a rigid and highly specialized
structure: they are either fast, physically accurate, or
photo-realistic. In this work, we propose a paradigm shift in the development
of simulators: moving the trade-off between accuracy and speed from the
developers to the end-users. We use this design idea to develop a novel modular
quadrotor simulator: Flightmare. Flightmare is composed of two main components:
a configurable rendering engine built on Unity and a flexible physics engine
for dynamics simulation. These two components are fully decoupled and can run
independently of each other. This makes our simulator extremely fast:
rendering achieves speeds of up to 230 Hz, while physics simulation runs at up
to 200,000 Hz. In addition, Flightmare comes with several desirable features: (i)
a large multi-modal sensor suite, including an interface to extract the 3D
point-cloud of the scene; (ii) an API for reinforcement learning which can
simulate hundreds of quadrotors in parallel; and (iii) an integration with a
virtual-reality headset for interaction with the simulated environment. We
demonstrate the flexibility of Flightmare by using it for two completely
different robotic tasks: learning a sensorimotor control policy for a quadrotor
and path-planning in a complex 3D environment.
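The decoupling the abstract describes can be illustrated with a minimal sketch: physics and rendering step at independent rates, so users choose where to sit on the accuracy/speed trade-off. The class names and rates below are illustrative, not Flightmare's actual API:

```python
class PhysicsEngine:
    """Toy dynamics stepper; real engines would integrate quadrotor dynamics."""
    def __init__(self, dt=1e-3):
        self.dt = dt
        self.t = 0.0

    def step(self):
        self.t += self.dt               # advance simulated time

class Renderer:
    """Toy renderer that only produces a frame when one is due."""
    def __init__(self, fps=100):
        self.period = 1.0 / fps
        self.frames = 0
        self.next_frame = 0.0

    def maybe_render(self, t):
        if t >= self.next_frame:        # render only at the frame rate
            self.frames += 1
            self.next_frame += self.period

physics, renderer = PhysicsEngine(dt=1e-3), Renderer(fps=100)
for _ in range(1000):                   # simulate roughly one second
    physics.step()                      # physics runs every iteration...
    renderer.maybe_render(physics.t)    # ...rendering far less often

print(physics.t, renderer.frames)       # many physics steps, ~100 frames
```

Because neither loop waits on the other, either component can also be disabled entirely, which is what makes headless high-throughput RL training possible.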
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning
This paper presents Gym-Ignition, a new framework to create reproducible
robotic environments for reinforcement learning research. It interfaces with
the new generation of Gazebo, part of the Ignition Robotics suite, which
provides three main improvements for reinforcement learning applications
compared to the alternatives: 1) the modular architecture enables using the
simulator as a C++ library, simplifying the interconnection with external
software; 2) multiple physics and rendering engines are supported as plugins,
simplifying their selection during the execution; 3) the new distributed
simulation capability allows simulating complex scenarios while sharing the
load on multiple workers and machines. The core of Gym-Ignition is a component
that contains the Ignition Gazebo simulator and exposes a simple interface for
its configuration and execution. We provide a Python package that allows
developers to create robotic environments simulated in Ignition Gazebo.
Environments expose the common OpenAI Gym interface, making them compatible
out-of-the-box with third-party frameworks containing reinforcement learning
algorithms. Simulations can be executed in both headless and GUI mode, the
physics engine can run in accelerated mode, and instances can be parallelized.
Furthermore, the Gym-Ignition software architecture provides abstraction of the
Robot and the Task, making environments agnostic to the specific runtime. This
abstraction also allows them to run in a real-time setting on actual
robotic platforms, even when driven by different middlewares.
Comment: Accepted in SII202
Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion
Deep reinforcement learning (RL) uses model-free techniques to optimize
task-specific control policies. Despite having emerged as a promising approach
for complex problems, RL is still hard to use reliably for real-world
applications. Apart from challenges such as precise reward function tuning,
inaccurate sensing and actuation, and non-deterministic response, existing RL
methods do not guarantee behavior within required safety constraints that are
crucial for real robot scenarios. In this regard, we introduce guided
constrained policy optimization (GCPO), an RL framework based upon our
implementation of constrained proximal policy optimization (CPPO) for tracking
base velocity commands while following the defined constraints. We also
introduce schemes which encourage state recovery into constrained regions in
case of constraint violations. We present experimental results of our training
method and test it on the real ANYmal quadruped robot. We compare our approach
against the unconstrained RL method and show that guided constrained RL offers
faster convergence close to the desired optimum, resulting in optimal yet
physically feasible robotic control behavior without the need for precise
reward function tuning.
Comment: 8 pages, 8 figures, 5 tables, 1 algorithm, accepted to IEEE Robotics
and Automation Letters (RA-L), January 2020, with presentation at the
International Conference on Robotics and Automation (ICRA) 202
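The constrained-optimization idea behind approaches like CPPO can be illustrated with a generic Lagrangian-style sketch: a multiplier grows while a safety cost exceeds its limit, penalizing the objective until the constraint is satisfied. This is a standard dual-ascent illustration under assumed names, not the paper's actual CPPO implementation:

```python
def lagrangian_multiplier(costs, limit, lr=0.5, lam=0.0):
    """Dual ascent on the multiplier over a sequence of episode costs."""
    for c in costs:
        # Multiplier grows while the constraint is violated (c > limit)
        # and decays toward zero once the policy becomes safe, but is
        # clipped at zero so the penalty never flips sign.
        lam = max(0.0, lam + lr * (c - limit))
    return lam

# While costs stay above the limit, the multiplier keeps increasing...
lam_violating = lagrangian_multiplier([2.0, 2.0, 2.0], limit=1.0)
# ...and it decays back to zero once episodes respect the constraint.
lam_safe = lagrangian_multiplier([0.0] * 10, limit=1.0, lam=lam_violating)
print(lam_violating, lam_safe)  # → 1.5 0.0
```

In a full constrained policy-gradient method, the multiplier would weight the constraint cost inside the policy objective at every update, steering the policy back into the feasible region after violations.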