
399 research outputs found

Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

Authors: Fearing Ronald S.; Kahn Gregory; Levine Sergey; Nagabandi Anusha
Publication date: 01/12/2017

Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance. Model-based algorithms, in principle, can provide for much more efficient learning, but have proven difficult to extend to expressive, high-capacity models such as deep neural networks. In this work, we demonstrate that medium-sized neural network models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits to accomplish various complex locomotion tasks. We also propose using deep neural network dynamics models to initialize a model-free learner, in order to combine the sample efficiency of model-based approaches with the high task-specific performance of model-free methods. We empirically demonstrate on MuJoCo locomotion tasks that our pure model-based approach trained on just random action data can follow arbitrary trajectories with excellent sample efficiency, and that our hybrid algorithm can accelerate model-free learning on high-speed benchmark tasks, achieving sample efficiency gains of 3-5x on swimmer, cheetah, hopper, and ant agents. Videos can be found at https://sites.google.com/view/mbm
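The combination of a learned dynamics model with model predictive control described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how MPC selects actions, so random-shooting planning is used as one common choice; `toy_dynamics` is a hand-written 1-D stand-in for a trained neural network model; and the names, cost function, horizon, and candidate count are hypothetical.

```python
import random


def toy_dynamics(pos, action):
    """Stand-in for a learned model f(s, a) -> s'. Hypothetical 1-D kinematics."""
    return pos + 0.1 * action


def cost(pos, target):
    """Squared distance to the target position (illustrative cost)."""
    return (pos - target) ** 2


def mpc_action(pos, target, dynamics, rng, horizon=10, n_candidates=200):
    """Random-shooting MPC: sample action sequences, roll each one out
    through the model, and return the first action of the cheapest one."""
    best_cost, best_first = float("inf"), 0.0
    for _ in range(n_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        p, total = pos, 0.0
        for a in actions:
            p = dynamics(p, a)        # predicted next state
            total += cost(p, target)  # accumulate predicted cost
        if total < best_cost:
            best_cost, best_first = total, actions[0]
    return best_first


def run_episode(steps=60, target=1.0, seed=0):
    """Replan at every step and apply only the first planned action."""
    rng = random.Random(seed)
    pos = 0.0
    for _ in range(steps):
        a = mpc_action(pos, target, toy_dynamics, rng)
        pos = toy_dynamics(pos, a)  # here the "real" system equals the model
    return pos


if __name__ == "__main__":
    print("final position:", run_episode())
```

In this sketch the plant and the model are the same function, so there is no model error; with a model fitted to data, replanning at every step is what limits the effect of prediction errors.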

arXiv.org e-Print Archive

Crossref

Learning to stop: a unifying principle for legged locomotion in varying environments.

Authors: Calisti M; George Thuruthel Thomas; Iida F; Laschi C; Picardi G
Publication venue: R Soc Open Sci
Publication date: 01/01/2021

Evolutionary studies have unequivocally proven the transition of living organisms from water to land. Consequently, it can be deduced that locomotion strategies must have evolved from one environment to the other. However, the mechanism by which this transition happened and its implications for bio-mechanical studies and robotics research have not been explored in detail. This paper presents a unifying control strategy for locomotion in varying environments based on the principle of 'learning to stop'. Using a common reinforcement learning framework, deep deterministic policy gradient, we show that our proposed learning strategy facilitates a fast and safe methodology for transferring learned controllers from the facile water environment to the harsh land environment. Our results not only propose a plausible mechanism for safe and quick transition of locomotion strategies from a water to a land environment but also provide a novel alternative for safer and faster training of robots.

University of Lincoln Institutional Repository

Directory of Open Access Journals

Archivio della ricerca della Scuola Superiore Sant'Anna

Apollo (Cambridge)

CPG Implementations for Robot Locomotion: Analysis and Design

Authors: Cesar Torres-Huitzil; Jose Hugo Barron-Zambrano
Publication venue: IntechOpen
Publication date: 03/02/2012

IntechOpen

From Knowing to Doing: Learning Diverse Motor Skills through Instruction Learning

Authors: Cheng Yi; Li Jiayi; Liang Bin; Peng Yan; Wang Xianhao; Ye Linqi
Publication date: 17/09/2023

Recent years have witnessed many successful trials in the robot learning field. For contact-rich robotic tasks, it is challenging to learn coordinated motor skills by reinforcement learning. Imitation learning solves this problem by using a mimic reward to encourage the robot to track a given reference trajectory. However, imitation learning is not very efficient and may constrain the learned motion. In this paper, we propose instruction learning, which is inspired by the human learning process and is highly efficient, flexible, and versatile for robot motion learning. Instead of using a reference signal in the reward, instruction learning applies a reference signal directly as a feedforward action, and it is combined with a feedback action learned by reinforcement learning to control the robot. In addition, we propose the action bounding technique and remove the mimic reward, which is shown to be crucial for efficient and flexible learning. We compare the performance of instruction learning with imitation learning, indicating that instruction learning can greatly speed up the training process and guarantee learning the desired motion correctly. The effectiveness of instruction learning is validated through a range of motion learning examples for a biped robot and a quadruped robot, where skills can typically be learned within several million steps. We also conduct sim-to-real transfer and online learning experiments on a real quadruped robot. Instruction learning has shown great merits and potential, making it a promising alternative to imitation learning.
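The action composition described in the abstract above (a reference signal applied as a feedforward action, plus a learned feedback action) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not define how action bounding works, so clipping the feedback term to a fixed limit is assumed here, and the sinusoidal reference, the limit value, and all function names are hypothetical.

```python
import math


def reference_action(t, amplitude=0.5, period=1.0):
    """Hypothetical feedforward reference: a sinusoidal joint command."""
    return amplitude * math.sin(2.0 * math.pi * t / period)


def bound(value, limit):
    """Clip value to [-limit, limit] (assumed form of action bounding)."""
    return max(-limit, min(limit, value))


def combined_action(t, feedback, feedback_limit=0.2):
    """Feedforward reference plus bounded feedback.

    `feedback` stands in for the output of a policy trained by
    reinforcement learning; here it is just a number passed by the caller.
    """
    return reference_action(t) + bound(feedback, feedback_limit)


if __name__ == "__main__":
    for t, fb in [(0.0, 0.0), (0.0, 1.0), (0.25, -5.0)]:
        print(t, fb, combined_action(t, fb))
```

Under this assumption the reference enters the command directly rather than through the reward, and the bound keeps the learned correction small relative to the reference.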

arXiv.org e-Print Archive