
3 research outputs found

Neural probabilistic motor primitives for humanoid control

Author: Ahuja Arun
Galashov Alexandre
Hasenclever Leonard
Heess Nicolas
Merel Josh
Pham Vu
Teh Yee Whye
Wayne Greg
Publication date: 01/01/2019

We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck. We show that it is possible to train this model entirely offline to compress thousands of expert policies and learn a motor primitive embedding space. The trained neural probabilistic motor primitive system can perform one-shot imitation of whole-body humanoid behaviors, robustly mimicking unseen trajectories. Additionally, we demonstrate that it is also straightforward to train controllers to reuse the learned motor primitive space to solve tasks, and the resulting movements are relatively naturalistic. To support the training of our model, we compare two approaches for offline policy cloning, including an experience-efficient method which we call linear feedback policy cloning. We encourage readers to view a supplementary video ( https://youtu.be/CaDEf-QcKwA ) summarizing our results.

Comment: Accepted as a conference paper at ICLR 201

arXiv.org e-Print Archive

Oxford University Research Archive

A physics-based Juggling Simulation using Reinforcement Learning

Author: 제이슨
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2019

Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, February 2019. Lee, Jehee.

Juggling is a physical skill that consists of keeping one or more objects in continuous motion in the air by tossing and catching them. Jugglers need high dexterity to control their throws and catches, which require speed, accuracy, and synchronization. Moreover, the more balls are juggled, the stronger these qualities must be. This thesis follows a previous project by Lee et al. [1], who performed juggling to demonstrate their method. In this work, we want to generalize the juggling skill and create a real-time simulation using machine learning. One reason to choose this skill is that studying the ability to toss and catch balls and rings provides insight into human coordination, robotics, and mathematics, as written in the article Science of Juggling [2]. Juggling is therefore a good challenge for realistic physics-based simulation, both to improve our knowledge of these fields and to help jugglers evaluate the feasibility of their tricks. To do so, we have to understand the different notations used in juggling and apply the mathematical theory of juggling to reproduce it. In this thesis, we find an approach to learn juggling. We first remove the need to synchronize both hands by dividing our character in two. Then we divide juggling into two subtasks, catching and throwing a ball, and present a deep reinforcement learning method for each. Finally, we use these tasks sequentially on both sides of the body to recreate the whole juggling process. As a result, our character learns to catch all balls randomly thrown to it and to throw them at the desired velocity. After combining both subtasks, our juggler is able to react accurately and with enough speed and power to juggle up to 6 balls, even with external forces applied to it.

Contents:
I. Introduction
II. Juggling theory
    2.1 Notation and Parameters
    2.2 Juggling patterns
III. Approach to learn juggling
    3.1 Juggling sequence
    3.2 Reinforcement learning
        3.2.1 Definition
        3.2.2 Advantages
    3.3 Rewards for Juggling
        3.3.1 Catching
        3.3.2 Throwing
IV. Experiments and Results
    4.1 Experiments
        4.1.1 States
        4.1.2 Actions
        4.1.3 Environment of our Simulation
    4.2 Subtasks results
        4.2.1 Throwing
        4.2.2 Catching
    4.3 Performing juggling
        4.3.1 Results
        4.3.2 Add new ball while juggling
V. Toward a 3D juggling
    5.1 Catching
    5.2 Throwing
    5.3 Results
VI. Discussion and Conclusion
References
Acknowledgements

SNU Open Repository and Archive