A Policy Adaptation Method for Implicit Multitask Reinforcement Learning Problems

Morimoto, Jun; Yamamori, Satoshi

Authors: Jun Morimoto, Satoshi Yamamori
Publication date: 31 August 2023

Abstract

In dynamic motion generation tasks that involve contact and collisions, small changes in policy parameters can lead to drastically different returns. For example, in soccer, a similar heading motion can send the ball in completely different directions when the hitting position or the force applied to the ball changes slightly, or when the friction of the ball varies. However, it is difficult to imagine that completely different skills are needed to head a ball in different directions. In this study, we propose a multitask reinforcement learning algorithm that adapts a policy to implicit changes in goals or environments within a single motion category, where tasks differ in their reward functions or in the physical parameters of the environment. We evaluated the proposed method on a ball heading task using a monopod robot model. The results showed that the proposed method can adapt to implicit changes in the goal positions or the coefficients of restitution of the ball, whereas the standard domain randomization approach cannot cope with different task settings.

Comment: 12 pages, 9 figures

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2308.16471

Last time updated on 10/09/2023