Learning Humanoid Robot Motions Through Deep Neural Networks
Controlling a humanoid robot with many degrees of freedom is acknowledged as
one of the hardest problems in Robotics. Due to the lack of mathematical
models, a frequently employed approach is to rely on human intuition to design
keyframe movements by hand, usually aided by graphical tools. In this paper, we
propose a learning framework based on neural networks to mimic humanoid robot
movements. The developed technique makes no assumption about the underlying
implementation of the movement, so both keyframe and model-based motions may be
learned. The framework was applied in the RoboCup 3D Soccer Simulation domain,
and promising results were obtained using the same network architecture for
several motions, even when copying motions from other teams.
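The imitation idea above can be sketched as supervised regression: a small network is trained to map a motion phase variable to joint angles recorded from an existing movement. The following is a minimal illustrative sketch, assuming a single joint, a toy sine-shaped trajectory, and a one-hidden-layer network; none of these details come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Recorded" movement to mimic: one joint angle as a function of the
# motion phase (a toy half-sine swing, purely illustrative).
phase = np.linspace(0.0, 1.0, 64).reshape(-1, 1)   # network input
target = np.sin(np.pi * phase)                      # joint angle to imitate

# One hidden layer with tanh activation (hypothetical architecture).
W1 = rng.normal(0, 0.5, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)

lr = 0.1
for _ in range(5000):
    h = np.tanh(phase @ W1 + b1)          # forward pass
    pred = h @ W2 + b2
    err = pred - target                    # gradient of mean-squared error
    gW2 = h.T @ err / len(phase); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)      # backprop through tanh
    gW1 = phase.T @ dh / len(phase); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((np.tanh(phase @ W1 + b1) @ W2 + b2 - target) ** 2))
print(f"final imitation MSE: {mse:.4f}")
```

Because the network only sees (phase, angle) pairs, it is agnostic to whether the demonstrated motion came from keyframes or from a model-based controller, which is the property the abstract highlights.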
Learning Humanoid Robot Running Skills through Proximal Policy Optimization
At the current stage of evolution of Soccer 3D, motion control is a key
factor in a team's performance. Recent works take advantage of model-free
approaches based on Machine Learning to exploit robot dynamics and obtain
faster locomotion skills, achieving running policies and thereby opening a new
research direction in the Soccer 3D environment.
In this work, we present a methodology based on Deep Reinforcement Learning
that learns running skills without any prior knowledge, using a neural network
whose inputs are related to the robot's dynamics. Our results outperform the
previous state-of-the-art sprint velocity reported in the Soccer 3D literature
by a significant margin. The method also improves sample efficiency, learning
how to run in just a few hours.
We report our results by analyzing the training procedure and evaluating the
learned policies in terms of speed, reliability, and human similarity. Finally,
we present the key factors that led us to improve on previous results and share
some ideas for future work.
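The core of Proximal Policy Optimization, the algorithm named in the title, is its clipped surrogate objective, which limits how far each update can move the policy. A minimal sketch of that objective follows; the batch values are toy numbers, not data from the paper.

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Negated clipped surrogate: -E[min(r*A, clip(r, 1-eps, 1+eps)*A)]."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -np.mean(np.minimum(unclipped, clipped))

# Toy batch: probability ratios pi_new/pi_old and advantage estimates.
ratio = np.array([0.9, 1.1, 1.5, 0.6])
adv = np.array([1.0, -0.5, 2.0, -1.0])

loss = ppo_clip_loss(ratio, adv)
print(loss)  # -0.4875
```

Clipping the ratio to [1-eps, 1+eps] removes the incentive to push the policy far from the one that collected the data, which is what makes large-batch locomotion training stable in practice.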
Bottom-Up Meta-Policy Search
Despite recent progress in agents that learn through interaction, several
challenges remain in terms of sample efficiency and generalization to behaviors
unseen during training. To mitigate these problems, we propose and apply a
first-order Meta-Learning algorithm called Bottom-Up Meta-Policy Search
(BUMPS), which uses a two-phase optimization procedure: first, in a
meta-training phase, it distills a few expert policies to create a meta-policy
capable of generalizing to tasks unseen during training; second, it applies a
fast adaptation strategy named Policy Filtering, which evaluates a few policies
sampled from the meta-policy distribution and selects the one that best solves
the task. We conducted all experiments in the RoboCup 3D Soccer Simulation
domain, in the context of kick motion learning. We show that, given our
experimental setup, BUMPS works in scenarios where simple multi-task
Reinforcement Learning does not. Finally, we performed experiments to evaluate
each component of the algorithm.
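The Policy Filtering phase described above can be sketched as: sample a few candidate policies from the meta-policy's distribution, roll each one out on the new task, and keep the one with the highest return. In this illustrative sketch the meta-policy is a Gaussian over policy parameters and the "rollout" is a toy quadratic reward; both are stand-in assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical meta-policy: a Gaussian distribution over policy parameters.
meta_mean = np.zeros(3)
meta_std = np.ones(3)

# Stand-in for an unseen task: reward is highest near these parameters.
task_target = np.array([0.5, -0.2, 0.8])

def rollout_return(params):
    """Toy task evaluation: closer to the target parameters -> higher return."""
    return -float(np.sum((params - task_target) ** 2))

# Policy Filtering (fast adaptation): evaluate K sampled policies, keep the best.
K = 8
candidates = rng.normal(meta_mean, meta_std, size=(K, 3))
returns = np.array([rollout_return(p) for p in candidates])
best = candidates[np.argmax(returns)]
print("best return:", returns.max())
```

Because adaptation reduces to K evaluations rather than gradient updates, this phase needs only a handful of rollouts on the unseen task, which is the sample-efficiency argument the abstract makes.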