6 research outputs found
An implementation of vision based deep reinforcement learning for humanoid robot locomotion
Deep reinforcement learning (DRL) is a promising approach for controlling humanoid robot locomotion. However, readings from sensors such as the IMU, gyroscope, and GPS alone are not sufficient for robots to learn locomotion skills. In this article, we demonstrate the success of vision-based DRL: we propose a new vision-based deep reinforcement learning algorithm for the locomotion of the Robotis-OP2 humanoid robot for the first time. In the experimental setup, we construct a locomotion task for the humanoid robot in a specific environment in the Webots simulator. We use Double Dueling Q-Networks (D3QN) and Deep Q-Networks (DQN), two reinforcement learning algorithms, and present the performance of the vision-based DRL algorithms on a locomotion experiment. The experimental results show that D3QN outperforms DQN in locomotion stability and training speed, and that vision-based DRL algorithms can be successfully applied in other complex environments and applications.
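The abstract above names Double Dueling Q-Networks (D3QN) as the better-performing algorithm. A minimal sketch of the two ideas D3QN combines is shown below: the dueling decomposition Q(s, a) = V(s) + A(s, a) − mean A, and the Double DQN target in which the online network selects the next action while the target network evaluates it. The NumPy formulation, toy batch sizes, and values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dueling_q(value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Subtracting the mean advantage makes the V/A decomposition
    identifiable."""
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network picks argmax_a Q_online,
    the target network evaluates that action, reducing the
    overestimation bias of vanilla DQN."""
    best_action = np.argmax(q_online_next, axis=-1)
    next_q = np.take_along_axis(
        q_target_next, best_action[:, None], axis=-1
    ).squeeze(-1)
    return reward + gamma * (1.0 - done) * next_q

# Toy batch of 2 transitions with 3 discrete actions (hypothetical values).
adv = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
val = np.array([[5.0], [1.0]])
q = dueling_q(val, adv)  # Q-values centered around V(s)
tgt = double_dqn_target(
    reward=np.array([1.0, 0.5]),
    done=np.array([0.0, 1.0]),  # second transition is terminal
    gamma=0.99,
    q_online_next=q,
    q_target_next=q,
)
```

In the paper's setting the advantages and value would come from convolutional heads fed by the robot's camera image; the combination and target rules above are unchanged by that choice of encoder.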
Monte-Carlo Planning for Agile Legged Locomotion
Recent progress in legged locomotion research has produced robots that can perform agile blind-walking with robustness comparable to that of a blindfolded human. However, this walking approach has not yet been integrated with planners for high-level activities. In this paper, we take a step towards high-level task planning for these robots by studying a planar simulated biped that captures their essential dynamics. We investigate variants of Monte-Carlo Tree Search (MCTS) for selecting an appropriate blind-walking controller at each decision cycle. In particular, we consider UCT with an intelligently selected rollout policy, which is shown to be capable of guiding the biped through treacherous terrain. In addition, we develop a new MCTS variant, called Monte-Carlo Discrepancy Search (MCDS), which is shown to make more effective use of limited planning time than UCT for this domain. We demonstrate the effectiveness of these planners in both deterministic and stochastic environments across a range of algorithm parameters. In addition, we present results for using these planners to control a full-order 3D simulation of Cassie, an agile bipedal robot, through complex terrain.
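The UCT planner described above selects, at each decision cycle, the child node (here, a blind-walking controller) maximizing an upper confidence bound on its estimated value. A minimal sketch of that selection rule follows; the statistics, exploration constant, and controller set are illustrative assumptions, not details from the paper.

```python
import math

def uct_select(children, c=1.4):
    """UCB1 selection over child nodes.

    children: list of (mean_value, visit_count) pairs.
    Returns the index maximizing Q + c * sqrt(ln(N_parent) / n_child),
    where N_parent is the total visit count across children.
    """
    total = sum(n for _, n in children)

    def score(child):
        q, n = child
        if n == 0:
            return float("inf")  # unvisited children are expanded first
        return q + c * math.sqrt(math.log(total) / n)

    return max(range(len(children)), key=lambda i: score(children[i]))

# Three candidate blind-walking controllers with (mean return, visits).
stats = [(0.6, 10), (0.5, 2), (0.0, 0)]
best = uct_select(stats)  # the unvisited controller is selected first
```

Once every child has been visited, the rule trades off exploitation (high mean return) against exploration (low visit count): with the toy statistics above, dropping the unvisited controller makes UCT prefer the less-visited `(0.5, 2)` controller over the better-scoring but well-explored `(0.6, 10)` one.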