
79 research outputs found

Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning

Authors: Dong Fangyan; Hirota Kaoru; Kormushev Petar; Nomoto Kohei
Publication date: 03/04/2009

A mechanism called Eligibility Propagation is proposed to speed up the Time Hopping technique used for faster Reinforcement Learning in simulations. Eligibility Propagation gives Time Hopping abilities similar to those that eligibility traces provide for conventional Reinforcement Learning. It propagates values from one state to all of its temporal predecessors using a state transition graph. Experiments on a simulated biped crawling robot confirm that Eligibility Propagation accelerates the learning process by more than 3 times. Comment: 7 page

arXiv.org e-Print Archive

Crossref

Spiral - Imperial College Digital Repository
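
The Eligibility Propagation abstract above describes pushing values from one state back to all of its temporal predecessors through a state transition graph. A minimal sketch of that general idea follows. It is not the authors' algorithm: the function name `propagate`, the greedy max-style update, the discount factor `gamma`, and the `predecessors` dictionary layout are all assumptions made for illustration.

```python
from collections import deque


def propagate(values, predecessors, start, gamma=0.9, tol=1e-9):
    """Push the value of `start` backwards through a transition graph.

    values:       dict mapping state -> current value estimate (missing = 0.0)
    predecessors: dict mapping state -> list of (prev_state, reward) pairs,
                  one per recorded transition prev_state -> state
                  (hypothetical layout, deterministic transitions assumed)
    Returns the number of value updates performed.
    """
    updates = 0
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for prev, reward in predecessors.get(state, ()):
            # Assumed update rule: keep the best discounted value seen so far.
            candidate = reward + gamma * values.get(state, 0.0)
            if candidate > values.get(prev, 0.0) + tol:
                values[prev] = candidate
                updates += 1
                # The predecessor changed, so its own predecessors are revisited.
                queue.append(prev)
    return updates


# Usage: a three-state chain s0 -> s1 -> s2 with zero reward on each step.
chain = {"s2": [("s1", 0.0)], "s1": [("s0", 0.0)]}
chain_values = {"s2": 1.0}
n_updates = propagate(chain_values, chain, "s2", gamma=0.5)
# chain_values is now {"s2": 1.0, "s1": 0.5, "s0": 0.25}; n_updates is 2
```

A single call updates every recorded predecessor of the starting state, which is the property the abstract compares to eligibility traces. The tolerance `tol` is there only so that the loop terminates on graphs with cycles.
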

Fast and Continuous Foothold Adaptation for Dynamic Locomotion through CNNs

Authors: Barasuol Victor; Caldwell Darwin G.; Camurri Marco; Focchi Michele; Franceschi Luca; Pontil Massimiliano; Semini Claudio; Villarreal Octavio
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 15/02/2019

Legged robots can outperform wheeled machines for most navigation tasks across unknown and rough terrains. For such tasks, visual feedback is a fundamental asset to provide robots with terrain-awareness. However, robust dynamic locomotion on difficult terrains with real-time performance guarantees remains a challenge. We present here a real-time, dynamic foothold adaptation strategy based on visual feedback. Our method adjusts the landing position of the feet in a fully reactive manner, using only on-board computers and sensors. The correction is computed and executed continuously along the swing phase trajectory of each leg. To efficiently adapt the landing position, we implement a self-supervised foothold classifier based on a Convolutional Neural Network (CNN). Our method computes up to 200 times faster than the full-blown heuristics. Our goal is to react to visual stimuli from the environment, bridging the gap between blind reactive locomotion and purely vision-based planning strategies. We assess the performance of our method on the dynamic quadruped robot HyQ, executing static and dynamic gaits (at speeds up to 0.5 m/s) in both simulated and real scenarios; the benefit of safe foothold adaptation is clearly demonstrated by the overall robot behavior. Comment: 9 pages, 11 figures. Accepted to RA-L + ICRA 2019, January 201

arXiv.org e-Print Archive

UCL Discovery

Learning Image-Conditioned Dynamics Models for Control of Under-actuated Legged Millirobots

Authors: Asmar Thomas; Fearing Ronald S.; Kahn Gregory; Levine Sergey; Nagabandi Anusha; Pandya Ravi; Yang Guangzhao
Publication date: 30/03/2018

Millirobots are a promising robotic platform for many applications due to their small size and low manufacturing costs. Legged millirobots, in particular, can provide increased mobility in complex environments and improved scaling of obstacles. However, controlling these small, highly dynamic, and underactuated legged systems is difficult. Hand-engineered controllers can sometimes control these legged millirobots, but they have difficulties with dynamic maneuvers and complex terrains. We present an approach for controlling a real-world legged millirobot that is based on learned neural network models. Using less than 17 minutes of data, our method can learn a predictive model of the robot's dynamics that can enable effective gaits to be synthesized on the fly for following user-specified waypoints on a given terrain. Furthermore, by leveraging expressive, high-capacity neural network models, our approach allows for these predictions to be directly conditioned on camera images, endowing the robot with the ability to predict how different terrains might affect its dynamics. This enables sample-efficient and effective learning for locomotion of a dynamic legged millirobot on various terrains, including gravel, turf, carpet, and styrofoam. Experiment videos can be found at https://sites.google.com/view/imageconddy

arXiv.org e-Print Archive

Crossref

Vision-based reinforcement learning using approximate policy iteration

Authors: Duckett Tom; Shaker Marwan; Yue Shigang
Publication date: 01/01/2009

A major issue for reinforcement learning (RL) applied to robotics is the time required to learn a new skill. While RL has been used to learn mobile robot control in many simulated domains, applications involving learning on real robots are still relatively rare. In this paper, the Least-Squares Policy Iteration (LSPI) reinforcement learning algorithm and a new model-based algorithm, Least-Squares Policy Iteration with Prioritized Sweeping (LSPI+), are implemented on a mobile robot to acquire new skills quickly and efficiently. LSPI+ combines the benefits of LSPI and prioritized sweeping, which uses all previous experience to focus the computational effort on the most “interesting” or dynamic parts of the state space. The proposed algorithms are tested on a household vacuum cleaner robot for learning a docking task using vision as the only sensor modality. In experiments, these algorithms are compared to other model-based and model-free RL algorithms. The results show that the number of trials required to learn the docking task is significantly reduced using LSPI compared to the other RL algorithms investigated, and that LSPI+ further improves on the performance of LSPI.

University of Lincoln Institutional Repository

CiteSeerX

Fault Tolerant Free Gait and Footstep Planning for Hexapod Robot Based on Monte-Carlo Tree

Authors: Ding Liang; Gao Haibo; Gong Zhaopei; Liu Guangjun; Wang Zhikai; Xu Peng; Zhou Ruyi
Publication date: 16/06/2020

Legged robots can pass through complex field environments by carefully selecting gaits and discrete footholds. Traditional methods plan gait and foothold separately and treat each as a single-step optimization. However, this leads to poor passability in environments with sparse footholds. This paper proposes a novel coordinated planning method for hexapod robots that treats the planning of gait and foothold as a sequence optimization problem and handles the harshness of the environment as a leg fault. The Monte Carlo tree search (MCTS) algorithm is used to optimize the entire sequence. Two methods, FastMCTS and SlidingMCTS, are proposed to address some shortcomings of standard MCTS when applied to legged robot planning. The proposed planning algorithm incorporates the fault-tolerant gait method to improve passability. Finally, experiments on terrains with different foothold densities and on artificially designed challenging terrain are carried out to verify the proposed methods against other planning methods. All results show that the proposed method dramatically improves the hexapod robot's ability to pass through environments with sparse footholds.

arXiv.org e-Print Archive