Deep Kernels for Optimizing Locomotion Controllers
Sample efficiency is important when optimizing parameters of locomotion
controllers, since hardware experiments are time consuming and expensive.
Bayesian Optimization, a sample-efficient optimization framework, has recently
been widely applied to address this problem, but further improvements in sample
efficiency are needed for practical applicability to real-world robots and
high-dimensional controllers. To address this, prior work has proposed using
domain expertise for constructing custom distance metrics for locomotion. In
this work we show how to learn such a distance metric automatically. We use a
neural network to learn an informed distance metric from data obtained in
high-fidelity simulations. We conduct experiments on two different controllers
and robot architectures. First, we demonstrate improvement in sample efficiency
when optimizing a 5-dimensional controller on the ATRIAS robot hardware. We
then conduct simulation experiments to optimize a 16-dimensional controller for
a 7-link robot model and obtain significant improvements even when optimizing
in perturbed environments. This demonstrates that our approach enhances sample
efficiency for two different controllers, making it a promising candidate for
further hardware experiments in the future.
Comment: (Rika Antonova and Akshara Rai contributed equally.)
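A minimal sketch of the learned-metric idea above: a Gaussian-process surrogate whose kernel measures distance in the output space of a neural network, used inside a simple Bayesian-optimization loop. The random feature map `phi` and the toy `objective` are illustrative stand-ins for the network trained on simulation data and the real locomotion cost; none of the constants come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny fixed random network standing in for one trained in simulation.
W1, W2 = rng.normal(size=(8, 2)), rng.normal(size=(4, 8))

def phi(X):
    """Feature map: two tanh layers (stand-in for a trained network)."""
    return np.tanh(np.tanh(X @ W1.T) @ W2.T)

def kernel(A, B, length=0.5):
    """Squared-exponential kernel on the feature-space distance."""
    FA, FB = phi(A), phi(B)
    d2 = ((FA[:, None, :] - FB[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """GP posterior mean and variance at query points Xs."""
    K = kernel(X, X) + noise * np.eye(len(X))
    Ks = kernel(X, Xs)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ y
    var = np.diag(kernel(Xs, Xs)) - np.einsum('ij,ij->j', Ks, sol)
    return mu, np.maximum(var, 1e-12)

def objective(X):
    """Toy stand-in for an expensive locomotion cost (to maximize)."""
    return np.sin(3 * X[:, 0]) + 0.5 * np.cos(2 * X[:, 1])

# BO loop: evaluate the candidate maximizing an upper confidence bound.
cand = rng.uniform(-1, 1, size=(200, 2))
X = rng.uniform(-1, 1, size=(3, 2))
y = objective(X)
for _ in range(10):
    mu, var = gp_posterior(X, y, cand)
    x_next = cand[np.argmax(mu + 2.0 * np.sqrt(var))]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[None, :]))

best = y.max()
```

The sample-efficiency gain comes from the kernel: two controllers that the network maps to nearby features share information in the surrogate even when their raw parameters are far apart.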
Sample Efficient Optimization for Learning Controllers for Bipedal Locomotion
Learning policies for bipedal locomotion can be difficult, as experiments are
expensive and simulation does not usually transfer well to hardware. To counter
this, we need algorithms that are sample efficient and inherently safe.
Bayesian Optimization is a powerful sample-efficient tool for optimizing
non-convex black-box functions. However, its performance can degrade in higher
dimensions. We develop a distance metric for bipedal locomotion that enhances
the sample-efficiency of Bayesian Optimization and use it to train a
16-dimensional neuromuscular model for planar walking. This distance metric
reflects some basic gait features of healthy walking and helps us quickly
eliminate a majority of unstable controllers. With our approach we can learn
policies for walking in less than 100 trials for a range of challenging
settings. In simulation, we show results on two different costs and on various
terrains including rough ground and ramps, sloping upwards and downwards. We
also perturb our models with unknown inertial disturbances, analogous to
differences between simulation and hardware. These results are promising, as
they indicate that this method can potentially be used to learn control
policies on hardware.
Comment: To appear in the International Conference on Humanoid Robots (Humanoids 2016), IEEE-RAS. (Rika Antonova and Akshara Rai contributed equally.)
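The hand-designed distance metric described above can be sketched as a kernel that compares controllers through a scalar gait feature rather than through their raw parameters. Both `gait_feature` and the constants here are toy stand-ins, not the paper's actual metric:

```python
import numpy as np

def gait_feature(x):
    """Toy scalar gait score for a controller x (hypothetical stand-in
    for a hand-designed gait feature): near 0 for controllers that fall
    immediately, near 1 for ones that keep walking."""
    return 1.0 / (1.0 + np.exp(-4.0 * (np.sum(x) - 1.0)))

def feature_kernel(xa, xb, length=0.2):
    """Squared-exponential kernel on the 1-D feature distance, so the
    surrogate compares controllers by gait quality, not by raw
    parameter distance."""
    d = gait_feature(xa) - gait_feature(xb)
    return float(np.exp(-0.5 * (d / length) ** 2))

# Two unstable controllers far apart in parameter space still look
# almost identical through the feature, so one failed trial informs both.
k_bad_bad = feature_kernel(np.array([-2.0, -2.0]), np.array([-3.0, 0.0]))
k_bad_good = feature_kernel(np.array([-2.0, -2.0]), np.array([2.0, 2.0]))
```

Under such a kernel a single failed rollout lowers the surrogate's prediction for a whole region of unstable controllers, which is how a gait-informed metric can quickly eliminate a majority of them.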
Learning Feedback Terms for Reactive Planning and Control
With the advancement of robotics, machine learning, and machine perception,
increasingly more robots will enter human environments to assist with daily
tasks. However, dynamically changing human environments require reactive
motion plans. Reactivity can be accomplished through replanning, e.g.
model-predictive control, or through a reactive feedback policy that modifies
on-going behavior in response to sensory events. In this paper, we investigate
how to use machine learning to add reactivity to a previously learned nominal
skilled behavior. We approach this by learning a reactive modification term for
movement plans represented by nonlinear differential equations. In particular,
we use dynamic movement primitives (DMPs) to represent a skill and a neural
network to learn a reactive policy from human demonstrations. We use the
well-explored domain of obstacle avoidance for robot manipulation as a test bed. Our
approach demonstrates how a neural network can be combined with physical
insights to ensure robust behavior across different obstacle settings and
movement durations. Evaluations on an anthropomorphic robotic system
demonstrate the effectiveness of our work.
Comment: 8 pages, accepted for publication at the ICRA 2017 conference.
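The reactive-modification idea can be sketched as a DMP transformation system with an additive coupling term injected into the acceleration, the slot where the learned policy acts. The hand-coded repulsion below stands in for the neural network, and all constants are illustrative assumptions, not the paper's values:

```python
import numpy as np

def integrate_dmp(y0, goal, obstacle=None, dt=0.01, T=1.0,
                  alpha=25.0, beta=6.25):
    """Integrate a 1-D DMP transformation system with Euler steps.

    The coupling term C is added to the acceleration. Here C is a
    hand-coded repulsion around `obstacle` (stand-in for a learned
    reactive policy); it decays far from the obstacle.
    """
    y, yd = float(y0), 0.0
    traj = [y]
    for _ in range(int(T / dt)):
        C = 0.0
        if obstacle is not None:
            d = y - obstacle
            C = 400.0 * d / (1.0 + (10.0 * d) ** 2)  # bounded repulsion
        # Critically damped spring toward the goal, plus the coupling term.
        ydd = alpha * (beta * (goal - y) - yd) + C
        yd += ydd * dt
        y += yd * dt
        traj.append(y)
    return np.array(traj)

plain = integrate_dmp(0.0, 1.0)                # nominal reach to the goal
avoid = integrate_dmp(0.0, 1.0, obstacle=0.5)  # same DMP, coupling active
```

Because the coupling term only perturbs the acceleration of an otherwise goal-convergent system, the nominal skill is preserved while the path deforms in response to the obstacle.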
Bayesian Optimization Using Domain Knowledge on the ATRIAS Biped
Controllers in robotics often consist of expert-designed heuristics, which
can be hard to tune in higher dimensions. It is typical to use simulation to
learn these parameters, but controllers learned in simulation often don't
transfer to hardware. This necessitates optimization directly on hardware.
However, collecting data on hardware can be expensive. This has led to a recent
interest in adapting data-efficient learning techniques to robotics. One
popular method is Bayesian Optimization (BO), a sample-efficient black-box
optimization scheme, but its performance typically degrades in higher
dimensions. We aim to overcome this problem by incorporating domain knowledge
to reduce dimensionality in a meaningful way, with a focus on bipedal
locomotion. In previous work, we proposed a transformation based on knowledge
of human walking that projected a 16-dimensional controller to a 1-dimensional
space. In simulation, this showed enhanced sample efficiency when optimizing
human-inspired neuromuscular walking controllers on a humanoid model. In this
paper, we present a generalized feature transform applicable to non-humanoid
robot morphologies and evaluate it on the ATRIAS bipedal robot -- in simulation
and on hardware. We present three different walking controllers; two are
evaluated on the real robot. Our results show that this feature transform
captures important aspects of walking and accelerates learning on hardware and
simulation, as compared to traditional BO.
Comment: 8 pages, submitted to the IEEE International Conference on Robotics and Automation 201