Sample Efficient Optimization for Learning Controllers for Bipedal Locomotion
Learning policies for bipedal locomotion can be difficult, as experiments are
expensive and simulation does not usually transfer well to hardware. To counter
this, we need algorithms that are sample-efficient and inherently safe.
Bayesian Optimization is a powerful sample-efficient tool for optimizing
non-convex black-box functions. However, its performance can degrade in higher
dimensions. We develop a distance metric for bipedal locomotion that enhances
the sample-efficiency of Bayesian Optimization and use it to train a 16
dimensional neuromuscular model for planar walking. This distance metric
reflects some basic gait features of healthy walking and helps us quickly
eliminate a majority of unstable controllers. With our approach we can learn
policies for walking in less than 100 trials for a range of challenging
settings. In simulation, we show results on two different costs and on various
terrains including rough ground and ramps, sloping upwards and downwards. We
also perturb our models with unknown inertial disturbances analogous to
differences between simulation and hardware. These results are promising, as
they indicate that this method can potentially be used to learn control
policies on hardware.
Comment: To appear in International Conference on Humanoid Robots (Humanoids
2016), IEEE-RAS. (Rika Antonova and Akshara Rai contributed equally.)
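The abstract above describes a distance metric built from gait features rather than from raw controller parameters. A minimal sketch of that general idea follows; it is not the authors' metric. The function `gait_features` is a hypothetical stand-in (a real version would presumably summarize a short simulated rollout), and the squared-exponential form and `length_scale` are assumptions chosen for illustration.

```python
import numpy as np

def gait_features(params):
    """Hypothetical feature map from controller parameters to gait features.

    Placeholder only: a real feature map would be computed from the
    behavior of the controller, e.g. a short simulation.
    """
    params = np.asarray(params, dtype=float)
    return np.array([np.tanh(params.sum()), np.abs(params).mean()])

def feature_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential kernel on distances in feature space.

    Two controllers with different parameters but similar gait features
    are treated as similar, which is the point of a behavior-based metric.
    """
    d = gait_features(x1) - gait_features(x2)
    return float(np.exp(-0.5 * np.dot(d, d) / length_scale ** 2))
```

With this placeholder map, the parameter vectors `[1.0, -1.0]` and `[-1.0, 1.0]` are far apart in parameter space but share the same features, so the kernel treats them as identical.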
Continuous-Time Reinforcement Learning: New Design Algorithms with Theoretical Insights and Performance Guarantees
Continuous-time nonlinear optimal control problems hold great promise in
real-world applications. After decades of development, reinforcement learning
(RL) has achieved some of the greatest successes as a general nonlinear control
design method. However, a recent comprehensive analysis of state-of-the-art
continuous-time RL (CT-RL) methods, namely, adaptive dynamic programming
(ADP)-based CT-RL algorithms, reveals they face significant design challenges
due to their complexity, numerical conditioning, and dimensional scaling
issues. Despite advanced theoretical results, existing ADP CT-RL synthesis
methods are inadequate in solving even small, academic problems. The goal of
this work is thus to introduce a suite of new CT-RL algorithms for control of
affine nonlinear systems. Our design approach relies on two important factors.
First, our methods are applicable to physical systems that can be partitioned
into smaller subproblems. This constructive consideration results in reduced
dimensionality and greatly improved intuitiveness of design. Second, we
introduce a new excitation framework to improve persistence of excitation (PE)
and numerical conditioning performance via classical input/output insights.
Such a design-centric approach is the first of its kind in the ADP CT-RL
community. In this paper, we progressively introduce a suite of (decentralized)
excitable integral reinforcement learning (EIRL) algorithms. We provide
convergence and closed-loop stability guarantees, and we demonstrate these
guarantees on a significant application problem of controlling an unstable,
nonminimum phase hypersonic vehicle (HSV)
Bayesian Optimization for Learning Gaits under Uncertainty
© 2015, Springer International Publishing Switzerland. Designing gaits and corresponding control policies is a key challenge in robot locomotion. Even with a viable controller parametrization, finding near-optimal parameters can be daunting. Typically, this kind of parameter optimization requires specific expert knowledge and extensive robot experiments. Automatic black-box gait optimization methods greatly reduce the need for human expertise and time-consuming design processes. Many different approaches for automatic gait optimization have been suggested to date. However, no extensive comparison among them has yet been performed. In this article, we thoroughly discuss multiple automatic optimization methods in the context of gait optimization. We extensively evaluate Bayesian optimization, a model-based approach to black-box optimization under uncertainty, on both simulated problems and real robots. This evaluation demonstrates that Bayesian optimization is particularly suited for robotic applications, where it is crucial to find a good set of gait parameters in a small number of experiments
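For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of a generic Bayesian optimization loop: a Gaussian-process surrogate with a squared-exponential kernel and the expected-improvement acquisition function, minimizing a cost over a fixed grid of candidates. It is not the setup evaluated in the article. The 1-D `toy_cost`, the kernel `length_scale`, the `noise` term, and the candidate grid are all assumptions chosen for illustration.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(k, k_s)           # K^{-1} K_s
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_s * solved, axis=0)   # prior variance is 1
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over `best` when minimizing."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def toy_cost(x):
    """Assumed stand-in for an expensive experiment (e.g. one gait trial)."""
    return (x - 0.3) ** 2 + 0.1 * np.sin(15.0 * x)

def bayes_opt(cost, n_init=3, n_iter=15, seed=0):
    """Run the loop; returns best input, best cost, and number of evaluations."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)
    x = rng.uniform(0.0, 1.0, size=n_init)
    y = cost(x)
    for _ in range(n_iter):
        y_centered = y - y.mean()              # match the zero-mean GP prior
        mean, std = gp_posterior(x, y_centered, candidates)
        ei = expected_improvement(mean, std, y_centered.min())
        x_next = candidates[np.argmax(ei)]     # most promising candidate
        x = np.append(x, x_next)
        y = np.append(y, cost(x_next))
    best = np.argmin(y)
    return x[best], y[best], len(y)
```

Calling `bayes_opt(toy_cost)` spends 18 evaluations in total (3 random initial points plus 15 chosen by expected improvement), which mirrors the small-experiment-budget setting the abstract emphasizes.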