Search CORE

25 research outputs found

Smooth Imitation Learning for Online Sequence Prediction

Author: Carr Peter
Kang Andrew
Le Hoang M.
Yue Yisong
Publication venue: PMLR
Publication date: 19/06/2016
Field of study

We study the problem of smooth imitation learning for online sequence prediction, where the goal is to train a policy that can smoothly imitate demonstrated behavior in a dynamic and continuous environment in response to online, sequential context input. Since the mapping from context to behavior is often complex, we take a learning reduction approach to reduce smooth imitation learning to a regression problem using complex function classes that are regularized to ensure smoothness. We present a learning meta-algorithm that achieves fast and stable convergence to a good policy. Our approach enjoys several attractive properties, including being fully deterministic, employing an adaptive learning rate that can provably yield larger policy improvements compared to previous approaches, and the ability to ensure stable convergence. Our empirical results demonstrate significant performance gains over previous approaches

Episodic Learning with Control Lyapunov Functions for Uncertain Robotic Systems

Author: Ames Aaron D.
Dorobantu Victor D.
Le Hoang M.
Taylor Andrew J.
Yue Yisong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/03/2019
Field of study

Many modern nonlinear control methods aim to endow systems with guaranteed properties, such as stability or safety, and have been successfully applied to the domain of robotics. However, model uncertainty remains a persistent challenge, weakening theoretical guarantees and causing implementation failures on physical systems. This paper develops a machine learning framework centered around Control Lyapunov Functions (CLFs) to adapt to parametric uncertainty and unmodeled dynamics in general robotic systems. Our proposed method proceeds by iteratively updating estimates of Lyapunov function derivatives and improving controllers, ultimately yielding a stabilizing quadratic program model-based controller. We validate our approach on a planar Segway simulation, demonstrating substantial performance improvements by iteratively refining on a base model-free controller

arXiv.org e-Print Archive

Crossref

Caltech Authors

Coordinated Multi-Agent Imitation Learning

Author: Carr Peter
Le Hoang M.
Lucey Patrick
Yue Yisong
Publication venue
Publication date: 01/08/2017
Field of study

We study the problem of imitation learning from demonstrations of multiple coordinating agents. One key challenge in this setting is that learning a good model of coordination can be difficult, since coordination is often implicit in the demonstrations and must be inferred as a latent variable. We propose a joint approach that simultaneously learns a latent coordination model along with the individual policies. In particular, our method integrates unsupervised structure learning with conventional imitation learning. We illustrate the power of our approach on a difficult problem of learning multiple policies for fine-grained behavior modeling in team sports, where different players occupy different roles in the coordinated team strategy. We show that having a coordination model to infer the roles of players yields substantially improved imitation loss compared to conventional baselines.Comment: International Conference on Machine Learning 201

arXiv.org e-Print Archive

Caltech Authors

Control Regularization for Reduced Variance Reinforcement Learning

Author: Burdick Joel W.
Chaudhuri Swarat
Cheng Richard
Orosz Gabor
Verma Abhinav
Yue Yisong
Publication venue
Publication date: 13/05/2019
Field of study

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.Comment: Appearing in ICML 201

arXiv.org e-Print Archive

Caltech Authors

Neural-Swarm: Decentralized Close-Proximity Multirotor Control Using Learned Interactions

Author: Chung Soon-Jo
Hönig Wolfgang
Shi Guanya
Yue Yisong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/03/2020
Field of study

In this paper, we present Neural-Swarm, a nonlinear decentralized stable controller for close-proximity flight of multirotor swarms. Close-proximity control is challenging due to the complex aerodynamic interaction effects between multirotors, such as downwash from higher vehicles to lower ones. Conventional methods often fail to properly capture these interaction effects, resulting in controllers that must maintain large safety distances between vehicles, and thus are not capable of close-proximity flight. Our approach combines a nominal dynamics model with a regularized permutation-invariant Deep Neural Network (DNN) that accurately learns the high-order multi-vehicle interactions. We design a stable nonlinear tracking controller using the learned model. Experimental results demonstrate that the proposed controller significantly outperforms a baseline nonlinear tracking controller with up to four times smaller worst-case height tracking errors. We also empirically demonstrate the ability of our learned model to generalize to larger swarm sizes

arXiv.org e-Print Archive

Crossref

Caltech Authors