Search CORE

890 research outputs found

Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation

Author: Abdulsamad Hany
Peters Jan
Publication venue
Publication date: 01/01/2020
Field of study

The control of nonlinear dynamical systems remains a major challenge for autonomous agents. Current trends in reinforcement learning (RL) focus on complex representations of dynamics and policies, which have yielded impressive results in solving a variety of hard control tasks. However, this new sophistication and extremely over-parameterized models have come with the cost of an overall reduction in our ability to interpret the resulting policies. In this paper, we take inspiration from the control community and apply the principles of hybrid switching systems in order to break down complex dynamics into simpler components. We exploit the rich representational power of probabilistic graphical models and derive an expectation-maximization (EM) algorithm for learning a sequence model to capture the temporal structure of the data and automatically decompose nonlinear dynamics into stochastic switching linear dynamical systems. Moreover, we show how this framework of switching models enables extracting hierarchies of Markovian and auto-regressive locally linear controllers from nonlinear experts in an imitation learning scenario.Comment: 2nd Annual Conference on Learning for Dynamics and Contro

arXiv.org e-Print Archive

A Linearly Constrained Nonparametric Framework for Imitation Learning

Author: Caldwell DG
Huang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2020
Field of study

In recent years, a myriad of advanced results have been reported in the community of imitation learning, ranging from parametric to non-parametric, probabilistic to non-probabilistic and Bayesian to frequentist approaches. Meanwhile, ample applications (e.g., grasping tasks and humanrobot collaborations) further show the applicability of imitation learning in a wide range of domains. While numerous literature is dedicated to the learning of human skills in unconstrained environments, the problem of learning constrained motor skills, however, has not received equal attention. In fact, constrained skills exist widely in robotic systems. For instance, when a robot is demanded to write letters on a board, its end-effector trajectory must comply with the plane constraint from the board. In this paper, we propose linearly constrained kernelized movement primitives (LC-KMP) to tackle the problem of imitation learning with linear constraints. Specifically, we propose to exploit the probabilistic properties of multiple demonstrations, and subsequently incorporate them into a linearly constrained optimization problem, which finally leads to a non-parametric solution. In addition, a connection between our framework and the classical model predictive control is provided. Several examples including simulated writing and locomotion tasks are presented to show the effectiveness of our framework

Hierarchical Learning Approach for One-shot Action Imitation in Humanoid Robots

Author: Demiris Y
Wu Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

Spiral - Imperial College Digital Repository

Fast, Robust, and Versatile Event Detection through HMM Belief State Gradient Measures

Author: Duan Shuangda
Guan Yisheng
Lin Hongbin
Luo Shuangqi
Rojas Juan
Wu Hongmin
Publication venue
Publication date: 19/06/2018
Field of study

Event detection is a critical feature in data-driven systems as it assists with the identification of nominal and anomalous behavior. Event detection is increasingly relevant in robotics as robots operate with greater autonomy in increasingly unstructured environments. In this work, we present an accurate, robust, fast, and versatile measure for skill and anomaly identification. A theoretical proof establishes the link between the derivative of the log-likelihood of the HMM filtered belief state and the latest emission probabilities. The key insight is the inverse relationship in which gradient analysis is used for skill and anomaly identification. Our measure showed better performance across all metrics than related state-of-the art works. The result is broadly applicable to domains that use HMMs for event detection.Comment: 8 pages, 7 figures, double col, ieee conference forma

arXiv.org e-Print Archive