A Dynamical System-based Approach to Modeling Stable Robot Control Policies via Imitation Learning
Despite tremendous advances in robotics, we are still amazed by the proficiency with which humans perform movements. Even new waves of robotic systems still rely heavily on hardcoded motions with a limited ability to react autonomously and robustly to a dynamically changing environment. This thesis focuses on providing possible mechanisms to push the level of adaptivity, reactivity, and robustness of robotic systems closer to human movements. Specifically, it aims at developing these mechanisms for a subclass of robot motions called “reaching movements”, i.e. movements in space stopping at a given target (also referred to as episodic motions, discrete motions, or point-to-point motions). These reaching movements can then be used as building blocks to form more advanced robot tasks. To achieve a high level of proficiency as described above, this thesis particularly seeks to derive control policies that: 1) resemble human motions, 2) guarantee the accomplishment of the task (if the target is reachable), and 3) can instantly adapt to changes in dynamic environments. To avoid manually hardcoding robot motions, this thesis exploits the power of machine learning techniques and takes an Imitation Learning (IL) approach to build a generic model of robot movements from a few examples provided by an expert. To achieve the required level of robustness and reactivity, the perspective adopted in this thesis is that a reaching movement can be described with a nonlinear Dynamical System (DS). When building an estimate of a DS from demonstrations, there are two key problems that need to be addressed: the problem of generating motions that resemble the demonstrations as closely as possible (the “how-to-imitate” problem), and most importantly, the problem of ensuring the accomplishment of the task, i.e. reaching the target (the “stability” problem).
Although there are numerous well-established approaches in robotics that could answer each of these problems separately, tackling both problems simultaneously is challenging and has not been extensively studied yet. This thesis first tackles the problem mentioned above by introducing an iterative method to build an estimate of autonomous nonlinear DS that are formulated as a mixture of Gaussian functions. This method minimizes the number of Gaussian functions required for achieving both local asymptotic stability at the target and accuracy in following demonstrations. We then extend this formulation and provide sufficient conditions to ensure global asymptotic stability of autonomous DS at the target. In this approach, an estimate of the underlying DS is built by solving a constrained optimization problem, where the metric of accuracy and the stability conditions are formulated as the optimization objective and constraints, respectively. In addition to ensuring convergence of all motions to the target within the local or global stability regions, these approaches offer an inherent adaptability and robustness to changes in dynamic environments. This thesis further extends the previous approaches and ensures global asymptotic stability of DS-based motions at the target independently of the choice of the regression technique. Therefore, it offers the possibility to choose the most appropriate regression technique based on the requirements of the task at hand without compromising DS stability. This approach also provides the possibility of online learning and of using a combination of two or more regression methods to model more advanced robot tasks, and can be applied to estimate motions that are represented with both autonomous and non-autonomous DS.
Additionally, this thesis suggests a reformulation of robot motion modeling that allows encoding of a considerably wider set of tasks, ranging from reaching movements to agile robot movements that require hitting a given target with a specific speed and direction. This approach is validated on the challenging task of playing minigolf. Finally, the last part of this thesis proposes a DS-based approach to realtime obstacle avoidance. The presented approach provides a modulation that instantly modifies the robot’s motion to avoid collision with multiple static and moving convex obstacles. This approach can be applied to all the techniques described above without affecting their adaptability, swiftness, or robustness. The techniques that are developed in this thesis have been validated in simulation and on different robotic platforms including the humanoid robots HOAP-3 and iCub, and the robot arms KATANA, WAM, and LWR. Throughout this thesis we show that the DS-based approach to modeling robot discrete movements can offer a high level of adaptability, reactivity, and robustness almost effortlessly when interacting with dynamic environments.
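The abstract above describes the obstacle-avoidance modulation only in words. The following is a minimal 2D sketch of one way such a modulation can be written, under assumptions that are not taken from the abstract: a single static circular obstacle, a distance function Γ(x) = ‖x − c‖² / r², a modulation matrix M = E D Eᵀ with D = diag(1 − 1/Γ, 1 + 1/Γ), and a simple linear nominal DS. The function names `modulation` and `simulate` are hypothetical.

```python
import numpy as np

def modulation(x, center, radius):
    """Assumed modulation matrix M(x) = E D E^T for one circular obstacle (2D)."""
    r = x - center
    gamma = (r @ r) / radius**2            # equals 1 on the boundary, > 1 outside
    n = r / np.linalg.norm(r)              # unit normal pointing away from the obstacle
    t = np.array([-n[1], n[0]])            # unit tangent
    E = np.column_stack([n, t])
    # Shrink the normal velocity component near the obstacle, boost the tangential one.
    D = np.diag([1.0 - 1.0 / gamma, 1.0 + 1.0 / gamma])
    return E @ D @ E.T

def simulate(x0, target, center, radius, modulate=True, dt=0.01, steps=4000):
    """Forward-Euler rollout of x_dot = M(x) f(x) with nominal DS f(x) = -(x - target)."""
    x = np.asarray(x0, dtype=float)
    min_dist = np.linalg.norm(x - center)  # closest approach to the obstacle center
    for _ in range(steps):
        v = -(x - target)
        if modulate:
            v = modulation(x, center, radius) @ v
        x = x + dt * v
        min_dist = min(min_dist, np.linalg.norm(x - center))
    return x, min_dist

# Illustrative values: the straight path from start to target crosses the obstacle.
start = [-2.0, 0.5]
target = np.array([0.0, 0.0])
center = np.array([-1.0, 0.0])
radius = 0.5

x_mod, d_mod = simulate(start, target, center, radius, modulate=True)
x_nom, d_nom = simulate(start, target, center, radius, modulate=False)
```

In this sketch the modulated rollout slides around the circle and still ends at the target, because the modulation leaves the equilibrium of the nominal DS unchanged; the unmodulated rollout cuts through the obstacle.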
Learning Stable Non-Linear Dynamical Systems with Gaussian Mixture Models
This paper presents a method for learning discrete robot motions from a set of demonstrations. We model a motion as a nonlinear autonomous (i.e. time-invariant) Dynamical System (DS), and define sufficient conditions to ensure global asymptotic stability at the target. We propose a learning method, called Stable Estimator of Dynamical Systems (SEDS), to learn the parameters of the DS to ensure that all motions closely follow the demonstrations while ultimately reaching and stopping at the target. Time-invariance and global asymptotic stability at the target ensure that the system can respond immediately and appropriately to perturbations encountered during the motion. The method is evaluated through a set of robot experiments and on a library of human handwriting motions.
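The abstract does not state the sufficient conditions themselves. As a rough illustration, the sketch below assumes a DS of the form ẋ = Σₖ hₖ(x) Aₖ (x − x*), with normalized Gaussian weights hₖ, and assumes the condition that the symmetric part of every Aₖ is negative definite. Under that assumption ‖x − x*‖² decreases along every trajectory. The parameters are hand-picked rather than learned, and all function names are hypothetical.

```python
import numpy as np

def mixing_weights(x, priors, means, covs):
    """Normalized Gaussian weights h_k(x); they are non-negative and sum to 1."""
    d = x.shape[0]
    w = np.empty(len(priors))
    for k, (p, mu, S) in enumerate(zip(priors, means, covs)):
        diff = x - mu
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(S))
        w[k] = p * np.exp(-0.5 * diff @ np.linalg.solve(S, diff)) / norm
    total = w.sum()
    if total <= 0.0:  # numerical underflow far from every component
        return np.full(len(priors), 1.0 / len(priors))
    return w / total

def velocity(x, target, priors, means, covs, A):
    """x_dot = sum_k h_k(x) * A_k @ (x - target)."""
    h = mixing_weights(x, priors, means, covs)
    return sum(h_k * (A_k @ (x - target)) for h_k, A_k in zip(h, A))

def satisfies_stability_condition(A):
    """True if the symmetric part of every A_k is negative definite (assumed condition)."""
    return all(np.linalg.eigvalsh(A_k + A_k.T).max() < 0.0 for A_k in A)

def rollout(x0, target, priors, means, covs, A, dt=0.01, steps=3000):
    """Forward-Euler integration of the DS starting from x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * velocity(x, target, priors, means, covs, A)
    return x

# Hand-picked (not learned) parameters for a 2D example.
target = np.array([0.5, -0.2])
priors = np.array([0.5, 0.5])
means = [np.array([1.5, 1.0]), np.array([-1.0, 0.5])]
covs = [0.5 * np.eye(2), 0.5 * np.eye(2)]
A = [np.array([[-1.0, 2.0], [-2.0, -1.0]]),
     np.array([[-2.0, 0.0], [1.0, -1.0]])]

end_point = rollout([2.0, -1.5], target, priors, means, covs, A)
```

A learning method in this spirit would fit the parameters to demonstrations while enforcing the condition as a constraint; here the check is only applied after the fact to illustrate what the constraint buys.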
BM: An Iterative Method to Learn Stable Non-Linear Dynamical Systems with Gaussian Mixture Models
We model the dynamics of non-linear discrete (i.e. point-to-point) robot motions as a time-independent system described by an autonomous dynamical system (DS). We propose an iterative algorithm to estimate the form of the DS through a mixture of Gaussian distributions. We prove that the resulting model is asymptotically stable at the target. We validate the accuracy of the model on a library of 2D human motions and use it to learn a control policy from human demonstrations for two multi-degree-of-freedom robots. We show the real-time adaptation of the learned model to perturbations when controlling the two kinematically-driven robots.
Learning and Control of UAV maneuvers Based on Demonstrations
Many maneuvers of Unmanned Aerial Vehicles (UAV) can be considered within a framework of trajectory following. Though the details differ from one application to another, all applications share the same problem of finding an optimal path (or signal) to perform the specified task. Finding this optimal trajectory is a challenging issue since it depends both on having an accurate mathematical model of the UAV and on designing the desired trajectory based on this dynamical model. The former is usually tackled by estimating the nonlinear model with a locally linear model around the desired states, while the latter is mostly relaxed by defining the desired path with polynomial or spline curves, and then designing a controller to follow it. However, there still remain some ambiguities about the accuracy and performance of the result. In response to these concerns, statistical modeling approaches have proved to be interesting alternatives to classical control and planning approaches for modeling the intrinsic dynamics of the robot's body when it cannot be well estimated. Furthermore, within the framework of programming by demonstration (PbD), statistical methods have been proposed as a means to learn a generic trajectory across sets of demonstrations. In this work, we implemented an algorithm based on PbD both to estimate the dynamics of the UAV and to infer the underlying maneuver. The main advantage of the proposed algorithm over our previous works lies in the fact that, with this modeling approach, the effect of the robot's dynamics is taken into account.
Learning to Control Planar Hitting Motions in a Minigolf-like Task
A current trend in robotics is to define robot tasks using a combination of superimposed motion patterns. For maximum versatility of such motion patterns, they should be easily and efficiently adaptable for situations beyond those for which the motion was originally designed. In this work, we show how a challenging minigolf-like task can be efficiently learned by the robot using a basic hitting motion model and a task-specific adaptation of the hitting parameters: hitting speed and hitting angle. We propose an approach to learn the hitting parameters for a minigolf field using a set of provided examples. This is a non-trivial problem since the successful choice of hitting parameters generally represents a highly non-linear, multi-valued map from the situation-representation to the hitting parameters. We show that by limiting the problem to learning one combination of hitting parameters for each input, a high-performance model of the hitting parameters can be learned using only a small set of training data. We compare two statistical methods, Gaussian Process Regression (GPR) and Gaussian Mixture Regression (GMR), in the context of inferring hitting parameters for the minigolf task. We validate our approach on the 7-degree-of-freedom Barrett WAM robotic arm in both a simulated and a real environment.
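For readers unfamiliar with GMR, the sketch below shows the standard conditional-mean computation for a Gaussian mixture over a joint (input, output) pair with scalar input and output. It is only an illustration of the regression step, not the model used in the paper: the mixture parameters are hand-picked, and reading the input as a "situation" and the output as a "hitting speed" is an assumption made for the example.

```python
import numpy as np

def gmr_predict(x, priors, means, covs):
    """Conditional mean E[y | x] of a GMM over the joint vector (x, y), scalar x and y."""
    weights, cond_means = [], []
    for p, mu, S in zip(priors, means, covs):
        mu_x, mu_y = mu
        s_xx, s_yx = S[0, 0], S[1, 0]
        # Marginal likelihood of the input under component k, scaled by its prior.
        lik = np.exp(-0.5 * (x - mu_x) ** 2 / s_xx) / np.sqrt(2.0 * np.pi * s_xx)
        weights.append(p * lik)
        # Per-component linear regression: mu_y + S_yx / S_xx * (x - mu_x).
        cond_means.append(mu_y + s_yx / s_xx * (x - mu_x))
    weights = np.array(weights)
    weights /= weights.sum()
    return float(weights @ np.array(cond_means))

# Hypothetical two-component model: input = situation, output = hitting speed.
priors = [0.5, 0.5]
means = [np.array([-1.0, 0.0]), np.array([1.0, 2.0])]
covs = [np.eye(2), np.eye(2)]

speed_at_midpoint = gmr_predict(0.0, priors, means, covs)
```

With a single component the prediction reduces to ordinary linear regression on the joint Gaussian; with several components the prediction blends the local linear models, which is why a multi-valued map has to be restricted to one output per input before fitting.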
Open-source benchmarking for learned reaching motion generation in robotics
Lemme A, Meirovitch Y, Khansari-Zadeh SM, Flash T, Billard A, Steil JJ. Open-source benchmarking for learned reaching motion generation in robotics. Paladyn, Journal of Behavioral Robotics. 2015;6(1):30-41
Multi-criteria benchmarking of movement generating dynamical systems for learning-from-demonstrations
Lemme A, Meirovitch Y, Khansari-Zadeh SM, Flash T, Billard A, Steil JJ. Multi-criteria benchmarking of movement generating dynamical systems for learning-from-demonstrations. Bielefeld University; 2014. This MATLAB benchmark framework was developed to compare different methods for generating goal-directed trajectories and extract their specificities, strengths and weaknesses. It allows each user to configure different perturbations which can occur during a movement execution and prepare their models for the given task before a baseline parameter set is used to create comparable results.
For more information, please refer to the entry in CITEC's Cognitive Interaction Toolkit Catalogue (CITK):
http://toolkit.cit-ec.uni-bielefeld.de/datasets/amarsi-benchmark-framework
Software updates and support:
https://opensource.cit-ec.de/projects/amars