    Online Discrimination of Nonlinear Dynamics with Switching Differential Equations

    How can we recognise whether an observed person walks or runs? We consider a dynamic environment where observations (e.g. the posture of a person) are caused by different dynamic processes (walking or running) which are active one at a time and which may transition from one to another at any time. For this setup, switching dynamic models have been suggested previously, mostly for linear and nonlinear dynamics in discrete time. Motivated by basic principles of computations in the brain (dynamic, internal models), we suggest a model for switching nonlinear differential equations. The switching process in the model is implemented by a Hopfield network, and we use parametric dynamic movement primitives to represent arbitrary rhythmic motions. The model generates observed dynamics by linearly interpolating the primitives weighted by the switching variables, and it is constructed such that standard filtering algorithms can be applied. In two experiments with synthetic planar motion and a human motion capture data set, we show that inference with the unscented Kalman filter can successfully discriminate several dynamic processes online.

    Learning Feedback Terms for Reactive Planning and Control

    With the advancement of robotics, machine learning, and machine perception, more and more robots will enter human environments to assist with daily tasks. However, dynamically changing human environments require reactive motion plans. Reactivity can be accomplished through replanning, e.g. model-predictive control, or through a reactive feedback policy that modifies ongoing behavior in response to sensory events. In this paper, we investigate how to use machine learning to add reactivity to a previously learned nominal skilled behavior. We approach this by learning a reactive modification term for movement plans represented by nonlinear differential equations. In particular, we use dynamic movement primitives (DMPs) to represent a skill and a neural network to learn a reactive policy from human demonstrations. We use the well-explored domain of obstacle avoidance for robot manipulation as a test bed. Our approach demonstrates how a neural network can be combined with physical insights to ensure robust behavior across different obstacle settings and movement durations. Evaluations on an anthropomorphic robotic system demonstrate the effectiveness of our work. Comment: 8 pages, accepted for publication at the ICRA 2017 conference.
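
    As a rough illustration of the idea in this abstract (a movement plan given by nonlinear differential equations, plus an additive reactive term), the following Python sketch Euler-integrates a one-dimensional discrete DMP in its commonly used spring-damper form. Several things here are assumptions and not taken from the paper: the learned forcing term is set to zero, the gains `alpha_z`, `beta_z`, and `alpha_x` are common illustrative values, the `coupling(x, y, z)` interface is invented for this example, and `example_coupling` is a hand-written placeholder standing in for the learned neural network policy.

```python
import math


def rollout_dmp(y0, g, coupling=None, tau=1.0, dt=0.001, duration=2.0,
                alpha_z=25.0, beta_z=6.25, alpha_x=3.0):
    """Euler-integrate a 1-D discrete DMP whose learned forcing term is zero.

    coupling(x, y, z) returns an additive acceleration term (assumed interface);
    None disables the reactive feedback.
    """
    y, z, x = y0, 0.0, 1.0  # position, scaled velocity, phase
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        c = coupling(x, y, z) if coupling is not None else 0.0
        # Transformation system: spring-damper pulling y toward the goal g,
        # with the coupling term added to the acceleration.
        dz = (alpha_z * (beta_z * (g - y) - z) + c) / tau
        dy = z / tau
        # Canonical system: the phase x decays from 1 toward 0.
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        trajectory.append(y)
    return trajectory


def example_coupling(x, y, z):
    """Hypothetical hand-written term: a phase-gated push active near y = 0.5."""
    return 200.0 * x * math.exp(-((y - 0.5) ** 2) / 0.02)


nominal = rollout_dmp(0.0, 1.0)
reactive = rollout_dmp(0.0, 1.0, coupling=example_coupling)
largest_deviation = max(abs(a - b) for a, b in zip(nominal, reactive))
```

    Because the placeholder term is multiplied by the decaying phase `x`, it fades out over the movement, so the modified trajectory still settles at the goal while differing from the nominal one along the way.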

    Rich periodic motor skills on humanoid robots: Riding the pedal racer

    Just like their discrete counterparts, periodic or rhythmic dynamic movement primitives allow easily modulated and robust motion generation, but for periodic tasks. In this paper we present an approach for modulating periodic dynamic movement primitives based on force feedback, allowing for rich motor behavior and skills. We propose and evaluate the combination of feedback and learned feed-forward terms to fully adapt the motions of a robot in order to achieve a desired force interaction with the environment. For the learning we employ the notion of repetitive control, which can effectively minimize the error of a behavior with respect to a given reference. To demonstrate the approach, we show results of simulated and real-world experiments on the compliant humanoid robot COMAN. We show initial results of using the approach to control a pedal racer, a demanding balance toy best described as a hybrid between a skateboard and a bicycle. © 2014 IEEE

    Effects of Robotic Knee Exoskeleton on Human Energy Expenditure

    Supervised Learning and Reinforcement Learning of Feedback Models for Reactive Behaviors: Tactile Feedback Testbed

    Robots need to be able to adapt to unexpected changes in the environment so that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if at all possible, making data-driven methods a promising alternative. In this paper we introduce a full framework for learning feedback models for reactive motion planning. Our pipeline starts by segmenting demonstrations of a complete task into motion primitives via a semi-automated segmentation algorithm. Then, given additional demonstrations of successful adaptation behaviors, we learn initial feedback models through learning from demonstrations. In the final phase, a sample-efficient reinforcement learning algorithm fine-tunes these feedback models for novel task settings through a few real system interactions. We evaluate our approach on a real anthropomorphic robot in learning a tactile feedback task. Comment: Submitted to the International Journal of Robotics Research. Paper length is 21 pages (including references) with 12 figures. A video overview of the reinforcement learning experiment on the real robot can be seen at https://www.youtube.com/watch?v=WDq1rcupVM0. arXiv admin note: text overlap with arXiv:1710.0855

    Coupling Movement Primitives: Interaction With the Environment and Bimanual Tasks

    The framework of dynamic movement primitives (DMPs) offers many favorable properties for the execution of robotic trajectories, such as indirect dependence on time, response to perturbations, and the ability to easily modulate the given trajectories, but the framework in its original form remains constrained to the kinematic aspect of the movement. In this paper, we bridge the gap to dynamic behavior by extending the framework with force/torque feedback. We propose and evaluate a modulation approach that allows interaction with objects and the environment. Through the proposed coupling of originally independent robotic trajectories, the approach also enables the execution of bimanual and tightly coupled cooperative tasks. We apply an iterative learning control algorithm to learn a coupling term, which is applied to the original trajectory in a feed-forward fashion and thus modifies the trajectory in accordance with the desired positions or external forces. A stability analysis and results of simulated and real-world experiments using two KUKA LWR arms for bimanual tasks and interaction with the environment are presented. By expanding on the framework of DMPs, we keep all the favorable properties, which is demonstrated with temporal modulation and in a two-agent obstacle avoidance task.
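
    To make the notion of learning a feed-forward term by iterative learning control more concrete, here is a minimal Python sketch of a plain P-type update, u_{k+1}[t] = u_k[t] + gain * e_k[t+1], repeated over trials. This is an assumption-laden toy and not the paper's method: the scalar plant in `run_trial`, its coefficients, the sinusoidal reference, and the unit learning gain are all hypothetical, and the abstract does not state the exact update law used for the coupling term.

```python
import math


def run_trial(u, a=0.5, b=0.5):
    """Simulate the hypothetical plant y[t+1] = a*y[t] + b*u[t] from y[0] = 0.

    Returns the outputs y[1..N], so entry t is the output influenced by u[t].
    """
    y, outputs = 0.0, []
    for u_t in u:
        y = a * y + b * u_t
        outputs.append(y)
    return outputs


def ilc(reference, iterations=30, gain=1.0):
    """Repeat the same trial, correcting the feed-forward input after each one."""
    u = [0.0] * len(reference)
    rms_history = []
    for _ in range(iterations):
        y = run_trial(u)
        # Tracking error of this trial, aligned with the input that caused it.
        e = [r_t - y_t for r_t, y_t in zip(reference, y)]
        rms_history.append(math.sqrt(sum(v * v for v in e) / len(e)))
        # P-type ILC update: add the scaled error of the previous trial.
        u = [u_t + gain * e_t for u_t, e_t in zip(u, e)]
    return u, rms_history


reference = [math.sin(2.0 * math.pi * t / 50.0) for t in range(1, 51)]
feed_forward, rms_history = ilc(reference)
```

    For this particular plant and gain the tracking error shrinks from trial to trial; other plants or gains can make the same update diverge, which is why a stability analysis such as the one mentioned in the abstract matters.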

    Using Reinforcement Learning in the tuning of Central Pattern Generators

    Master's dissertation in Informatics Engineering. This work applies Reinforcement Learning techniques to tasks involving learning and robot locomotion. Reinforcement Learning is a useful learning technique for legged robot locomotion because it emphasizes direct interaction between the agent and the environment and does not require supervision or complete models, in contrast with classical approaches. Its aim is to decide which actions to take so as to maximize a cumulative reward or reinforcement signal, taking into account that decisions may affect not only the immediate reward but also future ones. This work studies and presents the Reinforcement Learning framework and its application to the tuning of Central Pattern Generators, with the aim of generating optimized adaptive robot locomotion. In order to investigate the strengths and abilities of Reinforcement Learning, and to demonstrate the learning process of such algorithms in a simple way, two case studies based on the state of the art were implemented. With regard to the main purpose of the thesis, two different solutions are addressed: the first based on Natural Actor-Critic methods, and the second based on the Cross-Entropy Method. The latter algorithm proved capable of handling the integration of the two proposed approaches. The integration solutions were tested and validated using the Webots simulator and the DARwIN-OP robot model.
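
    For readers unfamiliar with the Cross-Entropy Method named in this abstract, the Python sketch below shows its basic loop: sample parameter vectors from a Gaussian, keep the best-scoring fraction, and refit the Gaussian to those elites. Everything specific is an assumption made for illustration: `toy_reward` and its `target` vector are a hypothetical stand-in for a locomotion reward measured in simulation, and the population size, elite fraction, and standard-deviation floor are arbitrary choices, not values from the thesis.

```python
import random
import statistics


def cross_entropy_method(reward, dim, iterations=40, population=50,
                         elite_frac=0.2, init_std=2.0, seed=0):
    """Maximize reward(params) with a diagonal-Gaussian Cross-Entropy Method."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Keep the highest-reward candidates as the elite set.
        samples.sort(key=reward, reverse=True)
        elites = samples[:n_elite]
        # Refit mean and standard deviation to the elites, per dimension.
        for d in range(dim):
            column = [e[d] for e in elites]
            mean[d] = statistics.mean(column)
            std[d] = statistics.pstdev(column) + 1e-3  # floor keeps sampling alive
    return mean


# Hypothetical stand-in for CPG parameters that would maximize a locomotion reward.
target = [1.0, -0.5, 0.3]


def toy_reward(params):
    """Negative squared distance to the (made-up) best parameter vector."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


best = cross_entropy_method(toy_reward, dim=3)
```

    In a setting like the one described, evaluating `reward` would mean running a simulated gait with the candidate parameters, so the number of samples per iteration becomes the main cost.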