Online Discrimination of Nonlinear Dynamics with Switching Differential Equations
How to recognise whether an observed person walks or runs? We consider a
dynamic environment where observations (e.g. the posture of a person) are
caused by different dynamic processes (walking or running) which are active one
at a time and which may transition from one to another at any time. For this
setup, switching dynamic models have been suggested previously, mostly for
linear and nonlinear dynamics in discrete time. Motivated by basic principles
of computations in the brain (dynamic, internal models) we suggest a model for
switching nonlinear differential equations. The switching process in the model
is implemented by a Hopfield network and we use parametric dynamic movement
primitives to represent arbitrary rhythmic motions. The model generates
observed dynamics by linearly interpolating the primitives weighted by the
switching variables and it is constructed such that standard filtering
algorithms can be applied. In two experiments with synthetic planar motion and
a human motion capture data set we show that inference with the unscented
Kalman filter can successfully discriminate several dynamic processes online.
Learning Feedback Terms for Reactive Planning and Control
With the advancement of robotics, machine learning, and machine perception,
increasingly more robots will enter human environments to assist with daily
tasks. However, dynamically changing human environments require reactive
motion plans. Reactivity can be accomplished through replanning, e.g.
model-predictive control, or through a reactive feedback policy that modifies
on-going behavior in response to sensory events. In this paper, we investigate
how to use machine learning to add reactivity to a previously learned nominal
skilled behavior. We approach this by learning a reactive modification term for
movement plans represented by nonlinear differential equations. In particular,
we use dynamic movement primitives (DMPs) to represent a skill and a neural
network to learn a reactive policy from human demonstrations. We use the
well-explored domain of obstacle avoidance for robot manipulation as a test bed. Our
approach demonstrates how a neural network can be combined with physical
insights to ensure robust behavior across different obstacle settings and
movement durations. Evaluations on an anthropomorphic robotic system
demonstrate the effectiveness of our work.
Comment: 8 pages, accepted to be published at ICRA 2017 conference
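As an illustration of the idea in the abstract above, the following is a minimal sketch of a one-dimensional discrete DMP with an additive feedback (coupling) term. It is not the authors' implementation: the function name `rollout_dmp`, the gains, the basis-width heuristic, and the hand-written `coupling` callback are all assumptions made for this example, and the callback merely stands in for the neural network that the paper learns from demonstrations.

```python
import math

def rollout_dmp(y0, goal, weights=None, coupling=None, tau=1.0, dt=0.001,
                n_steps=1500, alpha_z=25.0, beta_z=6.25, alpha_x=4.0):
    """Euler-integrate a 1-D discrete DMP and return the list of positions.

    `coupling(y, z)` is a hypothetical reactive term added to the
    transformation system; all gains are illustrative assumptions.
    """
    weights = list(weights) if weights else [0.0] * 10
    n = len(weights)
    # Basis centres placed along the exponentially decaying phase x.
    centres = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    # Width heuristic (an assumption): narrower bases where centres are dense.
    widths = [n / (c * c) for c in centres]

    y, z, x = y0, 0.0, 1.0
    positions = []
    for _ in range(n_steps):
        # Normalised radial-basis forcing term, gated by the phase x.
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centres)]
        forcing = sum(p * w for p, w in zip(psi, weights)) / (sum(psi) + 1e-10)
        forcing *= x * (goal - y0)
        # Reactive modification of the nominal plan (zero if none is given).
        c_t = coupling(y, z) if coupling else 0.0
        # Spring-damper transformation system plus forcing and coupling.
        dz = (alpha_z * (beta_z * (goal - y) - z) + forcing + c_t) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        positions.append(y)
    return positions
```

Calling `rollout_dmp(0.0, 1.0)` runs the nominal plan, while passing for example `coupling=lambda y, z: -15.625` shows how a feedback term shifts the motion away from the nominal goal. In this sketch the coupling acts on the acceleration level, which is one common choice and not necessarily the one used in the paper.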
Rich periodic motor skills on humanoid robots: Riding the pedal racer
Just like their discrete counterparts, periodic or rhythmic dynamic movement primitives allow easily modulated and robust motion generation, but for periodic tasks. In this paper we present an approach for modulating periodic dynamic movement primitives based on force feedback, allowing for rich motor behavior and skills. We propose and evaluate the combination of feedback and learned feed-forward terms to fully adapt the motions of a robot in order to achieve a desired force interaction with the environment. For the learning we employ the notion of repetitive control, which can effectively minimize the error of the behavior with respect to a given reference. To demonstrate the approach, we show results of simulated and real-world experiments on the compliant humanoid robot COMAN. We show the initial results of using the approach to control a pedal racer, a demanding balance toy best described as a hybrid between a skateboard and a bicycle. © 2014 IEEE
Supervised Learning and Reinforcement Learning of Feedback Models for Reactive Behaviors: Tactile Feedback Testbed
Robots need to be able to adapt to unexpected changes in the environment such
that they can autonomously succeed in their tasks. However, hand-designing
feedback models for adaptation is tedious, if at all possible, making
data-driven methods a promising alternative. In this paper we introduce a full
framework for learning feedback models for reactive motion planning. Our
pipeline starts by segmenting demonstrations of a complete task into motion
primitives via a semi-automated segmentation algorithm. Then, given additional
demonstrations of successful adaptation behaviors, we learn initial feedback
models through learning from demonstrations. In the final phase, a
sample-efficient reinforcement learning algorithm fine-tunes these feedback
models for novel task settings through few real system interactions. We
evaluate our approach on a real anthropomorphic robot in learning a tactile
feedback task.
Comment: Submitted to the International Journal of Robotics Research. Paper
length is 21 pages (including references) with 12 figures. A video overview
of the reinforcement learning experiment on the real robot can be seen at
https://www.youtube.com/watch?v=WDq1rcupVM0. arXiv admin note: text overlap
with arXiv:1710.0855
Coupling Movement Primitives: Interaction With the Environment and Bimanual Tasks
The framework of dynamic movement primitives (DMPs) offers many favorable properties for the execution of robotic trajectories, such as indirect dependence on time, response to perturbations, and the ability to easily modulate the given trajectories, but the framework in its original form remains constrained to the kinematic aspect of the movement. In this paper, we bridge the gap to dynamic behavior by extending the framework with force/torque feedback. We propose and evaluate a modulation approach that allows interaction with objects and the environment. Through the proposed coupling of originally independent robotic trajectories, the approach also enables the execution of bimanual and tightly coupled cooperative tasks. We apply an iterative learning control algorithm to learn a coupling term, which is applied to the original trajectory in a feed-forward fashion and thus modifies the trajectory in accordance with the desired positions or external forces. A stability analysis and results of simulated and real-world experiments using two KUKA LWR arms for bimanual tasks and interaction with the environment are presented. By expanding on the framework of DMPs, we keep all the favorable properties, which is demonstrated with temporal modulation and in a two-agent obstacle avoidance task.
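The iterative learning control step mentioned in the abstract above can be illustrated with a generic sketch. The code below is not the paper's algorithm: it applies a textbook P-type update, in which the feed-forward signal is corrected by the previous trial's error, to a hypothetical first-order scalar plant. The names `simulate` and `iterative_learning_control`, the plant parameters `a` and `b`, and the learning gain are assumptions chosen for this example.

```python
import math

def simulate(u, a=0.5, b=1.0):
    """Hypothetical plant y[t+1] = a*y[t] + b*u[t]; returns y[1], ..., y[T]."""
    y_prev, outputs = 0.0, []
    for u_t in u:
        y_prev = a * y_prev + b * u_t
        outputs.append(y_prev)
    return outputs

def iterative_learning_control(reference, n_trials=30, gain=0.8):
    """Refine a feed-forward signal over repeated trials of the same task."""
    u = [0.0] * len(reference)
    max_errors = []
    for _ in range(n_trials):
        y = simulate(u)
        # Tracking error of this trial, aligned with the input that caused it.
        error = [r - y_t for r, y_t in zip(reference, y)]
        max_errors.append(max(abs(e) for e in error))
        # P-type update: next trial's input = this input + gain * error.
        u = [u_t + gain * e_t for u_t, e_t in zip(u, error)]
    return u, max_errors

# Usage: learn to track one period of a sine wave over 50 time steps.
reference = [math.sin(2.0 * math.pi * t / 50.0) for t in range(1, 51)]
feedforward, max_errors = iterative_learning_control(reference)
```

In the setting of the abstract, the learned signal would play the role of the coupling term and the error would come from desired positions or measured forces. Here both are abstracted into a scalar tracking problem so that the trial-to-trial update is easy to follow.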
Using Reinforcement Learning in the tuning of Central Pattern Generators
Master's dissertation in Informatics Engineering.
This work applies Reinforcement Learning techniques to tasks involving learning and robot locomotion. Reinforcement Learning is a useful learning technique for legged robot locomotion because it emphasizes direct interaction between the agent and the environment and does not require supervision or complete models, in contrast with classical approaches. Its aim is to decide which actions to take so as to maximize a cumulative reward, taking into account that decisions may affect not only immediate rewards but also future ones. The work presents the structure and operation of Reinforcement Learning and its application to the tuning of Central Pattern Generators, with the aim of generating optimized adaptive locomotion.
In order to investigate the strengths and abilities of Reinforcement Learning, and to demonstrate in a simple way the learning process of such algorithms, two case studies were implemented based on the state of the art. With regard to the main purpose of the thesis, two different solutions are addressed: the first based on Natural Actor-Critic methods and the second based on the Cross-Entropy Method. The latter algorithm proved capable of handling the integration of the two proposed approaches. The integration solutions were tested and validated using the Webots simulator and a model of the DARwIN-OP robot.
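For readers unfamiliar with the Cross-Entropy Method named in the abstract above, here is a minimal sketch of it as a parameter search. It is a generic version with a diagonal Gaussian sampling distribution, not the thesis implementation. The function name `cross_entropy_method`, the population sizes, and the toy reward are assumptions. In the thesis setting the reward would come from simulated locomotion with a given set of Central Pattern Generator parameters.

```python
import random
import statistics

def cross_entropy_method(reward, dim, n_samples=50, n_elite=10,
                         n_iterations=40, seed=0):
    """Maximise `reward` over a `dim`-dimensional parameter vector."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    for _ in range(n_iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(n_samples)]
        # Keep the best-performing candidates (the elite set).
        samples.sort(key=reward, reverse=True)
        elite = samples[:n_elite]
        # Refit the sampling distribution to the elite set.
        mean = [statistics.mean(column) for column in zip(*elite)]
        std = [statistics.pstdev(column) + 1e-6 for column in zip(*elite)]
    return mean

# Usage with a toy reward (hypothetical stand-in for a locomotion score).
target = [0.5, -1.2]

def toy_reward(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))

best = cross_entropy_method(toy_reward, dim=2)
```

The small constant added to the standard deviation keeps the sampling distribution from collapsing to a point, which is a common practical safeguard and not something the abstract specifies.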