405 research outputs found
Model learning for trajectory tracking of robot manipulators
Abstract
Model based controllers have drastically improved robot performance, increasing task accuracy while reducing control effort. Nevertheless, all this was realized with a very strong assumption: the exact knowledge of the physical properties of both the robot and the environment that surrounds it. This assertion is often misleading: in fact modern robots are modeled in a very approximate way and, more important, the environment is almost never static and completely known. Also for systems very simple, such as robot manipulators, these assumptions are still too strong and must be relaxed. Many methods were developed which, exploiting previous experiences, are able to refine the nominal model: from classic identification techniques to more modern machine learning based approaches. Indeed, the topic of this thesis is the investigation of these data driven techniques in the context of robot control for trajectory tracking. In the first two chapters, preliminary knowledge is provided on both model based controllers, used in robotics to assure precise trajectory tracking, and model learning techniques. In the following three chapters, are presented the novelties introduced by the author in this context with respect to the state of the art: three works with the same premise (an inaccurate system modeling), an identical goal (accurate trajectory tracking control) but with small differences according to the specific platform of application (fully actuated, underactuated, redundant robots). In all the considered architectures, an online learning scheme has been introduced to correct the nominal feedback linearization control law. Indeed, the method has been primarily introduced in the literature to cope with fully actuated systems, showing its efficacy in the accurate tracking of joint space trajectories also with an inaccurate dynamic model. The main novelty of the technique was the use of only
kinematics information, instead of torque measurements (in general very noisy), to online retrieve and compensate the dynamic mismatches. After that the method has
been extended to underactuated robots. This new architecture was composed by an online learning correction of the controller, acting on the actuated part of the system
(the nominal partial feedback linearization), and an offline planning phase, required to realize a dynamically feasible trajectory also for the zero dynamics of the system.
The scheme was iterative: after each trial, according to the collected information, both the phases were improved and then repeated until the task achievement. Also in this case the method showed its capability, both in numerical simulations and on real experiments on a robotics platform. Eventually the method has been applied to redundant systems: differently from before, in this context the task consisted in the accurate tracking of a Cartesian end effector trajectory. In principle very similar to the fully actuated case, the presence of redundancy slowed down drastically the learning machinery convergence, worsening the performance. In order to cope with this, a redundancy resolution was proposed that, exploiting an approximation of the learning algorithm (Gaussian process regression), allowed to locally maximize the information and so select the most convenient self motion for the system; moreover, all of this was realized with just the resolution of a quadratic programming problem. Also in this case the method showed its performance, realizing an accurate online tracking while reducing both the control effort and the joints velocity, obtaining so a natural behaviour. The thesis concludes with summary considerations on the proposed approach and with possible future directions of research
Nonlinear UGV Identification Methods via the Gaussian Process Regression Model for Control System Design
In this paper, two identification methods are proposed for a ground robotic system. A
Gaussian process regression (GPR) model is presented and adopted for a system identification framework. Its performance and features were compared with a wavelet-based nonlinear autoregressive
exogenous (NARX) model. Both algorithms were compared and experimentally validated for a small
ground robot. Moreover, data were collected throughout the onboard sensors. The results show better
prediction performance in the case of the GPR method, as an estimation algorithm and in providing a
measure of uncertainty
Information driven self-organization of complex robotic behaviors
Information theory is a powerful tool to express principles to drive
autonomous systems because it is domain invariant and allows for an intuitive
interpretation. This paper studies the use of the predictive information (PI),
also called excess entropy or effective measure complexity, of the sensorimotor
process as a driving force to generate behavior. We study nonlinear and
nonstationary systems and introduce the time-local predicting information
(TiPI) which allows us to derive exact results together with explicit update
rules for the parameters of the controller in the dynamical systems framework.
In this way the information principle, formulated at the level of behavior, is
translated to the dynamics of the synapses. We underpin our results with a
number of case studies with high-dimensional robotic systems. We show the
spontaneous cooperativity in a complex physical system with decentralized
control. Moreover, a jointly controlled humanoid robot develops a high
behavioral variety depending on its physics and the environment it is
dynamically embedded into. The behavior can be decomposed into a succession of
low-dimensional modes that increasingly explore the behavior space. This is a
promising way to avoid the curse of dimensionality which hinders learning
systems to scale well.Comment: 29 pages, 12 figure
Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics
The most data-efficient algorithms for reinforcement learning in robotics are
model-based policy search algorithms, which alternate between learning a
dynamical model of the robot and optimizing a policy to maximize the expected
return given the model and its uncertainties. Among the few proposed
approaches, the recently introduced Black-DROPS algorithm exploits a black-box
optimization algorithm to achieve both high data-efficiency and good
computation times when several cores are used; nevertheless, like all
model-based policy search approaches, Black-DROPS does not scale to high
dimensional state/action spaces. In this paper, we introduce a new model
learning procedure in Black-DROPS that leverages parameterized black-box priors
to (1) scale up to high-dimensional systems, and (2) be robust to large
inaccuracies of the prior information. We demonstrate the effectiveness of our
approach with the "pendubot" swing-up task in simulation and with a physical
hexapod robot (48D state space, 18D action space) that has to walk forward as
fast as possible. The results show that our new algorithm is more
data-efficient than previous model-based policy search algorithms (with and
without priors) and that it can allow a physical 6-legged robot to learn new
gaits in only 16 to 30 seconds of interaction time.Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table;
Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at
https://youtu.be/_MZYDhfWeL
Learning robot in-hand manipulation with tactile features
Dexterous manipulation enables repositioning of
objects and tools within a robot’s hand. When applying dexterous
manipulation to unknown objects, exact object models
are not available. Instead of relying on models, compliance and
tactile feedback can be exploited to adapt to unknown objects.
However, compliant hands and tactile sensors add complexity
and are themselves difficult to model. Hence, we propose acquiring
in-hand manipulation skills through reinforcement learning,
which does not require analytic dynamics or kinematics models.
In this paper, we show that this approach successfully acquires
a tactile manipulation skill using a passively compliant hand.
Additionally, we show that the learned tactile skill generalizes
to novel objects
Learning Dynamic Robot-to-Human Object Handover from Human Feedback
Object handover is a basic, but essential capability for robots interacting
with humans in many applications, e.g., caring for the elderly and assisting
workers in manufacturing workshops. It appears deceptively simple, as humans
perform object handover almost flawlessly. The success of humans, however,
belies the complexity of object handover as collaborative physical interaction
between two agents with limited communication. This paper presents a learning
algorithm for dynamic object handover, for example, when a robot hands over
water bottles to marathon runners passing by the water station. We formulate
the problem as contextual policy search, in which the robot learns object
handover by interacting with the human. A key challenge here is to learn the
latent reward of the handover task under noisy human feedback. Preliminary
experiments show that the robot learns to hand over a water bottle naturally
and that it adapts to the dynamics of human motion. One challenge for the
future is to combine the model-free learning algorithm with a model-based
planning approach and enable the robot to adapt over human preferences and
object characteristics, such as shape, weight, and surface texture.Comment: Appears in the Proceedings of the International Symposium on Robotics
Research (ISRR) 201
Empowerment for Continuous Agent-Environment Systems
This paper develops generalizations of empowerment to continuous states.
Empowerment is a recently introduced information-theoretic quantity motivated
by hypotheses about the efficiency of the sensorimotor loop in biological
organisms, but also from considerations stemming from curiosity-driven
learning. Empowemerment measures, for agent-environment systems with stochastic
transitions, how much influence an agent has on its environment, but only that
influence that can be sensed by the agent sensors. It is an
information-theoretic generalization of joint controllability (influence on
environment) and observability (measurement by sensors) of the environment by
the agent, both controllability and observability being usually defined in
control theory as the dimensionality of the control/observation spaces. Earlier
work has shown that empowerment has various interesting and relevant
properties, e.g., it allows us to identify salient states using only the
dynamics, and it can act as intrinsic reward without requiring an external
reward. However, in this previous work empowerment was limited to the case of
small-scale and discrete domains and furthermore state transition probabilities
were assumed to be known. The goal of this paper is to extend empowerment to
the significantly more important and relevant case of continuous vector-valued
state spaces and initially unknown state transition probabilities. The
continuous state space is addressed by Monte-Carlo approximation; the unknown
transitions are addressed by model learning and prediction for which we apply
Gaussian processes regression with iterated forecasting. In a number of
well-known continuous control tasks we examine the dynamics induced by
empowerment and include an application to exploration and online model
learning
Visuo-Haptic Grasping of Unknown Objects through Exploration and Learning on Humanoid Robots
Die vorliegende Arbeit befasst sich mit dem Greifen unbekannter Objekte durch humanoide Roboter. Dazu werden visuelle Informationen mit haptischer Exploration kombiniert, um Greifhypothesen zu erzeugen. Basierend auf simulierten Trainingsdaten wird außerdem eine Greifmetrik gelernt, welche die Erfolgswahrscheinlichkeit der Greifhypothesen bewertet und die mit der größten geschätzten Erfolgswahrscheinlichkeit auswählt. Diese wird verwendet, um Objekte mit Hilfe einer reaktiven Kontrollstrategie zu greifen. Die zwei Kernbeiträge der Arbeit sind zum einen die haptische Exploration von unbekannten Objekten und zum anderen das Greifen von unbekannten Objekten mit Hilfe einer neuartigen datengetriebenen Greifmetrik
- …