405 research outputs found

    Model learning for trajectory tracking of robot manipulators

    Model-based controllers have drastically improved robot performance, increasing task accuracy while reducing control effort. However, these gains rest on a very strong assumption: exact knowledge of the physical properties of both the robot and its surrounding environment. This assumption is often misleading: modern robots are modeled only approximately and, more importantly, the environment is almost never static or completely known. Even for very simple systems, such as robot manipulators, these assumptions are too strong and must be relaxed. Many methods have been developed that exploit previous experience to refine the nominal model, from classic identification techniques to more modern machine-learning-based approaches. The topic of this thesis is the investigation of these data-driven techniques in the context of robot control for trajectory tracking. The first two chapters provide preliminary knowledge on both model-based controllers, used in robotics to ensure precise trajectory tracking, and model learning techniques. The following three chapters present the author's contributions with respect to the state of the art: three works with the same premise (inaccurate system modeling) and an identical goal (accurate trajectory tracking control), but with small differences according to the specific platform of application (fully actuated, underactuated, and redundant robots). In all the considered architectures, an online learning scheme is introduced to correct the nominal feedback linearization control law. The method was first introduced in the literature for fully actuated systems, where it proved effective in accurately tracking joint-space trajectories even with an inaccurate dynamic model. The main novelty of the technique was the use of only kinematic information, instead of torque measurements (which are generally very noisy), to retrieve and compensate for the dynamic mismatches online. The method was then extended to underactuated robots. This new architecture consisted of an online learning correction of the controller, acting on the actuated part of the system (the nominal partial feedback linearization), and an offline planning phase, required to obtain a trajectory that is dynamically feasible also for the zero dynamics of the system. The scheme was iterative: after each trial, both phases were improved according to the collected information and then repeated until the task was achieved. In this case too, the method proved effective, both in numerical simulations and in real experiments on a robotic platform. Finally, the method was applied to redundant systems: unlike before, the task here consisted of accurately tracking a Cartesian end-effector trajectory. Although in principle very similar to the fully actuated case, the presence of redundancy drastically slowed down the convergence of the learning machinery, worsening performance. To cope with this, a redundancy resolution was proposed that, exploiting an approximation of the learning algorithm (Gaussian process regression), locally maximizes the information and thereby selects the most convenient self-motion for the system; moreover, all of this was achieved by solving a single quadratic programming problem. In this case too, the method performed well, achieving accurate online tracking while reducing both the control effort and the joint velocities, thus obtaining a natural behaviour. The thesis concludes with summary considerations on the proposed approach and possible future directions of research.
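
As a reading aid, the nominal feedback linearization law and the mismatch that an online learner would have to compensate can be written out in the standard textbook form. This is a sketch under assumptions: the symbols (inertia matrix $M$, nonlinear terms $n$, hatted nominal models, gains $K_P$, $K_D$, mismatch $\delta$) are not taken from the abstract, and the thesis may use a different formulation.

```latex
% True manipulator dynamics (assumed standard form)
M(q)\,\ddot q + n(q,\dot q) = \tau

% Nominal feedback linearization built from the inaccurate models \hat M, \hat n
\tau = \hat M(q)\,a + \hat n(q,\dot q), \qquad
a = \ddot q_d + K_D(\dot q_d - \dot q) + K_P(q_d - q)

% Resulting closed loop: the commanded acceleration a is corrupted by a mismatch term
\ddot q = a + \delta(q,\dot q,a), \qquad
\delta = M^{-1}(q)\left[\big(\hat M(q) - M(q)\big)\,a + \hat n(q,\dot q) - n(q,\dot q)\right]
```

Under this formulation, $\delta$ equals the difference between the measured acceleration $\ddot q$ and the commanded acceleration $a$, so it could in principle be regressed from kinematic quantities alone, consistent with the abstract's remark that no torque measurements are used.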

    Nonlinear UGV Identification Methods via the Gaussian Process Regression Model for Control System Design

    In this paper, two identification methods are proposed for a ground robotic system. A Gaussian process regression (GPR) model is presented and adopted within a system identification framework. Its performance and features are compared with those of a wavelet-based nonlinear autoregressive exogenous (NARX) model. Both algorithms were experimentally validated on a small ground robot, using data collected through the onboard sensors. The results show better prediction performance for the GPR method, both as an estimation algorithm and in providing a measure of uncertainty.
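
The point that GPR returns both an estimate and a measure of uncertainty can be illustrated with a minimal sketch. It is not the paper's implementation: the squared-exponential kernel, the hyperparameter values, the function names `rbf_kernel` and `gp_predict`, and the synthetic sine data are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    # Squared-exponential kernel between two 1-D input arrays (assumed kernel choice).
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    # Standard GP regression equations, solved through a Cholesky factorization.
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                      # posterior mean at the test inputs
    v = np.linalg.solve(L, K_s)
    prior_var = np.diag(rbf_kernel(x_test, x_test, length_scale))
    var = prior_var - np.sum(v ** 2, axis=0)  # posterior variance of the latent function
    return mean, np.maximum(var, 0.0)

# Synthetic toy data (not robot data): samples of sin(x) on [0, 5].
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train)

# One query inside the data range, one far outside it.
x_test = np.array([2.5, 20.0])
mean, var = gp_predict(x_train, y_train, x_test)
```

The predictive variance stays small near the training inputs and returns to the prior variance far from them, which is the kind of uncertainty measure a NARX point predictor does not provide by itself.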

    Information driven self-organization of complex robotic behaviors

    Information theory is a powerful tool to express principles to drive autonomous systems because it is domain invariant and allows for an intuitive interpretation. This paper studies the use of the predictive information (PI), also called excess entropy or effective measure complexity, of the sensorimotor process as a driving force to generate behavior. We study nonlinear and nonstationary systems and introduce the time-local predicting information (TiPI), which allows us to derive exact results together with explicit update rules for the parameters of the controller in the dynamical systems framework. In this way the information principle, formulated at the level of behavior, is translated to the dynamics of the synapses. We underpin our results with a number of case studies with high-dimensional robotic systems. We show the spontaneous cooperativity in a complex physical system with decentralized control. Moreover, a jointly controlled humanoid robot develops a high behavioral variety depending on its physics and the environment it is dynamically embedded into. The behavior can be decomposed into a succession of low-dimensional modes that increasingly explore the behavior space. This is a promising way to avoid the curse of dimensionality, which prevents learning systems from scaling well. Comment: 29 pages, 12 figures

    Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

    The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced Black-DROPS algorithm exploits a black-box optimization algorithm to achieve both high data-efficiency and good computation times when several cores are used; nevertheless, like all model-based policy search approaches, Black-DROPS does not scale to high-dimensional state/action spaces. In this paper, we introduce a new model learning procedure in Black-DROPS that leverages parameterized black-box priors to (1) scale up to high-dimensional systems, and (2) be robust to large inaccuracies of the prior information. We demonstrate the effectiveness of our approach with the "pendubot" swing-up task in simulation and with a physical hexapod robot (48D state space, 18D action space) that has to walk forward as fast as possible. The results show that our new algorithm is more data-efficient than previous model-based policy search algorithms (with and without priors) and that it can allow a physical 6-legged robot to learn new gaits in only 16 to 30 seconds of interaction time. Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table; Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at https://youtu.be/_MZYDhfWeL

    Learning robot in-hand manipulation with tactile features

    Dexterous manipulation enables repositioning of objects and tools within a robot’s hand. When applying dexterous manipulation to unknown objects, exact object models are not available. Instead of relying on models, compliance and tactile feedback can be exploited to adapt to unknown objects. However, compliant hands and tactile sensors add complexity and are themselves difficult to model. Hence, we propose acquiring in-hand manipulation skills through reinforcement learning, which does not require analytic dynamics or kinematics models. In this paper, we show that this approach successfully acquires a tactile manipulation skill using a passively compliant hand. Additionally, we show that the learned tactile skill generalizes to novel objects.

    Learning Dynamic Robot-to-Human Object Handover from Human Feedback

    Object handover is a basic but essential capability for robots interacting with humans in many applications, e.g., caring for the elderly and assisting workers in manufacturing workshops. It appears deceptively simple, as humans perform object handover almost flawlessly. The success of humans, however, belies the complexity of object handover as collaborative physical interaction between two agents with limited communication. This paper presents a learning algorithm for dynamic object handover, for example, when a robot hands over water bottles to marathon runners passing by the water station. We formulate the problem as contextual policy search, in which the robot learns object handover by interacting with the human. A key challenge here is to learn the latent reward of the handover task under noisy human feedback. Preliminary experiments show that the robot learns to hand over a water bottle naturally and that it adapts to the dynamics of human motion. One challenge for the future is to combine the model-free learning algorithm with a model-based planning approach and enable the robot to adapt to human preferences and object characteristics, such as shape, weight, and surface texture. Comment: Appears in the Proceedings of the International Symposium on Robotics Research (ISRR) 201

    Empowerment for Continuous Agent-Environment Systems

    This paper develops generalizations of empowerment to continuous states. Empowerment is a recently introduced information-theoretic quantity motivated by hypotheses about the efficiency of the sensorimotor loop in biological organisms, but also by considerations stemming from curiosity-driven learning. Empowerment measures, for agent-environment systems with stochastic transitions, how much influence an agent has on its environment, counting only the influence that can be sensed by the agent's sensors. It is an information-theoretic generalization of joint controllability (influence on the environment) and observability (measurement by sensors) of the environment by the agent, both controllability and observability being usually defined in control theory as the dimensionality of the control/observation spaces. Earlier work has shown that empowerment has various interesting and relevant properties, e.g., it allows us to identify salient states using only the dynamics, and it can act as an intrinsic reward without requiring an external reward. However, in this previous work empowerment was limited to small-scale, discrete domains, and state transition probabilities were assumed to be known. The goal of this paper is to extend empowerment to the significantly more important and relevant case of continuous vector-valued state spaces and initially unknown state transition probabilities. The continuous state space is addressed by Monte Carlo approximation; the unknown transitions are addressed by model learning and prediction, for which we apply Gaussian process regression with iterated forecasting. In a number of well-known continuous control tasks we examine the dynamics induced by empowerment and include an application to exploration and online model learning.
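
For the small-scale discrete setting that the abstract attributes to earlier work, empowerment is commonly computed as the channel capacity from actions to sensed successor states. The sketch below uses the Blahut-Arimoto iteration for that discrete case only; it is not the paper's continuous Monte Carlo method, and the function name `empowerment_bits`, the iteration count, and the toy transition tables are assumptions made for illustration.

```python
import math

def empowerment_bits(p_next_given_action, iters=200):
    """Channel capacity (in bits) of a discrete action -> next-state channel.

    p_next_given_action[a][s] is the probability of sensing state s after action a.
    Uses the Blahut-Arimoto iteration (assumed here; suitable only for small discrete problems).
    """
    n_a = len(p_next_given_action)
    n_s = len(p_next_given_action[0])
    p_a = [1.0 / n_a] * n_a  # start from a uniform action distribution

    def divergences(p_a):
        # Marginal over next states, then KL(p(s|a) || p(s)) in nats for every action.
        p_s = [sum(p_a[a] * p_next_given_action[a][s] for a in range(n_a))
               for s in range(n_s)]
        return [sum(p * math.log(p / p_s[s]) for s, p in enumerate(row) if p > 0.0)
                for row in p_next_given_action]

    for _ in range(iters):
        d = divergences(p_a)
        w = [p_a[a] * math.exp(d[a]) for a in range(n_a)]
        z = sum(w)
        p_a = [wi / z for wi in w]  # reweight actions toward more distinguishable outcomes

    d = divergences(p_a)
    return sum(p_a[a] * d[a] for a in range(n_a)) / math.log(2.0)

# Toy channels (hypothetical): four actions with four distinct outcomes,
# two actions that lead to the same outcome, and a noisy two-action channel.
full_control = [[1.0 if s == a else 0.0 for s in range(4)] for a in range(4)]
no_control = [[1.0, 0.0], [1.0, 0.0]]
noisy_control = [[0.9, 0.1], [0.1, 0.9]]

e_full = empowerment_bits(full_control)
e_none = empowerment_bits(no_control)
e_noisy = empowerment_bits(noisy_control)
```

Four perfectly distinguishable outcomes give two bits, actions with no observable effect give zero, and noise in the transitions lowers the value in between, which matches the abstract's description of empowerment as influence that the agent can actually sense.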

    Visuo-Haptic Grasping of Unknown Objects through Exploration and Learning on Humanoid Robots

    This thesis addresses the grasping of unknown objects by humanoid robots. To this end, visual information is combined with haptic exploration to generate grasp hypotheses. In addition, a grasp metric is learned from simulated training data; it rates the success probability of the grasp hypotheses and selects the one with the highest estimated success probability. This hypothesis is then used to grasp objects with the help of a reactive control strategy. The two core contributions of the thesis are the haptic exploration of unknown objects and the grasping of unknown objects using a novel data-driven grasp metric.