27 research outputs found

    A Practical and Conceptual Framework for Learning in Control

    We propose a fully Bayesian approach for efficient reinforcement learning (RL) in Markov decision processes with continuous-valued state and action spaces when no expert knowledge is available. Our framework is based on well-established ideas from statistics and machine learning and learns fast since it carefully models, quantifies, and incorporates available knowledge when making decisions. The key ingredient of our framework is a probabilistic model, which is implemented using a Gaussian process (GP), a distribution over functions. In the context of dynamic systems, the GP models the transition function. By considering all plausible transition functions simultaneously, we reduce model bias, a problem that frequently occurs when deterministic models are used. Due to its generality and efficiency, our RL framework can be considered a conceptual and practical approach to learning models and controllers whe

    Model learning for trajectory tracking of robot manipulators

    Model-based controllers have drastically improved robot performance, increasing task accuracy while reducing control effort. However, these gains rest on a very strong assumption: exact knowledge of the physical properties of both the robot and its environment. This assumption is often misleading: modern robots are modeled only approximately and, more importantly, the environment is almost never static or completely known. Even for very simple systems, such as robot manipulators, these assumptions are too strong and must be relaxed. Many methods have been developed that exploit previous experience to refine the nominal model, from classic identification techniques to more modern machine-learning-based approaches. The topic of this thesis is the investigation of these data-driven techniques in the context of robot control for trajectory tracking. The first two chapters provide preliminary knowledge on both model-based controllers, used in robotics to ensure precise trajectory tracking, and model learning techniques. The following three chapters present the author's contributions with respect to the state of the art: three works with the same premise (inaccurate system modeling) and an identical goal (accurate trajectory tracking control), but with small differences according to the specific platform of application (fully actuated, underactuated, or redundant robots). In all the considered architectures, an online learning scheme is introduced to correct the nominal feedback linearization control law. The method was first introduced in the literature for fully actuated systems, where it showed its efficacy in accurately tracking joint-space trajectories even with an inaccurate dynamic model.
The main novelty of the technique was the use of only kinematic information, instead of torque measurements (which are generally very noisy), to retrieve and compensate for the dynamic mismatches online. The method was then extended to underactuated robots. This new architecture consisted of an online learning correction of the controller, acting on the actuated part of the system (the nominal partial feedback linearization), and an offline planning phase, required to produce a dynamically feasible trajectory for the zero dynamics of the system as well. The scheme was iterative: after each trial, both phases were improved according to the collected information and then repeated until the task was achieved. In this case too, the method proved effective, both in numerical simulations and in real experiments on a robotic platform. Finally, the method was applied to redundant systems: unlike before, the task here was the accurate tracking of a Cartesian end-effector trajectory. Although in principle very similar to the fully actuated case, the presence of redundancy drastically slowed the convergence of the learning machinery, worsening performance. To cope with this, a redundancy resolution was proposed that, exploiting an approximation of the learning algorithm (Gaussian process regression), locally maximizes the information and thereby selects the most convenient self-motion for the system; moreover, all of this requires solving only a quadratic programming problem. In this case too, the method performed well, achieving accurate online tracking while reducing both the control effort and the joint velocities, and thus producing a natural behaviour. The thesis concludes with summary considerations on the proposed approach and possible future directions of research

    Efficient Reinforcement Learning using Gaussian Processes

    This book examines Gaussian processes (GPs) in model-based reinforcement learning (RL) and inference in nonlinear dynamic systems. First, we introduce PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available. PILCO learns fast since it takes model uncertainties consistently into account during long-term planning and decision making. Thus, it reduces model bias, a common problem in model-based RL. Due to its generality and efficiency, PILCO is a conceptual and practical approach to jointly learning models and controllers fully automatically. Across all tasks, we report an unprecedented degree of automation and an unprecedented speed of learning. Second, we propose principled algorithms for robust filtering and smoothing in GP dynamic systems. Our methods are based on analytic moment matching and clearly advance state-of-the-art methods

    Nonlinear control of underactuated mechanical systems with application to robotics and aerospace vehicles

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (leaves 308-316). This thesis is devoted to nonlinear control, reduction, and classification of underactuated mechanical systems. Underactuated systems are mechanical control systems with fewer controls than the number of configuration variables. Control of underactuated systems is currently an active field of research due to their broad applications in Robotics, Aerospace Vehicles, and Marine Vehicles. The examples of underactuated systems include flexible-link robots, mobile robots, walking robots, robots on mobile platforms, cars, locomotive systems, snake-type and swimming robots, acrobatic robots, aircraft, spacecraft, helicopters, satellites, surface vessels, and underwater vehicles. Based on recent surveys, control of general underactuated systems is a major open problem. Almost all real-life mechanical systems possess kinetic symmetry properties, i.e. their kinetic energy does not depend on a subset of configuration variables called external variables. In this work, I exploit such symmetry properties as a means of reducing the complexity of control design for underactuated systems. As a result, reduction and nonlinear control of high-order underactuated systems with kinetic symmetry is the main focus of this thesis. By "reduction", we mean a procedure to reduce control design for the original underactuated system to control of a lower-order nonlinear or mechanical system. One way to achieve such a reduction is by transforming an underactuated system to a cascade nonlinear system with structural properties. If all underactuated systems in a class can be transformed into a specific class of nonlinear systems, we refer to the transformed systems as the "normal form" of the corresponding class of underactuated systems.
Our main contribution is to find explicit change of coordinates and control that transform several classes of underactuated systems, which appear in robotics and aerospace applications, into cascade nonlinear systems with structural properties that are convenient for control design purposes. The obtained cascade normal forms are three classes of nonlinear systems, namely, systems in strict feedback form, feedforward form, and nontriangular linear-quadratic form. The names of these three classes are due to the particular lower-triangular, upper-triangular, and nontriangular structure in which the state variables appear in the dynamics of the corresponding nonlinear systems. The triangular normal forms of underactuated systems can be controlled using existing backstepping and feedforwarding procedures. However, control of the nontriangular normal forms is a major open problem. We address this problem for important classes of nontriangular systems of interest by introducing a new stabilization method based on the solutions of fixed-point equations as stabilizing nonlinear state feedback laws. This controller is obtained via a simple recursive method that is convenient for implementation. For special classes of nontriangular nonlinear systems, such fixed-point equations can be solved explicitly ... by Reza Olfati-Saber. Ph.D.

    09181 Abstracts Collection -- Sampling-based Optimization in the Presence of Uncertainty

    This Dagstuhl seminar brought together researchers from statistical ranking and selection; experimental design and response-surface modeling; stochastic programming; approximate dynamic programming; optimal learning; and the design and analysis of computer experiments with the goal of attaining a much better mutual understanding of the commonalities and differences of the various approaches to sampling-based optimization, and to take first steps toward an overarching theory, encompassing many of the topics above

    Dynamic balancing of underactuated robots

    This thesis presents the control of planar underactuated systems that have one less control input than the number of degrees of freedom. The underactuated robots are studied to achieve dynamically stable motions commonly encountered during robot locomotion. This work emphasizes the relation between the underactuated systems and biped locomotion and builds on the previous works in the literature on underactuated robot locomotion. Two planar system models are treated: an acrobatic robot and a compass biped with torso. The dynamic stability of fast periodic trajectories of these systems is regulated by designing asymptotically stable feedback controllers. The resulting internal dynamics of the systems are analyzed and shaped to achieve energy efficiency and robustness of the closed-loop system trajectories. In particular, Bézier polynomial approximations and parameter optimization methods are used to systematically construct the internal dynamics of the systems. Simulation results are presented for dynamically stable orbits of the acrobatic robot and the compass biped with torso

    Sliding Mode Control

    The main objective of this monograph is to present a broad range of well worked out, recent application studies as well as theoretical contributions in the field of sliding mode control system analysis and design. The contributions presented here include new theoretical developments as well as successful applications of variable structure controllers primarily in the field of power electronics, electric drives and motion steering systems. They enrich the current state of the art, and motivate and encourage new ideas and solutions in the sliding mode control area

    Advanced Strategies for Robot Manipulators

    Among robotic systems, robot manipulators have proven to be of increasing importance and are widely adopted to substitute for humans in repetitive and/or hazardous tasks. Modern manipulators have complicated designs and must perform more precise, crucial, and critical tasks. Simple traditional control methods are therefore no longer efficient, and advanced control strategies that take special constraints into account need to be established. Although groundbreaking research has been carried out in this area, there are still many novel aspects to be explored