
    Model-based Reinforcement Learning with Parametrized Physical Models and Optimism-Driven Exploration

    In this paper, we present a robotic model-based reinforcement learning method that combines ideas from model identification and model predictive control. We use a feature-based representation of the dynamics that allows the dynamics model to be fitted with a simple least-squares procedure; the features are identified from a high-level specification of the robot's morphology, consisting of the number and connectivity structure of its links. Model predictive control is then used to choose actions under an optimistic model of the dynamics, which yields an efficient and goal-directed exploration strategy. We present real-time experimental results on standard benchmark problems involving the pendulum, cartpole, and double-pendulum systems. Experiments indicate that our method learns a range of benchmark tasks substantially faster than the previous best methods. To evaluate our approach on a realistic robotic control task, we also demonstrate real-time control of a simulated 7-degree-of-freedom arm.
    Comment: 8 pages
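The core fitting step described above, writing the next state as a linear function of features and solving by least squares, can be sketched as follows. The feature map here is a simple hypothetical stand-in (the paper derives features from the robot's morphology), and the toy linear system is only for demonstration:

```python
import numpy as np

# Minimal sketch: fit x_{t+1} = W @ phi(x_t, u_t) by least squares.
# The feature map phi is illustrative; the paper identifies features
# from the robot's morphology (link count and connectivity).

rng = np.random.default_rng(0)

def phi(x, u):
    # Hypothetical features: state, action, and a bias term.
    return np.concatenate([x, u, [1.0]])

# Collect transitions from an unknown linear system (ground truth for the demo).
A_true = np.array([[1.0, 0.1], [-0.2, 0.95]])
B_true = np.array([[0.0], [0.1]])

X, Y = [], []
x = np.zeros(2)
for _ in range(200):
    u = rng.normal(size=1)
    x_next = A_true @ x + B_true @ u + 0.01 * rng.normal(size=2)
    X.append(phi(x, u))
    Y.append(x_next)
    x = x_next

# A single least-squares solve fits the dynamics model W.
W, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)
W = W.T  # rows: state dims; columns: features (x, u, bias)

# The first two feature columns should recover A_true.
print(np.allclose(W[:, :2], A_true, atol=0.05))
```

In the paper's setting, the fitted model is then used inside a model predictive controller with an optimism bonus that steers exploration toward promising regions.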

    Actor-Critic Reinforcement Learning for Control with Stability Guarantee

    Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, model-free RL that relies solely on data cannot guarantee stability. From a control-theoretic perspective, stability is the most important property of any control system, since it is closely related to the safety, robustness, and reliability of robotic systems. In this paper, we propose an actor-critic RL framework for control that guarantees closed-loop stability by employing the classic Lyapunov method from control theory. First, a data-based stability theorem is proposed for stochastic nonlinear systems modeled by a Markov decision process. We then show that the stability condition can be exploited as the critic in actor-critic RL to learn a controller/policy. Finally, the effectiveness of our approach is evaluated on several well-known 3-dimensional robot control tasks and a synthetic-biology gene-network tracking task in three popular physics simulation platforms. As an empirical evaluation of the advantage of stability, we show that the learned policies enable the systems to recover to the equilibrium or waypoints, to a certain extent, when perturbed by uncertainties such as parametric variations and external disturbances.
    Comment: IEEE RA-L + IROS 202
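The data-based stability condition described above, a Lyapunov-style decrease requirement checked over sampled transitions, can be sketched as a penalty that a policy optimizer would drive to zero. All names, the quadratic candidate, and the toy contracting dynamics below are illustrative, not the paper's exact formulation:

```python
import numpy as np

# Sketch of a sampled Lyapunov decrease condition used as a critic signal:
# along transitions (x, x'), a candidate L should satisfy
#   L(x') - L(x) <= -alpha * c(x).
alpha = 0.1

def L(x):
    # Hypothetical quadratic Lyapunov candidate.
    return float(x @ x)

def cost(x):
    return float(x @ x)

def lyapunov_violation(transitions):
    """Average positive part of the decrease condition over sampled data.

    Zero means the condition holds on every sample; an actor-critic
    optimizer would minimize this alongside the usual RL objective.
    """
    v = [max(0.0, L(x2) - L(x1) + alpha * cost(x1)) for x1, x2 in transitions]
    return sum(v) / len(v)

# Contracting dynamics x' = 0.9 x satisfy the condition for this alpha,
# since L(x') - L(x) + alpha*c(x) = (0.81 - 1 + 0.1)|x|^2 <= 0.
rng = np.random.default_rng(1)
data = [(x, 0.9 * x) for x in (rng.normal(size=2) for _ in range(100))]

print(lyapunov_violation(data))
```

A learned policy that keeps this violation term at zero on the sampled data is the mechanism by which the framework ties the critic to closed-loop stability.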

    Neural network control of a rehabilitation robot by state and output feedback

    In this paper, neural network control is presented for a rehabilitation robot with unknown system dynamics. To deal with the system uncertainties and improve robustness, adaptive neural networks are used to approximate the unknown model of the robot and adapt to interactions between the robot and the patient. Both full-state feedback control and output feedback control are considered. With the proposed control, uniform ultimate boundedness of the closed-loop system is achieved using Lyapunov stability theory and its associated techniques. The state of the system is proven to converge to a small neighborhood of zero by appropriately choosing the design parameters. Extensive simulations for a rehabilitation robot with constraints are carried out to illustrate the effectiveness of the proposed control.
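The function-approximation mechanism behind such adaptive neural controllers, a network of radial basis functions whose weights are adjusted online, can be sketched as below. The Gaussian basis, the grid of centers, and the simple gradient update are illustrative assumptions; the paper uses a Lyapunov-based adaptive law rather than this plain LMS rule:

```python
import numpy as np

# Sketch of an RBF network approximating an unknown scalar function:
#   f_hat(x) = W @ s(x),  s_i(x) = exp(-(x - c_i)^2 / (2 * width^2)).
centers = np.linspace(-2.0, 2.0, 9)   # hypothetical basis centers on a grid
width = 0.5

def basis(x):
    return np.exp(-((x - centers) ** 2) / (2 * width ** 2))

# Adapt weights online against a toy "unknown" dynamics term.
f_unknown = lambda x: np.sin(2.0 * x)
W = np.zeros(len(centers))
lr = 0.2
rng = np.random.default_rng(2)
for _ in range(5000):
    x = rng.uniform(-2.0, 2.0)
    e = f_unknown(x) - W @ basis(x)   # approximation error at the sample
    W += lr * e * basis(x)            # gradient-style weight update

x_test = 0.7
err = abs(f_unknown(x_test) - W @ basis(x_test))
print(err < 0.2)
```

In the controller, such an approximator stands in for the unknown robot model, and the weight-update law is chosen so the Lyapunov analysis yields uniform ultimate boundedness.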

    Admittance-based controller design for physical human-robot interaction in the constrained task space

    In this article, an admittance-based controller for physical human-robot interaction (pHRI) is presented to perform coordinated operation in a constrained task space. An admittance model and a soft saturation function are employed to generate a differentiable reference trajectory, ensuring that the end-effector motion of the manipulator complies with the human operation and avoids collisions with the surroundings. An adaptive neural network (NN) controller involving an integral barrier Lyapunov function (IBLF) is then designed to handle tracking while guaranteeing that the end-effector of the manipulator remains within the constrained task space. A learning method based on the radial basis function NN (RBFNN) is incorporated into the controller design to compensate for dynamic uncertainties and improve tracking performance; the IBLF method prevents violations of the constrained task space. We prove that all states of the closed-loop system are semiglobally uniformly ultimately bounded (SGUUB) using Lyapunov stability principles. Finally, the effectiveness of the proposed algorithm is verified on a Baxter robot experimental platform.
    Note to Practitioners: This work is motivated by the neglect of safety in existing controller designs for physical human-robot interaction (pHRI), which arises in industry and services such as assembly and medical care. Rigorous handling of constraints is strongly needed in controller design. Therefore, in this article, we propose a novel admittance-based human-robot interaction controller. The developed controller has the following functionalities: 1) ensuring the reference trajectory remains in the constrained task space: a differentiable reference trajectory is shaped by the desired admittance model and a soft saturation function; 2) handling uncertainties in the robot dynamics: a learning approach based on a radial basis function neural network (RBFNN) is involved in the controller design; and 3) ensuring the end-effector of the manipulator remains in the constrained task space: unlike other barrier Lyapunov functions (BLFs), the integral BLF (IBLF) constrains the system output directly rather than the tracking error, which may be more convenient for controller designers. The controller can potentially be applied in many areas. First, it can be used in rehabilitation robots to avoid injuring the patient by limiting the motion. Second, it can keep the end-effector of an industrial manipulator within a prescribed task region; in some industrial tasks, dangerous or fragile tools are mounted on the end-effector, and moving outside the prescribed region can injure humans and damage the robot. Third, it may suggest a new approach to controller design for avoiding collisions in pHRI when collisions occur along the prescribed end-effector trajectory.
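The reference-generation idea in functionality 1), a virtual mass-damper admittance model driven by the measured human force, followed by a differentiable saturation that keeps the reference inside the task-space bound, can be sketched as follows. The parameter values and the tanh saturation are illustrative assumptions, and saturating the integrated position is a simplification of the paper's scheme:

```python
import numpy as np

# Sketch of admittance-based reference generation with soft saturation:
# a mass-damper admittance model  M * x_r'' + D * x_r' = f_h  maps the
# human force f_h to a reference x_r, which a smooth saturation keeps
# inside the task-space bound |x_r| <= bound.
M, D = 2.0, 8.0     # hypothetical virtual mass and damping
bound = 0.5         # task-space limit
dt = 0.01

def soft_sat(x):
    # Differentiable saturation onto (-bound, bound).
    return bound * np.tanh(x / bound)

x, v = 0.0, 0.0
for _ in range(1000):
    f_h = 5.0                 # constant push from the human operator
    a = (f_h - D * v) / M     # admittance dynamics
    v += a * dt
    x += v * dt               # raw admittance output grows without limit...

x_ref = soft_sat(x)           # ...but the saturated reference stays bounded
print(abs(x_ref) < bound)
```

The tracking controller (functionalities 2 and 3) then follows this bounded, differentiable reference, with the IBLF enforcing the task-space constraint on the actual end-effector.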