7,050 research outputs found

Actor-Critic Reinforcement Learning for Control with Stability Guarantee

Author: Han Minghao
Pan Wei
Wang Jun
Zhang Lixian
Publication date: 15/07/2020

Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, stability is not guaranteed in model-free RL when relying solely on data. From a control-theoretic perspective, stability is the most important property for any control system, since it is closely related to safety, robustness, and reliability of robotic systems. In this paper, we propose an actor-critic RL framework for control which can guarantee closed-loop stability by employing the classic Lyapunov method from control theory. First, a data-based stability theorem is proposed for stochastic nonlinear systems modeled by a Markov decision process. Then we show that the stability condition can be exploited as the critic in actor-critic RL to learn a controller/policy. Finally, the effectiveness of our approach is evaluated on several well-known 3-dimensional robot control tasks and a synthetic biology gene network tracking task in three different popular physics simulation platforms. As an empirical evaluation of the advantage of stability, we show that the learned policies can enable the systems to recover, to a certain extent, to the equilibrium or way-points when perturbed by uncertainties such as system parametric variations and external disturbances.

Comment: IEEE RA-L + IROS 202

arXiv.org e-Print Archive

UCL Discovery

The University of Manchester - Institutional Repository

Visual Closed-Loop Control for Pouring Liquids

Author: Fox Dieter
Schenck Connor
Publication date: 25/02/2017

Pouring a specific amount of liquid is a challenging task. In this paper we develop methods for robots to use visual feedback to perform closed-loop control for pouring liquids. We propose both a model-based and a model-free method utilizing deep learning for estimating the volume of liquid in a container. Our results show that the model-free method is better able to estimate the volume. We combine this with a simple PID controller to pour specific amounts of liquid, and show that the robot is able to achieve an average deviation of 38 ml from the target amount. To our knowledge, this is the first use of raw visual feedback to pour liquids in robotics.

Comment: To appear at ICRA 201

arXiv.org e-Print Archive

Crossref

A Developmental Organization for Robot Behavior

Author: Grupen Roderic A.
Publication venue: Lund University Cognitive Studies
Publication date: 01/01/2003

This paper focuses on exploring how learning and development can be structured in synthetic (robot) systems. We present a developmental assembler for constructing reusable and temporally extended actions in a sequence. The discussion adopts the traditions of dynamic pattern theory in which behavior is an artifact of coupled dynamical systems with a number of controllable degrees of freedom. In our model, the events that delineate control decisions are derived from the pattern of (dis)equilibria on a working subset of sensorimotor policies. We show how this architecture can be used to accomplish sequential knowledge gathering and representation tasks and provide examples of the kind of developmental milestones that this approach has already produced in our lab.

CiteSeerX

The 1990 progress report and future plans

Author: Compton Michael
Friedland Peter
Zweben Monte

This document describes the progress and plans of the Artificial Intelligence Research Branch (RIA) at ARC in 1990. Activities span a range from basic scientific research to engineering development and to fielded NASA applications, particularly those applications that are enabled by basic research carried out at RIA. Work is conducted in-house and through collaborative partners in academia and industry. Our major focus is on a limited number of research themes with a dual commitment to technical excellence and proven applicability to NASA's short-, medium-, and long-term problems. RIA acts as the Agency's lead organization for research aspects of artificial intelligence, working closely with a second research laboratory at JPL and AI applications groups at all NASA centers.

NASA Technical Reports Server

Morphological properties of mass-spring networks for optimal locomotion learning

Author: Carette Benonie
Dambre Joni
Degrave Jonas
Urbain Gabriel
wyffels Francis
Publication venue: Frontiers Media SA
Publication date: 01/01/2017

Robots have proven very useful in automating industrial processes. Their rigid components and powerful actuators, however, render them unsafe or unfit to work in normal human environments such as schools or hospitals. Robots made of compliant, softer materials may offer a valid alternative. Yet, the dynamics of these compliant robots are much more complicated than those of normal rigid robots, in which all components can be accurately controlled. It is often claimed that, by using the concept of morphological computation, the dynamical complexity can become a strength. On the one hand, the use of flexible materials can lead to higher power efficiency and more fluent and robust motions. On the other hand, using embodiment in a closed-loop controller, part of the control task itself can be outsourced to the body dynamics. This can significantly simplify the additional resources required for locomotion control. To this end, a first step consists in an exploration of the trade-offs between morphology, efficiency of locomotion, and the ability of a mechanical body to serve as a computational resource. In this work, we use a detailed dynamical model of a Mass–Spring–Damper (MSD) network to study these trade-offs. We first investigate the influence of the network size and compliance on locomotion quality and energy efficiency by optimizing an external open-loop controller using evolutionary algorithms. We find that larger networks can lead to more stable gaits and that the system’s optimal compliance to maximize the traveled distance is directly linked to the desired frequency of locomotion. In the last set of experiments, the suitability of MSD bodies for use in a closed loop is also investigated. Since maximally efficient actuator signals are clearly related to the natural body dynamics, in a sense, the body is tailored for the task of contributing to its own control. Using the same simulation platform, we therefore study how the network states can be successfully used to create a feedback signal and how its accuracy is linked to the body size.

Crossref

Ghent University Academic Bibliography

Frontiers - Publisher Connector

PubMed Central

Visual control of flight speed in Drosophila melanogaster

Author: Dickinson Michael H.
Fry Steven N.
Rohrseitz Nicola
Straw Andrew D.
Publication venue: The Company of Biologists
Publication date: 01/03/2009

Flight control in insects depends on self-induced image motion (optic flow), which the visual system must process to generate appropriate corrective steering maneuvers. Classic experiments in tethered insects applied rigorous system identification techniques for the analysis of turning reactions in the presence of rotating pattern stimuli delivered in open loop. However, the functional relevance of these measurements for visual free-flight control remains equivocal due to the largely unknown effects of the highly constrained experimental conditions. To perform a systems analysis of the visual flight speed response under free-flight conditions, we implemented a 'one-parameter open-loop' paradigm using 'TrackFly' in a wind tunnel equipped with real-time tracking and virtual reality display technology. Upwind flying flies were stimulated with sine gratings of varying temporal and spatial frequencies, and the responses were measured from the resulting flight speed reactions. To control flight speed, the visual system of the fruit fly extracts linear pattern velocity robustly over a broad range of spatio–temporal frequencies. The speed signal is used for a proportional control of flight speed within locomotor limits. The extraction of pattern velocity over a broad spatio–temporal frequency range may require more sophisticated motion processing mechanisms than those identified in flies so far. In Drosophila, the neuromotor pathways underlying flight speed control may be suitably explored by applying advanced genetic techniques, for which our data can serve as a baseline. Finally, the high-level control principles identified in the fly can be meaningfully transferred into a robotic context, such as for the robust and efficient control of autonomous flying micro air vehicles.

Caltech Authors

ZORA