
    Actor-Critic Reinforcement Learning for Control with Stability Guarantee

    Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, model-free RL that relies solely on data does not guarantee stability. From a control-theoretic perspective, stability is the most important property of any control system, since it is closely related to the safety, robustness, and reliability of robotic systems. In this paper, we propose an actor-critic RL framework for control which can guarantee closed-loop stability by employing the classic Lyapunov's method from control theory. First, a data-based stability theorem is proposed for stochastic nonlinear systems modeled by a Markov decision process. Then we show that the stability condition can be exploited as the critic in actor-critic RL to learn a controller/policy. Finally, the effectiveness of our approach is evaluated on several well-known 3-dimensional robot control tasks and a synthetic biology gene network tracking task in three different popular physics simulation platforms. As an empirical evaluation of the advantage of stability, we show that the learned policies enable the systems to recover, to a certain extent, to the equilibrium or way-points when perturbed by uncertainties such as system parametric variations and external disturbances. Comment: IEEE RA-L + IROS 202

    Recent advances on filtering and control for nonlinear stochastic complex systems with incomplete information: A survey

    This Article is provided by the Brunel Open Access Publishing Fund - Copyright © 2012 Hindawi Publishing. Some recent advances on the filtering and control problems for nonlinear stochastic complex systems with incomplete information are surveyed. The incomplete information under consideration mainly includes missing measurements, randomly varying sensor delays, signal quantization, sensor saturations, and signal sampling. With such incomplete information, the developments on various filtering and control issues are reviewed in great detail. In particular, the addressed nonlinear stochastic complex systems are so comprehensive that they include conventional nonlinear stochastic systems, different kinds of complex networks, and a large class of sensor networks. The corresponding filtering and control technologies for such nonlinear stochastic complex systems are then discussed. Subsequently, some of the latest results on the filtering and control problems for complex systems with incomplete information are given. Finally, conclusions are drawn and several possible future research directions are pointed out. This work was supported in part by the National Natural Science Foundation of China under Grant nos. 61134009, 61104125, 61028008, 61174136, 60974030, and 61074129, the Qing Lan Project of Jiangsu Province of China, the Project sponsored by SRF for ROCS of SEM of China, the Engineering and Physical Sciences Research Council EPSRC of the UK under Grant GR/S27658/01, the Royal Society of the UK, and the Alexander von Humboldt Foundation of Germany.

    Future capacity growth of energy technologies: are scenarios consistent with historical evidence?

    Future scenarios of the energy system under greenhouse gas emission constraints depict dramatic growth in a range of energy technologies. Technological growth dynamics observed historically provide a useful comparator for these future trajectories. We find that historical time series data reveal a consistent relationship between how much a technology’s cumulative installed capacity grows, and how long this growth takes. This relationship between extent (how much) and duration (for how long) is consistent across both energy supply and end-use technologies, and both established and emerging technologies. We then develop and test an approach for using this historical relationship to assess technological trajectories in future scenarios. Our approach for “learning from the past” contributes to the assessment and verification of integrated assessment and energy-economic models used to generate quantitative scenarios. Using data on power generation technologies from two such models, we also find a consistent extent-duration relationship across both technologies and scenarios. This relationship describes future low carbon technological growth in the power sector which appears to be conservative relative to what has been evidenced historically. Specifically, future extents of capacity growth are comparatively low given the lengthy time duration of that growth. We treat this finding with caution due to the low number of data points. Yet it remains counter-intuitive given the extremely rapid growth rates of certain low carbon technologies under stringent emission constraints. We explore possible reasons for the apparent scenario conservatism, and find parametric or structural conservatism in the underlying models to be one possible explanation.