
    Actor-Critic Reinforcement Learning for Control with Stability Guarantee

    Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, model-free RL that relies solely on data does not guarantee stability. From a control-theoretic perspective, stability is the most important property of any control system, since it is closely related to the safety, robustness, and reliability of robotic systems. In this paper, we propose an actor-critic RL framework for control that guarantees closed-loop stability by employing the classic Lyapunov method from control theory. First, a data-based stability theorem is proposed for stochastic nonlinear systems modeled as Markov decision processes. We then show that the stability condition can be exploited as the critic in actor-critic RL to learn a controller/policy. Finally, the effectiveness of our approach is evaluated on several well-known three-dimensional robot control tasks and a synthetic biology gene network tracking task, using three popular physics simulation platforms. As an empirical evaluation of the advantage of stability, we show that the learned policies enable the systems to recover to the equilibrium or to way-points, to a certain extent, when perturbed by uncertainties such as system parametric variations and external disturbances. Comment: IEEE RA-L + IROS 202
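The core idea above, using a Lyapunov decrease condition as the critic signal, can be illustrated with a toy sketch. This is not the paper's algorithm: the scalar system, candidate function L(s) = s², cost c(s) = s², and margin alpha are all hypothetical, chosen only to show what a sampled stability condition E[L(s')] − L(s) ≤ −α·c(s) looks like in code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: stable stochastic dynamics s' = 0.8*s + noise,
# Lyapunov candidate L(s) = s^2, cost c(s) = s^2.
def lyapunov_decrease_margin(states, alpha=0.1, n_samples=200):
    """Return E[L(s')] - L(s) + alpha*c(s) per state; <= 0 means the
    sampled Lyapunov decrease condition holds there."""
    margins = []
    for s in states:
        next_states = 0.8 * s + 0.01 * rng.standard_normal(n_samples)
        expected_L_next = np.mean(next_states**2)
        margins.append(expected_L_next - s**2 + alpha * s**2)
    return np.array(margins)

states = np.linspace(-2.0, 2.0, 9)
margins = lyapunov_decrease_margin(states)
# Away from the equilibrium, every sampled margin should be negative.
print(np.all(margins[np.abs(states) > 0.1] <= 0.0))
```

In the paper's framework this sampled condition is what the critic estimates from data, and the actor is updated to keep it satisfied.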

    Mathematical problems for complex networks

    Copyright @ 2012 Zidong Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. This article is made available through the Brunel Open Access Publishing Fund. Complex networks are everywhere in our lives. The brain is a neural network. The global economy is a network of national economies. Computer viruses routinely spread through the Internet. Food webs, ecosystems, and metabolic pathways can all be represented as networks. Energy is distributed through transportation networks in living organisms, man-made infrastructures, and other physical systems. Dynamic behaviors of complex networks, such as stability, periodic oscillation, bifurcation, and even chaos, are ubiquitous in the real world and often reconfigurable. Networks have been studied in the context of dynamical systems across a range of disciplines. Until recently, however, there has been relatively little work that treats dynamics as a function of network structure, where the states of both the nodes and the edges can change and the topology of the network itself often evolves in time. Several major problems have not been fully investigated, such as stability, synchronization, and chaos control for complex networks, as well as their applications in, for example, communication and bioinformatics.

    Stability analysis of a noise-induced Hopf bifurcation

    We study analytically and numerically the noise-induced transition between an absorbing and an oscillatory state in a Duffing oscillator subject to multiplicative Gaussian white noise. We show in a non-perturbative manner that a stochastic bifurcation occurs when the Lyapunov exponent of the linearised system becomes positive. From a simple formula for the Lyapunov exponent we deduce the phase diagram of the stochastic Duffing oscillator. The behaviour of physical observables, such as the oscillator's mean energy, is studied both close to and far from the bifurcation. Comment: 10 pages, 8 figures
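The sign-change criterion above can be demonstrated on the simplest linearised model with multiplicative noise. This is an illustrative sketch, not the paper's Duffing computation: for the linear SDE dX = aX dt + σX dW, the Itô correction gives the exact Lyapunov exponent λ = a − σ²/2, and a Monte-Carlo estimate of the growth rate of log|X| should reproduce it.

```python
import numpy as np

rng = np.random.default_rng(1)

def lyapunov_exponent_mc(a, sigma, T=200.0, dt=1e-2, n_paths=50):
    """Estimate the Lyapunov exponent of dX = a*X dt + sigma*X dW by
    evolving log|X| (drift a - sigma^2/2 after the Ito correction)."""
    n = int(T / dt)
    logx = np.zeros(n_paths)
    for _ in range(n):
        logx += (a - 0.5 * sigma**2) * dt \
                + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return logx.mean() / T

a, sigma = 0.3, 1.0
lam_est = lyapunov_exponent_mc(a, sigma)
lam_exact = a - 0.5 * sigma**2          # here -0.2: noise stabilises the state
print(abs(lam_est - lam_exact) < 0.05)
```

With a = 0.3 and σ = 1 the deterministic system is unstable but the exponent is negative, the same mechanism by which the noise controls the absorbing/oscillatory transition in the paper.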

    Hamiltonian and Brownian systems with long-range interactions

    We discuss the dynamics and thermodynamics of systems with long-range interactions. We contrast the microcanonical description of an isolated Hamiltonian system with the canonical description of a stochastically forced Brownian system. We show that the mean-field approximation is exact in a proper thermodynamic limit. The equilibrium distribution function is the solution of an integrodifferential equation obtained from a static BBGKY-like hierarchy. It also optimizes a thermodynamical potential (entropy or free energy) under appropriate constraints. We discuss the kinetic theory of these systems. In the N → +∞ limit, a Hamiltonian system is described by the Vlasov equation. To order 1/N, the collision term of a homogeneous system has the form of the Lenard-Balescu operator; it reduces to the Landau operator when collective effects are neglected. We also consider the motion of a test particle in a bath of field particles and derive the general form of the Fokker-Planck equation. The diffusion coefficient is anisotropic and depends on the velocity of the test particle, which can lead to anomalous diffusion. For Brownian systems, in the N → +∞ limit, the kinetic equation is a non-local Kramers equation. In the strong friction limit ξ → +∞, or for large times t ≫ ξ⁻¹, it reduces to a non-local Smoluchowski equation. We give explicit results for self-gravitating systems, two-dimensional vortices, and the HMF model. We also introduce a generalized class of stochastic processes and derive the corresponding generalized Fokker-Planck equations. We discuss how a notion of generalized thermodynamics can emerge in complex systems displaying anomalous diffusion. Comment: The original paper has been split in two parts with some new material and corrections
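The strong-friction regime mentioned above can be illustrated with a minimal simulation. This is a sketch under assumed conventions (mass and Boltzmann constant set to 1, no interactions): the velocity of a single Brownian particle obeys dv = −ξv dt + √(2ξT) dW, and for times t ≫ ξ⁻¹ it relaxes to the Maxwellian with variance T, the regime in which a Kramers description reduces to a Smoluchowski equation for the positions alone.

```python
import numpy as np

rng = np.random.default_rng(2)

# Langevin velocity dynamics dv = -xi*v dt + sqrt(2*xi*T) dW (m = k_B = 1).
xi, T_bath = 5.0, 0.7
dt, n_steps, n_particles = 2e-3, 5000, 5000   # total time 10 >> 1/xi = 0.2
v = np.zeros(n_particles)
for _ in range(n_steps):
    v += -xi * v * dt \
         + np.sqrt(2.0 * xi * T_bath * dt) * rng.standard_normal(n_particles)

# After many relaxation times 1/xi, the velocity variance equals T.
var_est = v.var()
print(abs(var_est - T_bath) < 0.05)
```

The same separation of time scales is what justifies eliminating the velocity variable in the non-local Smoluchowski limit discussed in the abstract.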

    Stochastically Resilient Observer Design for a Class of Continuous-Time Nonlinear Systems

    This work addresses the design of stochastically resilient, or non-fragile, continuous-time Luenberger observers for systems with incrementally conic nonlinearities. Such designs maintain convergence and/or performance when the observer gain is erroneously implemented, due, for example, to round-off errors in computing the gain or to changes in the observer parameters during operation. The error in the observer gain is modeled as a random process, and a common linear matrix inequality (LMI) formulation is presented to address the stochastically resilient observer design problem for a variety of performance criteria. Numerical examples illustrate the theoretical results.
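The resilience property described above can be sketched numerically. This toy example is not from the paper: the system matrices and nominal gain L below are hypothetical, chosen so that A − LC is stable, and the implementation error is modeled as a small random perturbation dL added to the gain at every step of a discrete-time Luenberger observer.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 2-state discrete-time system and nominal observer gain.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.5], [0.2]])       # A - L*C has eigenvalues inside the unit disk

def estimation_error(dL_scale, n_steps=200):
    """Run observer xhat+ = A xhat + B u + (L + dL)(y - C xhat) with a
    random gain perturbation dL each step; return final state error norm."""
    x = np.array([1.0, -1.0])
    xhat = np.zeros(2)
    for _ in range(n_steps):
        u = 0.1
        y = C @ x
        dL = dL_scale * rng.standard_normal((2, 1))
        xhat = A @ xhat + (B * u).ravel() + ((L + dL) @ (y - C @ xhat)).ravel()
        x = A @ x + (B * u).ravel()
    return np.linalg.norm(x - xhat)

# A small random gain error should not destroy convergence.
print(estimation_error(0.0) < 1e-6, estimation_error(0.05) < 1e-3)
```

The paper's LMI machinery is what certifies this kind of robustness in advance, rather than checking it by simulation as done here.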

    Robust passivity and passification of stochastic fuzzy time-delay systems

    In this paper, the passivity and passification problems are investigated for a class of uncertain stochastic fuzzy systems with time-varying delays. The fuzzy system is based on the Takagi–Sugeno (T–S) model, which is often used to represent complex nonlinear systems in terms of fuzzy sets and fuzzy reasoning. To reflect more realistic dynamical behavior, both parameter uncertainties and stochastic disturbances are considered, where the parameter uncertainties enter all the system matrices and the stochastic disturbances are given in the form of a Brownian motion. We first propose a definition of robust passivity in the sense of expectation. Then, by utilizing the Lyapunov functional method, the Itô differential rule, and matrix analysis techniques, we establish several sufficient criteria such that, for all admissible parameter uncertainties and stochastic disturbances, the closed-loop stochastic fuzzy time-delay system is robustly passive in the sense of expectation. The derived criteria, which are either delay-independent or delay-dependent, are expressed in terms of linear matrix inequalities (LMIs) that can be easily checked using standard numerical software. Illustrative examples demonstrate the effectiveness and usefulness of the proposed results. This work was supported by the Teaching and Research Fund for Excellent Young Teachers at Southeast University of China, the Specialized Research Fund for the Doctoral Program of Higher Education for New Teachers 200802861044, the National Natural Science Foundation of China under Grant 60804028, and the Royal Society of the United Kingdom.
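The passivity notion underlying the abstract can be illustrated on the simplest possible deterministic example (not the paper's stochastic fuzzy setting): the scalar system ẋ = −x + u with output y = x is passive with storage function V(x) = x²/2, so the supplied energy ∫u·y dt must dominate the increase in storage along any trajectory. A forward-Euler simulation checks this inequality for an arbitrary bounded input.

```python
import numpy as np

# Passivity check for xdot = -x + u, y = x, storage V(x) = x^2 / 2:
# along any trajectory, integral(u*y) >= V(x(T)) - V(x(0)).
dt, n = 1e-3, 5000
x, V0 = 0.0, 0.0
supplied = 0.0                      # running integral of u*y
t = 0.0
for _ in range(n):
    u = np.sin(2.0 * t)             # arbitrary bounded test input
    y = x
    supplied += u * y * dt
    x += (-x + u) * dt
    t += dt
# Small tolerance absorbs the Euler discretisation error.
print(supplied >= 0.5 * x**2 - V0 - 1e-9)
```

The paper's LMI criteria guarantee the expectation-valued analogue of this inequality for the whole uncertain stochastic fuzzy class, rather than verifying it trajectory by trajectory.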

    A survey on gain-scheduled control and filtering for parameter-varying systems

    Copyright © 2014 Guoliang Wei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. This paper presents an overview of recent developments in gain-scheduled control and filtering for parameter-varying systems. First, we recall several important algorithms suitable for gain scheduling, including gain-scheduled proportional-integral-derivative (PID) control, H₂, H∞, and mixed H₂/H∞ gain-scheduling methods, as well as fuzzy gain-scheduling techniques. Second, various important parameter-varying system models are reviewed, for which gain-scheduled control and filtering issues are usually dealt with. In particular, in view of randomly occurring phenomena with time-varying probability distributions, some results of our recent work based on probability-dependent gain-scheduling methods are reviewed. Furthermore, some of the latest progress in this area is discussed. Finally, conclusions are drawn and several potential future research directions are outlined. The National Natural Science Foundation of China under Grants 61074016, 61374039, 61304010, and 61329301; the Natural Science Foundation of Jiangsu Province of China under Grant BK20130766; the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning; the Program for New Century Excellent Talents in University under Grant NCET-11-1051; the Leverhulme Trust of the U.K.; the Alexander von Humboldt Foundation of Germany.
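The basic gain-scheduling idea surveyed above can be shown with a toy example. The plant, grid, and gains below are illustrative, not taken from the survey: for the parameter-varying scalar system ẋ = ρx + u, a proportional gain kp(ρ) is designed at a few operating points and interpolated over the measured scheduling parameter ρ so that the closed loop ẋ = (ρ − kp(ρ))x stays stable across the whole operating range.

```python
import numpy as np

# Design grid: at each operating point, kp is chosen so rho - kp = -1.5 < 0,
# and linear interpolation preserves this margin between grid points.
rho_grid = np.array([-1.0, 0.0, 1.0, 2.0])
kp_grid = np.array([0.5, 1.5, 2.5, 3.5])

def scheduled_gain(rho):
    """Interpolate the proportional gain over the scheduling parameter."""
    return np.interp(rho, rho_grid, kp_grid)

def closed_loop(rho, x0=1.0, dt=1e-2, n=1000):
    """Simulate xdot = rho*x - kp(rho)*x at a frozen operating point."""
    x = x0
    for _ in range(n):
        u = -scheduled_gain(rho) * x
        x += (rho * x + u) * dt
    return abs(x)

# The scheduled loop contracts at every operating point in the range.
print(all(closed_loop(r) < 1e-2 for r in np.linspace(-1.0, 2.0, 7)))
```

Frozen-parameter stability at each grid point is only a starting point; the LPV methods in the survey additionally account for the parameter varying in time.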