100 research outputs found

    Model-Free δ-Policy Iteration Based on Damped Newton Method for Nonlinear Continuous-Time H∞ Tracking Control

    This paper presents a δ-PI algorithm based on the damped Newton method for the H∞ tracking control problem of unknown continuous-time nonlinear systems. A discounted performance function and an augmented system are used to obtain the tracking Hamilton-Jacobi-Isaacs (HJI) equation. The tracking HJI equation is a nonlinear partial differential equation. Traditional reinforcement learning methods for solving it are mostly based on the Newton method, which usually guarantees only local convergence and needs a good initial guess. A generalized tracking Bellman equation is first derived from the damped Newton iteration operator equation. The δ-PI algorithm can then seek the optimal solution of the tracking HJI equation by iteratively solving the generalized tracking Bellman equation. Both on-policy and off-policy δ-PI reinforcement learning methods are provided. The off-policy δ-PI algorithm is model-free: it can be performed without a priori knowledge of the system dynamics. An NN-based implementation scheme for the off-policy δ-PI algorithm is shown. The suitability of the model-free δ-PI algorithm is illustrated with a nonlinear system simulation. Comment: 10 pages, 8 figures
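
The abstract's contrast between the Newton method (local convergence, good initial guess required) and its damped variant can be illustrated on a scalar root-finding problem. The sketch below is not the paper's δ-PI algorithm: the function `newton`, the backtracking rule, the divergence guard, and the test equation arctan(x) = 0 are all assumptions chosen for illustration, and the step factor is called `delta` only by analogy with the paper's δ.

```python
import math


def newton(f, df, x0, damping=False, tol=1e-10, max_iter=50):
    """Solve f(x) = 0 by Newton's method, optionally with a damped step.

    Illustrative sketch only; returns (last iterate, converged flag).
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, True
        step = -fx / df(x)  # full Newton step
        delta = 1.0         # step factor; delta = 1 is the undamped method
        if damping:
            # Backtracking: halve delta until the residual |f| decreases.
            while abs(f(x + delta * step)) >= abs(fx) and delta > 1e-8:
                delta *= 0.5
        x = x + delta * step
        # Assumed divergence guard: give up once the iterate blows up.
        if not math.isfinite(x) or abs(x) > 1e12:
            return x, False
    return x, abs(f(x)) < tol


def f(x):
    return math.atan(x)


def df(x):
    return 1.0 / (1.0 + x * x)


# A poor initial guess: the undamped iteration overshoots further each step.
x_plain, ok_plain = newton(f, df, 3.0, damping=False)
x_damped, ok_damped = newton(f, df, 3.0, damping=True)
print("undamped converged:", ok_plain)
print("damped converged:  ", ok_damped, "root estimate:", x_damped)
```

From the same starting point, the undamped iteration leaves the region of convergence while the damped iteration shortens each step until the residual decreases. This is the general reason a damped scheme tolerates a worse initial guess.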

    Multi-H∞ controls for unknown input-interference nonlinear system with reinforcement learning

    This article studies multi-H∞ controls for input-interference nonlinear systems via the adaptive dynamic programming (ADP) method, which allows each of multiple inputs to have an individual selfish component of the strategy to resist weighted interference. The ADP scheme is used to learn the Nash-optimization solutions of the input-interference nonlinear system such that multiple H∞ performance indices can reach the defined Nash equilibrium. First, the input-interference nonlinear system is given and the Nash equilibrium is defined. An adaptive neural network (NN) observer is introduced to identify the input-interference nonlinear dynamics. Then, critic NNs are used to learn the multiple H∞ performance indices. A novel adaptive law is designed to update the critic NN weights by minimizing the Hamilton-Jacobi-Isaacs (HJI) equation, which can be used to directly calculate the multi-H∞ controls from input-output data such that the actor structure is avoided. Moreover, the control system stability and the convergence of the updated parameters are proved. Finally, two numerical examples are simulated to verify the proposed ADP scheme for the input-interference nonlinear system.

    Advances in Reinforcement Learning

    Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research in several fields associated with RL, which has been growing rapidly and producing a wide variety of learning algorithms for different applications. In 24 chapters, it covers a broad variety of topics in RL and their application in autonomous systems. Some chapters provide a general overview of RL, while others focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine and Industrial Logistics.

    Cooperative Strategies for Management of Power Quality Problems in Voltage-Source Converter-based Microgrids

    The development of cooperative control strategies for microgrids has become an area of increasing research interest in recent years, often as a result of advances in other areas of control theory such as multi-agent systems, and enabled by emerging wireless communications technology, machine learning techniques, and power electronics. While some possible applications of cooperative control theory to microgrids have been described in the research literature, a comprehensive survey of this approach with respect to its limitations and wide-ranging potential applications has not yet been provided. In this regard, an important area of research into microgrids is developing intelligent cooperative operating strategies within and between microgrids which implement and allocate tasks at the local level and do not rely on centralized command and control structures. Multi-agent techniques are one focus of this research, but have not been applied to the full range of power quality problems in microgrids. The management of harmonics, unbalance, flicker, and black start capability are some examples of applications yet to be fully exploited. During islanded operation, the normal buffer against disturbances and power imbalances provided by the main grid coupling is removed. This, together with the reduced inertia of the microgrid (MG), makes power quality (PQ) management a critical control function. This research will investigate new cooperative control techniques for solving power quality problems in voltage source converter (VSC)-based AC microgrids. A set of specific power quality problems has been selected for the application focus, based on a survey of relevant published literature, international standards, and electricity utility regulations. The control problems which will be addressed are voltage regulation, unbalanced load sharing, and flicker mitigation.
The thesis introduces novel approaches based on multi-agent consensus problems and differential games. The management of harmonics, which is a more challenging issue, was excluded and is left as the focus of future research. Rather than using model-based engineering design for optimization of controller parameters, the thesis describes a novel technique for controller synthesis using off-policy reinforcement learning. The thesis also addresses the topic of communication and control system co-design. In this regard, the stability of secondary voltage control considering communication time-delays will be addressed, while a performance-oriented approach to rate allocation using a novel solution method based on convex optimization is described.

    Active Suppression of Aerofoil Flutter via Neural-Network-Based Adaptive Nonlinear Optimal Control

    This thesis deals with active flutter suppression (AFS) on aerofoils via adaptive nonlinear optimal control using neural networks (NNs). Aeroelastic flutter can damage aerofoils if not properly controlled. AFS not only ensures flutter-free flight but also enables the use of aerodynamically more efficient lightweight aerofoils. However, existing optimal controllers for AFS are generally susceptible to modelling errors while other controllers less prone to uncertainties do not provide optimal control. This thesis, thus, aims to reduce the impact of the dilemma by proposing new solutions based on nonlinear optimal control online synthesis (NOCOS) according to online updated dynamics. Existing NOCOS methods, with NNs as essential elements, require a separate initial stabilising control law for the overall system, an additional stabilising tuning loop for the actor NN, or an additional stabilising term in the critic NN tuning law, to guarantee the closed-loop stability for unstable and marginally stable systems. The resulting complexity is undesired in AFS applications due to computational concerns in real-time implementation. Moreover, the existing NOCOS methods are confined to locally nonlinear systems, while aeroelastic systems under consideration are globally nonlinear. These make all the existing NOCOS algorithms inapplicable to AFS without modification and improvement. Therefore, this thesis solves the aforementioned problems through the following step-by-step approaches. Firstly, a four degrees-of-freedom (4-DOF) aeroelastic model is considered, where leading- and trailing-edge control surfaces of the aerofoil are used to actively suppress flutter. Accordingly, a virtual stiffness-damping system (VSDS) is developed to simulate physical stiffness in the aeroelastic system. 
The VSDS, together with a scaled-down typical aerofoil section placed in a wind tunnel, serves as an experimental 4-DOF aeroelastic test-bed for synthesis and validation of the proposed AFS controllers that follow. Secondly, a Modified form of NN-based Value Function Approximation (MVFA), tuned by gradient-descent learning, is proposed for NOCOS to address closed-loop stability in a compact controller configuration suitable for real-time implementation. Its validity and efficacy are examined by Lyapunov stability analysis and numerical studies. Thirdly, a systematic procedure based on linear matrix inequalities is further proposed for synthesising a scheduled parameter matrix to generalise the MVFA to globally nonlinear cases, so that the new NN controller suits AFS applications. In addition, the extended Kalman filter (EKF) is proposed for the new NN controller for fast parameter convergence. An identifier NN is also derived to capture and update aeroelastic dynamics in real time to mitigate the impact of modelling errors. Wind-tunnel experiments were conducted for validation. Finally, a non-quadratic functional is introduced to generalise the performance index to tackle the problem where control inputs are constrained. The feasibility of including the non-quadratic cost function under the proposed MVFA-based control scheme is examined via Lyapunov stability analysis and was also evaluated experimentally through wind-tunnel tests. The proposed NN controllers are compact in structure and are shown to be capable of maintaining closed-loop stability while eliminating the need for a separate initial stabilising control law for the overall system, an additional tuning loop for the actor NN, and an additional stabilising term in the critic NN tuning law. Under the new control schemes, online synthesised nonlinear control laws are optimal in the cases with and without constraints in control.
Experimental comparisons with a popular linear-parameter-varying (LPV) controller in the form of the widely used linear quadratic regulator (LQR) show that the proposed NN controllers outperform the LPV-LQR algorithm and improve AFS from the optimal control perspective. Specifically, the proposed NN controllers can effectively mitigate the impact of modelling errors, successfully resolving the aforementioned dilemma in AFS. The results also confirm that the proposed NN controllers are suitable for real-time implementation. Thesis (Ph.D.) -- University of Adelaide, School of Mechanical Engineering, 201