
    Optimal control of nonlinear partially-unknown systems with unsymmetrical input constraints and its applications to the optimal UAV circumnavigation problem

    To solve the optimal control problem for nonlinear systems with unsymmetrical input constraints, we present an online adaptive approach for partially unknown control systems. The designed algorithm converges online to the optimal control solution without knowledge of the internal system dynamics. The optimality of the obtained control policy and the stability of the closed-loop system are proved theoretically. The proposed method greatly relaxes the assumptions on the form of the internal dynamics and input constraints made in previous works. In addition, the proposed control design framework offers a new approach to the optimal circumnavigation problem involving a moving target for a fixed-wing unmanned aerial vehicle (UAV). The control performance of our method is compared with that of an existing circumnavigation control law in a numerical simulation, and the results validate the effectiveness of our algorithm.
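    As a hedged sketch of one standard way to embed such asymmetric input bounds into a value function (not necessarily this paper's exact construction): an unconstrained signal v is squashed through a shifted saturation, and the quadratic control penalty is replaced by a non-quadratic integrand,

```latex
% Illustrative construction for u in (u_min, u_max); the paper's exact
% generalization to unsymmetrical constraints may differ.
u = g(v) = \frac{u_{\max}-u_{\min}}{2}\,\tanh(v)
           + \frac{u_{\max}+u_{\min}}{2},
\qquad
W(u) = 2\int_{\bar{u}}^{u} g^{-1}(s)\,R\,\mathrm{d}s,
\qquad
\bar{u} = \frac{u_{\max}+u_{\min}}{2}.
```

    Because g^{-1}(s) diverges as u approaches either bound, the gradient of W grows without limit there, so the minimizing control stays strictly inside (u_min, u_max); with u_min = -u_max this reduces to the familiar symmetric tanh penalty.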

    Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning

    A conventional closed-form solution to the optimal control problem via optimal control theory is available only when the system dynamics are known and described by differential equations. Without such models, reinforcement learning (RL) has been successfully applied to iteratively solve the optimal control problem for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature rely on either a predetermined feedforward input for the tracking control, restrictive assumptions on the reference model dynamics, or discounted tracking costs. Moreover, with discounted tracking costs, the existing RL methods cannot guarantee zero steady-state error. This article therefore presents an online optimal RL tracking control framework for discrete-time (DT) systems that imposes none of these restrictive assumptions and still guarantees zero steady-state tracking error. This is achieved by augmenting the original system dynamics with the integral of the error between the reference inputs and the tracked outputs for use in the online RL framework. It is further shown that the resulting value function for the DT linear quadratic tracker under the augmented formulation with integral control remains quadratic. This enables the development of Bellman equations that use only system measurements to solve the corresponding DT algebraic Riccati equation and obtain the optimal tracking control inputs online. Two RL strategies are then proposed, based on value function approximation and on Q-learning, along with bounds on excitation that ensure convergence of the parameter estimates. Simulation case studies show the effectiveness of the proposed approach.
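    To make the augmentation concrete, here is a minimal sketch, for an assumed toy plant, of the integral-augmented DT linear quadratic tracker. It computes the model-based Riccati solution that the article's RL schemes would instead learn online from measurements alone; all numerical values are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical discretized double-integrator plant (an assumption for
# illustration; not a system from the article).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.005],
              [0.1]])
C = np.array([[1.0, 0.0]])

n, m = A.shape[0], B.shape[1]
p = C.shape[0]

# Augment the state with the integral of the tracking error
# e_k = r_k - y_k, i.e. z_{k+1} = z_k + (r_k - C x_k); the reference
# r_k enters as an exogenous input and is omitted from the gain design.
A_aug = np.block([[A, np.zeros((n, p))],
                  [-C, np.eye(p)]])
B_aug = np.vstack([B, np.zeros((p, m))])

Q = np.diag([10.0, 1.0, 50.0])  # weights on state and integral error (assumed)
R = np.array([[1.0]])

# Model-based baseline: solve the DT algebraic Riccati equation. The
# article's VFA and Q-learning schemes converge to this same solution
# online, using measurements instead of knowledge of (A, B).
P = solve_discrete_are(A_aug, B_aug, Q, R)
K = np.linalg.solve(R + B_aug.T @ P @ B_aug, B_aug.T @ P @ A_aug)

# Control law: u_k = -K @ [x_k; z_k]; the integral state forces the
# steady-state tracking error to zero for constant references.
print("augmented LQT gain K =", K)
```

    The design choice the abstract highlights is visible here: because the integrator state z is part of the augmented dynamics, no discount factor is needed, and zero steady-state error follows from the integral action rather than from a predetermined feedforward term.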

    Continual Reinforcement Learning Formulation For Zero-Sum Game-Based Constrained Optimal Tracking

    This study provides a novel reinforcement-learning-based optimal tracking control of partially uncertain nonlinear discrete-time (DT) systems with state constraints using a zero-sum game (ZSG) formulation. To address optimal tracking, a novel augmented system consisting of the tracking error and its integral, along with an uncertain desired trajectory, is constructed. A barrier function (BF) with a tradeoff factor is incorporated into the cost function to keep the state trajectories within a compact set and to balance safety with optimality. Next, using the modified value functional, the ZSG formulation is introduced, wherein an actor-critic neural network (NN) framework is employed to approximate the value functional, the optimal control input, and the worst-case disturbance. The critic NN weights are tuned once at each sampling instant and then iteratively between sampling instants. Using control input errors, the actor NN weights are adjusted once per sampling instant. A concurrent learning term in the critic weight tuning law removes the need for the persistency of excitation (PE) condition. Further, a weight consolidation scheme is incorporated into the critic update law to attain lifelong learning by overcoming catastrophic forgetting. Finally, a numerical example supports the analytical claims.
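    The following is a minimal sketch of how a barrier term with a tradeoff factor can be folded into a ZSG stage cost. The log-barrier form, the box bounds, and all weights are illustrative assumptions, not the study's exact functional:

```python
import numpy as np

def barrier_cost(x, lo, hi, mu=0.1):
    """Log-barrier term that grows as any state component approaches
    its bounds (lo, hi); mu is the tradeoff factor balancing safety
    against optimality. Illustrative form only -- the study's exact
    BF may differ."""
    x, lo, hi = map(np.asarray, (x, lo, hi))
    mid = (lo + hi) / 2.0
    # Recentred so the barrier is zero at the middle of the box.
    b = np.log((mid - lo) * (hi - mid)) - np.log((x - lo) * (hi - x))
    return mu * float(np.sum(b))

def stage_cost(e, u, x, lo, hi, qe=1.0, ru=0.1, gamma2=5.0, w=None, mu=0.1):
    """ZSG stage cost: tracking-error and control-effort terms, minus
    an attenuation term for the worst-case disturbance w, plus the
    barrier penalty on the state."""
    e, u = np.asarray(e), np.asarray(u)
    cost = qe * float(e @ e) + ru * float(u @ u) + barrier_cost(x, lo, hi, mu)
    if w is not None:
        cost -= gamma2 * float(np.asarray(w) @ np.asarray(w))
    return cost

# The barrier is near zero in the interior and grows sharply at the edge.
print(barrier_cost([0.1], [-1.0], [1.0]))   # small
print(barrier_cost([0.99], [-1.0], [1.0]))  # much larger
```

    Raising mu tightens the safety margin at the expense of tracking performance, which is the safety-versus-optimality tradeoff the abstract describes; the disturbance term enters with a negative sign because the disturbance is the maximizing player in the zero-sum game.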

    Output-feedback online optimal control for a class of nonlinear systems

    In this paper, an output-feedback model-based reinforcement learning (MBRL) method for a class of second-order nonlinear systems is developed. The control technique uses exact model knowledge and integrates a dynamic state estimator within the MBRL framework to achieve output feedback. Simulation results demonstrate the efficacy of the developed method.
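    A minimal sketch of the structural idea, combining a state estimator with a state-feedback policy when only the output is measured, follows. A Luenberger-style observer and a fixed linear gain stand in for the paper's dynamic estimator and learned policy; the actor-critic update is omitted and all matrices are illustrative assumptions.

```python
import numpy as np

dt = 0.01
A = np.array([[0.0, 1.0], [-2.0, -0.5]])   # assumed second-order dynamics
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])                 # only position is measured
L = np.array([[4.0], [6.0]])               # observer gain (assumed)
K = np.array([[3.0, 2.0]])                 # policy gain; in MBRL this would
                                           # be updated online by the critic

x = np.array([[1.0], [0.0]])   # true state (unavailable to the controller)
xh = np.zeros((2, 1))          # state estimate

for _ in range(1000):
    y = C @ x                   # controller sees the output only
    u = -K @ xh                 # control computed from the estimate
    # Observer: copy of the model driven by the output injection term.
    xh = xh + dt * (A @ xh + B @ u + L @ (y - C @ xh))
    x = x + dt * (A @ x + B @ u)  # Euler step of the true plant

print("final estimation error:", float(np.linalg.norm(x - xh)))
```

    The estimation error converges because the observer reuses the exact model, which mirrors the paper's premise that exact model knowledge plus a dynamic estimator suffices to run state-feedback MBRL from output measurements.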