1,602 research outputs found

    Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning

    The conventional closed‐form solution to the optimal control problem using optimal control theory is only available under the assumption that the system dynamics/models are known and described as differential equations. Without such models, reinforcement learning (RL) has been successfully applied as a candidate technique to iteratively solve the optimal control problem for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature assume either the use of a predetermined feedforward input for the tracking control, restrictive assumptions on the reference model dynamics, or discounted tracking costs. Furthermore, when discounted tracking costs are used, the existing RL methods cannot guarantee zero steady‐state error. This article therefore presents an optimal online RL tracking control framework for discrete‐time (DT) systems, which does not impose any of the restrictive assumptions of the existing methods and also guarantees zero steady‐state tracking error. This is achieved by augmenting the original system dynamics with the integral of the error between the reference inputs and the tracked outputs for use in the online RL framework. It is further shown that the resulting value function for the DT linear quadratic tracker using the augmented formulation with integral control is also quadratic. This enables the development of Bellman equations, which use only the system measurements to solve the corresponding DT algebraic Riccati equation and obtain the optimal tracking control inputs online. Two RL strategies are thereafter proposed, based on value function approximation and on Q‐learning, along with bounds on excitation for the convergence of the parameter estimates. Simulation case studies show the effectiveness of the proposed approach.
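    The integral augmentation described in this abstract can be illustrated with a small sketch. It is not the article's online RL method: it assumes a known scalar plant and solves the Riccati recursion directly from the model, only to show why appending the accumulated tracking error to the state yields zero steady-state error for a constant reference. The plant values `a`, `b`, `c`, the weights `Q` and `R`, and all function names are hypothetical choices made for illustration.

```python
# Sketch only: model-based LQ design on an integral-augmented scalar plant.
# The plant numbers (a, b, c) and the weights (Q, R) are illustrative assumptions,
# and the model is assumed known, unlike the model-free RL setting of the article.

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def lq_gain(A, B, Q, R, iters=500):
    """Fixed-point iteration of the DT Riccati recursion (single input, scalar R)."""
    n = len(A)
    At, Bt = transpose(A), transpose(B)
    P = [row[:] for row in Q]
    K = [[0.0] * n]
    for _ in range(iters):
        PA, PB = matmul(P, A), matmul(P, B)
        s = R + matmul(Bt, PB)[0][0]                # scalar R + B'PB
        K = [[v / s for v in matmul(Bt, PA)[0]]]    # 1 x n feedback gain
        AtPA, AtPB = matmul(At, PA), matmul(At, PB)
        corr = matmul(AtPB, K)                      # A'PB (R + B'PB)^-1 B'PA
        P = [[Q[i][j] + AtPA[i][j] - corr[i][j] for j in range(n)]
             for i in range(n)]
    return K, P

# Plant: x[k+1] = a*x[k] + b*u[k], y[k] = c*x[k]
a, b, c = 0.9, 0.5, 1.0
# Augmented state [x, z], where z accumulates the error: z[k+1] = z[k] + (r - y[k])
A_aug = [[a, 0.0], [-c, 1.0]]
B_aug = [[b], [0.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = 1.0
K, P = lq_gain(A_aug, B_aug, Q, R)

# Closed-loop simulation with a constant reference
reference = 1.0
x, z = 0.0, 0.0
for _ in range(200):
    u = -(K[0][0] * x + K[0][1] * z)   # state feedback on the augmented state
    x, z = a * x + b * u, z + (reference - c * x)
final_output = c * x
print(final_output)
```

    Because the accumulated error `z` can only settle when the output equals the reference, any stabilizing gain on the augmented state removes the steady-state error without a discount factor or a separate feedforward term.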

    Recent advances on recursive filtering and sliding mode design for networked nonlinear stochastic systems: A survey

    Copyright © 2013 Jun Hu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Some recent advances on the recursive filtering and sliding mode design problems for nonlinear stochastic systems with network-induced phenomena are surveyed. The network-induced phenomena under consideration mainly include missing measurements, fading measurements, signal quantization, probabilistic sensor delays, sensor saturations, randomly occurring nonlinearities, and randomly occurring uncertainties. With respect to these network-induced phenomena, the developments on filtering and sliding mode design problems are systematically reviewed. In particular, some recent results are given on recursive filtering for time-varying nonlinear stochastic systems and on sliding mode design for time-invariant nonlinear stochastic systems. Finally, conclusions are drawn and some potential future research directions are pointed out. This work was supported in part by the National Natural Science Foundation of China under Grant nos. 61134009, 61329301, 61333012, 61374127 and 11301118, the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant no. GR/S27658/01, the Royal Society of the UK, and the Alexander von Humboldt Foundation of Germany.

    Automation and Control Architecture for Hybrid Pipeline Robots

    The aim of this research project, towards the automation of the Hybrid Pipeline Robot (HPR), is the development of a control architecture and strategy, based on reconfiguration of the control strategy for speed-controlled pipeline operations and self-recovering action, while performing energy and time management. The HPR is a turbine-powered pipeline device in which the flow energy is converted to mechanical energy for traction of the crawler vehicle. Thus, the device is flow dependent, which compromises its autonomy and the range of tasks it can perform. The control strategy proposes pipeline operations supervised by a speed control, while optimizing the energy, solved as a multi-objective optimization problem. The robot's cruising and self-recovering states are controlled by solving a neuro-dynamic programming algorithm for energy and time optimization. The robust operation of the robot includes a self-recovering state, entered either after completion of the mission or as a result of failures that could lead to the loss of the robot inside the pipeline, which guarantees the HPR's autonomy and operation even under adverse pipeline conditions. Two of the proposed models, system identification and tracking system, based on Artificial Neural Networks, have been simulated with trial data. Despite the satisfactory results, it is necessary to measure a full set of the robot's parameters in order to simulate the complete control strategy. To solve this problem, an instrumentation system, consisting of a set of probes and a signal conditioning board, was designed and developed, customized for the HPR's mechanical and environmental constraints.
As a result, the contribution of this research project to the Hybrid Pipeline Robot is to add energy management, which improves the vehicle's autonomy and increases the distance the device can travel inside pipelines; speed control, which broadens the range of operations; and self-recovery, which improves the reliability of the device in pipeline operations and lowers the risk of losing the robot inside the pipeline and thereby degrading pipeline performance. Together, these capabilities mean the pipeline robot can target market sectors that were previously prohibitive.

    Optimal Control of Unknown Nonlinear System From Input-Output Data

    Optimal control designers usually require a plant model to design a controller. The problem is that the controller's performance depends heavily on the accuracy of the plant model. However, in many situations, the system identification procedure is very time-consuming to implement, and an accurate structure of a plant model is very difficult to obtain. On the other hand, neuro-fuzzy models with product inference engine, singleton fuzzifier, center average defuzzifier, and Gaussian membership functions can be easily trained by many well-established learning algorithms based on given input-output data pairs. Therefore, this kind of model is used in the current optimal controller design. Two approaches to designing optimal controllers for unknown nonlinear systems based on neuro-fuzzy models are presented in the thesis. The first approach utilizes neuro-fuzzy models to approximate the unknown nonlinear systems, and then uses the feasible-direction algorithm to obtain the numerical solution of the Euler-Lagrange equations of the formulated optimal control problem. This algorithm uses steepest descent to find the search direction and then applies a one-dimensional search routine to find the best step length. Finally, several nonlinear optimal control problems are simulated, and the results show that the performance of the proposed approach is quite similar to that of optimal control applied to the system represented by an explicit mathematical model. However, due to the limitations of the feasible-direction algorithm, this method cannot be applied to highly nonlinear and high-dimensional plants. Therefore, another approach that can overcome these drawbacks is proposed. This method utilizes Takagi-Sugeno (TS) fuzzy models to design the optimal controller. TS fuzzy models are first derived from the direct linearization of the neuro-fuzzy models, which is close to the local linearization of the nonlinear dynamic systems. 
The operating points are chosen so that the TS fuzzy model is a good approximation of the neuro-fuzzy model. Based on the TS fuzzy model, the optimal control is implemented for a nonlinear two-link flexible robot and a rigid asymmetric spacecraft, thus providing the possibility of implementing the well-established optimal control method on unknown nonlinear dynamic systems.
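    The search step mentioned in this abstract (steepest descent for the direction, then a one-dimensional search for the step length) can be sketched on a toy problem. This is not the thesis's feasible-direction algorithm and does not solve Euler-Lagrange equations: it minimizes an assumed two-variable quadratic, and the golden-section routine is one possible choice of one-dimensional search, since the abstract does not name the routine used. All function names and the step interval [0, 1] are illustrative assumptions.

```python
import math

def golden_section(phi, lo=0.0, hi=1.0, tol=1e-10):
    """One-dimensional search: minimize a unimodal phi on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if phi(c) < phi(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
    return (a + b) / 2.0

def steepest_descent(f, grad, x0, max_iter=500, gtol=1e-8):
    """Move along the negative gradient, choosing each step by line search."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < gtol:
            break
        d = [-gi for gi in g]                       # steepest-descent direction
        step = golden_section(
            lambda t: f([xi + t * di for xi, di in zip(x, d)]))
        x = [xi + step * di for xi, di in zip(x, d)]
    return x

# Assumed toy objective with minimum at (1, -2); not taken from the thesis.
def f(p):
    return (p[0] - 1.0) ** 2 + 10.0 * (p[1] + 2.0) ** 2

def grad_f(p):
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]

solution = steepest_descent(f, grad_f, [0.0, 0.0])
print(solution)
```

    On this quadratic the best step along the negative gradient always lies between 1/20 and 1/2, so the fixed search interval is safe here; a general problem would need a bracketing step before the one-dimensional search.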