Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning

Authors: Bertsekas DP; Levine WS; Lewis FL; Sutton RS; Werbos PJ; Åström KJ
Publication venue: 'Wiley'
Publication date: 16/04/2020

A conventional closed‐form solution to the optimal control problem, obtained from optimal control theory, is available only when the system dynamics are known and described by differential equations. Without such models, reinforcement learning (RL) has been successfully applied to solve the optimal control problem iteratively for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature assume a predetermined feedforward input for the tracking control, restrictive assumptions on the reference model dynamics, or discounted tracking costs. Furthermore, because they use discounted tracking costs, the existing RL methods cannot guarantee zero steady‐state error. This article therefore presents an optimal online RL tracking control framework for discrete‐time (DT) systems that imposes none of the restrictive assumptions of the existing methods and also guarantees zero steady‐state tracking error. This is achieved by augmenting the original system dynamics with the integral of the error between the reference inputs and the tracked outputs for use in the online RL framework. It is further shown that the resulting value function for the DT linear quadratic tracker using the augmented formulation with integral control is also quadratic. This enables the development of Bellman equations that use only the system measurements to solve the corresponding DT algebraic Riccati equation and obtain the optimal tracking control inputs online. Two RL strategies are then proposed, one based on value function approximation and one on Q‐learning, along with bounds on excitation for the convergence of the parameter estimates. Simulation case studies show the effectiveness of the proposed approach.
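
The integral augmentation described in the abstract can be illustrated with a small model-based sketch. This is not the article's RL method, which works from measurements without a model. It only shows, under assumptions, why adding the integral of the tracking error as an extra state yields zero steady-state error for a constant reference. The plant matrices, the weights Q and R, the textbook augmentation z[k+1] = z[k] + (r - y[k]), and all function names are hypothetical choices for illustration.

```python
import numpy as np

def augment_with_integrator(A, B, C):
    """Stack the plant state x with z, the running sum of the error r - y.

    Assumed (textbook) form: z[k+1] = z[k] + r - C x[k]. The abstract does not
    spell out the article's exact augmentation.
    """
    n, m = B.shape
    p = C.shape[0]
    A_aug = np.block([[A, np.zeros((n, p))], [-C, np.eye(p)]])
    B_aug = np.vstack([B, np.zeros((p, m))])
    return A_aug, B_aug

def dare_by_iteration(A, B, Q, R, iters=5000, tol=1e-12):
    """Solve the DT algebraic Riccati equation by fixed-point iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

def final_tracking_error(A, B, C, K, r, steps=2000):
    """Run u = -K [x; z] from rest and return r - y at the last step."""
    x = np.zeros(A.shape[0])
    z = np.zeros(C.shape[0])
    for _ in range(steps):
        u = -K @ np.concatenate([x, z])
        e = r - C @ x
        x = A @ x + B @ u
        z = z + e
    return r - C @ x

# Hypothetical stable second-order plant with one input and one output.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])

A_aug, B_aug = augment_with_integrator(A, B, C)
Q = np.eye(3)            # assumed weight on the augmented state [x; z]
R = np.array([[1.0]])    # assumed weight on the control input
P, K = dare_by_iteration(A_aug, B_aug, Q, R)
final_error = final_tracking_error(A, B, C, K, r=np.array([1.0]))
print("feedback gain K:", K)
print("final tracking error:", final_error)
```

Because the cost here is undiscounted and the integrator state can only settle when r - y is zero, any stabilizing gain on the augmented system drives the constant-reference error to zero. The quadratic form x'Px with the P computed above is the model-based counterpart of the quadratic value function the abstract refers to.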

Crossref

White Rose Research Online

Model-based and model-free learning strategies for wet clutch control

Authors: De Keyser Robain; Depraetere Bruno; Dutta Abhishek; Ionescu Clara-Mihaela; Nowe Ann; Pinte Gregory; Swevers Jan; Van Vaerenbergh Kevin; Wyns Bart; Zhong Yu
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Ghent University Academic Bibliography

Optimal Network Control in Partially-Controllable Networks

Authors: Liang Qingkai; Modiano Eytan
Publication date: 06/01/2019

The effectiveness of many optimal network control algorithms (e.g., BackPressure) relies on the premise that all of the nodes are fully controllable. However, these algorithms may yield poor performance in a partially-controllable network where a subset of nodes are uncontrollable and use some unknown policy. Such a partially-controllable model is of increasing importance in real-world networked systems such as overlay-underlay networks. In this paper, we design optimal network control algorithms that can stabilize a partially-controllable network. We first study the scenario where uncontrollable nodes use a queue-agnostic policy, and propose a low-complexity throughput-optimal algorithm, called Tracking-MaxWeight (TMW), which enhances the original MaxWeight algorithm with explicit learning of the policy used by uncontrollable nodes. Next, we investigate the scenario where uncontrollable nodes use a queue-dependent policy and the problem is formulated as an MDP with unknown queueing dynamics. We propose a new reinforcement learning algorithm, called Truncated Upper Confidence Reinforcement Learning (TUCRL), and prove that TUCRL achieves tunable three-way tradeoffs between throughput, delay and convergence rate.
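
For readers unfamiliar with the baseline that TMW builds on, the following is a minimal sketch of the classical MaxWeight rule in a fully controllable single-hop setting. It is not the paper's TMW or TUCRL algorithm and contains no learning of an uncontrollable node's policy. The two-queue server, the Bernoulli arrival rates, and all function names are assumptions made for illustration.

```python
import numpy as np

def max_weight_schedule(queues, schedules):
    """Return the index of the schedule with the largest queue-weighted service."""
    weights = schedules @ queues  # one weight per candidate schedule
    return int(np.argmax(weights))

def average_backlog(arrival_rates, schedules, steps=20000, seed=0):
    """Simulate MaxWeight with Bernoulli arrivals and return the mean total backlog."""
    rng = np.random.default_rng(seed)
    q = np.zeros(len(arrival_rates))
    total = 0.0
    for _ in range(steps):
        s = schedules[max_weight_schedule(q, schedules)]
        q = np.maximum(q - s, 0.0)                          # serve the chosen queue(s)
        q += rng.random(len(arrival_rates)) < arrival_rates  # new arrivals
        total += q.sum()
    return total / steps

# Hypothetical setup: one server that can serve exactly one of two queues per slot.
schedules = np.array([[1.0, 0.0], [0.0, 1.0]])
arrival_rates = np.array([0.3, 0.4])  # total load 0.7 is inside the capacity region

print("schedule chosen for queues [5, 2]:",
      max_weight_schedule(np.array([5.0, 2.0]), schedules))
print("average total backlog:", average_backlog(arrival_rates, schedules))
```

The rule needs only the current queue lengths and the set of feasible schedules, which is why it relies on every node following the controller's decisions, the premise the abstract says fails in partially-controllable networks.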

arXiv.org e-Print Archive

DSpace@MIT

Crossref