Combining reinforcement learning and optimal control for the control of nonlinear dynamical systems
This thesis presents a novel hierarchical learning framework, Reinforcement Learning Optimal Control,
for controlling nonlinear dynamical systems with continuous states and actions. The approach
mimics the neural computations that allow the brain to bridge the divide between symbolic
action selection and low-level actuation control by operating at two levels of abstraction. First, current
findings demonstrate that at the level of limb coordination human behaviour is explained by linear
optimal feedback control theory, where cost functions match energy and timing constraints of tasks.
Second, human learning of cognitive tasks involving symbolic action selection is well described by
both model-free and model-based reinforcement learning algorithms. We postulate that the ease with
which humans learn complex nonlinear tasks arises from combining these two levels of abstraction.
The Reinforcement Learning Optimal Control framework learns the local task dynamics from naive
experience using an expectation maximization algorithm for estimation of linear dynamical systems
and forms locally optimal Linear Quadratic Regulators, producing continuous low-level control. A
high-level reinforcement learning agent uses these available controllers as actions and learns how to
combine them in state space while maximizing a long-term reward. The optimal control costs form
training signals for the high-level symbolic learner. The algorithm demonstrates that a small number of
locally optimal linear controllers can be combined to solve global nonlinear control
problems, and forms a proof of principle for how the brain may bridge the divide between low-level
continuous control and high-level symbolic action selection. It competes in terms of computational
cost and solution quality with state-of-the-art control methods, as illustrated with solutions to
benchmark problems.
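The two levels described above can be sketched in miniature: a locally optimal LQR gain computed from a linear model (standing in for one EM-estimated local model), and a tabular Q-learning agent that treats a handful of such controllers as its discrete actions, trained on negated control costs. This is an illustrative sketch, not the thesis implementation; the double-integrator dynamics, the toy five-region task, the stand-in cost table, and all hyperparameters are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Low level: a locally optimal LQR controller from linear dynamics ---
def lqr_gain(A, B, Q, R, iters=500):
    """Infinite-horizon discrete-time LQR gain via Riccati iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K  # control law: u = -K x

# Double-integrator dynamics standing in for one locally linearized model.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
K = lqr_gain(A, B, Q=np.eye(2), R=np.array([[0.1]]))

# Closed loop x_{t+1} = (A - B K) x_t is stable: spectral radius < 1.
assert np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0

# --- High level: tabular Q-learning over a handful of such controllers ---
# Toy abstraction: 5 regions of state space traversed in sequence, with 2
# candidate controllers (the RL actions). The reward is the negative
# control cost of running the chosen controller in that region; here a
# fixed random table stands in for the accumulated quadratic cost.
n_states, n_actions = 5, 2
cost = rng.uniform(0.1, 1.0, size=(n_states, n_actions))  # stand-in costs
Qtab = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1

for episode in range(2000):
    s = 0
    while s < n_states - 1:
        # epsilon-greedy choice among the available low-level controllers
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Qtab[s]))
        r = -cost[s, a]          # optimal-control cost as the training signal
        s_next = s + 1           # controllers hand off along the task
        target = r + gamma * np.max(Qtab[s_next])
        Qtab[s, a] += alpha * (target - Qtab[s, a])
        s = s_next

# The greedy policy selects the cheaper controller in each region.
policy = np.argmax(Qtab, axis=1)
```

The separation of concerns mirrors the abstract: the Riccati iteration does all continuous control, while the Q-learner only ever reasons over a small discrete set of controller choices.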