Data-driven Economic NMPC using Reinforcement Learning
Reinforcement Learning (RL) is a powerful tool to perform data-driven optimal
control without relying on a model of the system. However, RL struggles to
provide hard guarantees on the behavior of the resulting control scheme. In
contrast, Nonlinear Model Predictive Control (NMPC) and Economic NMPC (ENMPC)
are standard tools for the closed-loop optimal control of complex systems with
constraints and limitations, and benefit from a rich theory to assess their
closed-loop behavior. Unfortunately, the performance of (E)NMPC hinges on the
quality of the model underlying the control scheme. In this paper, we show that
an (E)NMPC scheme can be tuned to deliver the optimal policy of the real system
even when using a wrong model. This result also holds for real systems with
stochastic dynamics. This entails that ENMPC can be used as a new type of
function approximator within RL. Furthermore, we investigate our results in the
context of ENMPC and formally connect them to the concept of dissipativity,
which is central to ENMPC stability. Finally, we detail how these results
can be used to deploy classic RL tools for tuning (E)NMPC schemes. We apply
these tools to both a classical linear MPC setting and a standard nonlinear
example from the ENMPC literature.
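The tuning idea can be illustrated with a deliberately simple sketch (ours, not the paper's): a scalar system is controlled by a one-parameter policy, and the parameter is adjusted using only the measured closed-loop cost, so the tuner never uses a model of the true dynamics.

```python
def closed_loop_cost(theta, a=0.9, x0=1.0, horizon=50):
    """Simulate the 'real system' x+ = a*x + u under the policy u = -theta*x
    and accumulate the stage cost x^2 + u^2. The tuner below treats this
    function as a black box (measured closed-loop performance)."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -theta * x
        cost += x ** 2 + u ** 2
        x = a * x + u
    return cost

# Data-driven tuning: finite-difference gradient descent on the measured
# closed-loop cost -- no knowledge of 'a' is used by the update rule.
theta, step, eps = 0.1, 0.02, 1e-4
for _ in range(200):
    g = (closed_loop_cost(theta + eps) - closed_loop_cost(theta - eps)) / (2 * eps)
    theta -= step * g
```

For this scalar linear-quadratic toy problem the tuned gain approaches the Riccati-optimal feedback gain (about 0.538), even though the tuner only ever observed closed-loop costs.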
On the Minimization of Maximum Transient Energy Growth
The problem of minimizing the maximum transient energy growth is considered.
This problem has importance in some fluid flow control problems and other
classes of nonlinear systems. Conditions for the existence of static controllers
that ensure strict dissipativity of the transient energy are established and an
explicit parametrization of all such controllers is provided. It is also shown
that by means of a Q-parametrization, the problem of minimizing the maximum
transient energy growth can be posed as a convex optimization problem that can
be solved by means of a Ritz approximation of the free parameter. By considering
the transient energy growth at an appropriate sequence of discrete time points,
the minimal maximum transient energy growth problem can be posed as a
semidefinite program. The theoretical developments are demonstrated on a
numerical example.
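The quantity being minimized can be evaluated directly for a given discrete-time system; the following sketch (our toy computation, not the paper's synthesis method) evaluates the maximum transient energy growth max_k sigma_max(A^k)^2 of a stable but non-normal system.

```python
import numpy as np

def max_transient_energy_growth(A, k_max=200):
    """Maximum over k of the worst-case energy ratio ||x_k||^2 / ||x_0||^2
    for x_{k+1} = A x_k, i.e. max_k sigma_max(A^k)^2."""
    growth, Ak = 1.0, np.eye(A.shape[0])  # k = 0 gives ratio 1
    for _ in range(k_max):
        Ak = A @ Ak
        growth = max(growth, np.linalg.norm(Ak, 2) ** 2)
    return growth

# A Schur-stable but non-normal matrix: both eigenvalues lie inside the
# unit circle, yet the energy transiently grows before decaying.
A = np.array([[0.9, 5.0],
              [0.0, 0.8]])
G = max_transient_energy_growth(A)
```

A normal stable matrix gives G = 1 (monotone energy decay); the non-normal example above gives G well above 1, which is exactly the transient growth the controllers in the paper are designed to limit.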
A general dissipativity constraint for feedback system design, with emphasis on MPC
A ‘General Dissipativity Constraint’ (GDC) is introduced to facilitate the design of stable feedback systems. A primary application is to MPC controllers when it is preferred to avoid the use of ‘stabilising ingredients’ such as terminal constraint sets or long prediction horizons. Some very general convergence results are proved under mild conditions. The use of quadratic functions, replacing GDC by a ‘Quadratic Dissipation Constraint’ (QDC), is introduced to allow implementation using linear matrix inequalities. The use of QDC is illustrated for several scenarios: state feedback for a linear time-invariant system, MPC of a linear system, MPC of an input-affine system, and MPC with persistent disturbances. The stability that is guaranteed by GDC is weaker than Lyapunov stability, being ‘Lagrange stability plus convergence’. Input-to-state stability is obtained if the control law is continuous in the state. An example involving an open-loop unstable helicopter illustrates the efficacy of the approach in practice.
Funding: National Research Foundation, Singapore
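A minimal numerical sketch of the quadratic-dissipation idea, assuming a fixed Schur-stable system rather than the paper's LMI-based synthesis: a quadratic storage function V(x) = x'Px is obtained from the discrete Lyapunov equation, and its guaranteed decrease is checked along a trajectory.

```python
import numpy as np

def quadratic_storage(A, Q, n_terms=500):
    """Solve the discrete Lyapunov equation A'PA - P = -Q by series summation
    P = sum_k (A')^k Q A^k (valid for Schur-stable A)."""
    n = A.shape[0]
    P, Ak = np.zeros((n, n)), np.eye(n)
    for _ in range(n_terms):
        P += Ak.T @ Q @ Ak
        Ak = A @ Ak
    return P

# A fixed stable closed-loop matrix (our toy example)
A = np.array([[0.5, 0.2],
              [-0.1, 0.6]])
Q = np.eye(2)
P = quadratic_storage(A, Q)

# Dissipation check along a trajectory: V(x+) - V(x) = -x'Qx <= 0
x = np.array([1.0, -2.0])
for _ in range(10):
    x_next = A @ x
    assert x_next @ P @ x_next - x @ P @ x <= -x @ Q @ x + 1e-9
    x = x_next
```

In the paper the analogous condition is imposed as a constraint during controller design (via LMIs) rather than verified after the fact as done here.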
Dissipative Imitation Learning for Discrete Dynamic Output Feedback Control with Sparse Data Sets
Imitation learning enables the synthesis of controllers for complex
objectives and highly uncertain plant models. However, methods to provide
stability guarantees to imitation learned controllers often rely on large
amounts of data and/or known plant models. In this paper, we explore an
input-output (IO) stability approach to dissipative imitation learning, which
achieves stability with sparse data sets and with little known about the plant
model. A closed-loop stable dynamic output feedback controller is learned using
expert data, a coarse IO plant model, and a new constraint to enforce
dissipativity on the learned controller. While the learning objective is
nonconvex, iterative convex overbounding (ICO) and projected gradient descent
(PGD) are explored as methods to successfully learn the controller. This new
imitation learning method is applied to two unknown plants and compared to a
traditionally learned dynamic output feedback controller and a neural network
controller. With little knowledge of the plant model and a small data set, the
dissipativity-constrained learned controller achieves closed-loop stability and
successfully mimics the behavior of the expert controller, while the other
methods often fail to maintain stability and achieve good performance.
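A hedged sketch of constrained imitation learning via projected gradient descent, with our own toy stand-ins: a static gain is fit to expert state-action data, and a spectral-norm bound (substituting for the dissipativity constraint in the paper, which is an LMI) is enforced by projection after each gradient step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Expert demonstrations: states X and expert actions U = K_exp @ X plus noise
K_exp = np.array([[0.4, -0.3]])
X = rng.standard_normal((2, 100))
U = K_exp @ X + 0.01 * rng.standard_normal((1, 100))

def project(K, bound=0.45):
    """Project onto the spectral-norm ball ||K||_2 <= bound -- a simple
    stand-in for the paper's dissipativity constraint on the controller."""
    s = np.linalg.norm(K, 2)
    return K if s <= bound else K * (bound / s)

# Projected gradient descent on the imitation loss 0.5 * mean ||K x - u||^2
K, step = np.zeros((1, 2)), 0.02
for _ in range(500):
    grad = (K @ X - U) @ X.T / X.shape[1]
    K = project(K - step * grad)
```

Since the expert gain has spectral norm 0.5, the bound of 0.45 is active: the learned gain lands on the constraint boundary, trading a little imitation accuracy for the certified property, which mirrors the trade-off the paper studies with ICO and PGD.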
Relaxed dissipativity assumptions and a simplified algorithm for multiobjective MPC
We consider nonlinear model predictive control (MPC) with multiple competing
cost functions. In each step of the scheme, a multiobjective optimal control
problem with a nonlinear system and terminal conditions is solved. We propose
an algorithm and give performance guarantees for the resulting MPC closed loop
system. Thereby, we significantly simplify the assumptions made in the
literature so far by assuming strict dissipativity and the existence of a
compatible terminal cost for one of the competing objective functions only. We
give conditions which ensure asymptotic stability of the closed loop and, what
is more, obtain performance estimates for all cost criteria. Numerical
simulations on various instances illustrate our findings. The proposed
algorithm requires the selection of an efficient solution in each iteration,
so we examine several selection rules and their impact on the results.
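The "selection of an efficient solution" step can be sketched as follows (a generic illustration, not the paper's algorithm): nondominated candidates are filtered out of a finite set, then one efficient point is picked by a weighted-sum selection rule.

```python
import numpy as np

def pareto_efficient(costs):
    """Boolean mask of nondominated rows (minimization in every column)."""
    n = costs.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(costs[j] <= costs[i]) and np.any(costs[j] < costs[i]):
                mask[i] = False
                break
    return mask

# Two competing cost criteria evaluated for five candidate solutions
costs = np.array([[1.0, 5.0],
                  [2.0, 2.0],
                  [4.0, 1.0],
                  [3.0, 3.0],   # dominated by [2, 2]
                  [5.0, 5.0]])  # dominated by [2, 2]
efficient = costs[pareto_efficient(costs)]

# Selection rule: minimize a weighted sum over the efficient set
w = np.array([0.5, 0.5])
choice = efficient[np.argmin(efficient @ w)]
```

Changing the weight vector (or replacing the weighted sum with, say, a lexicographic rule) picks a different efficient point, which is exactly why the choice of selection rule affects the closed-loop results studied in the paper.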