
    Data-driven Economic NMPC using Reinforcement Learning

    Reinforcement Learning (RL) is a powerful tool for performing data-driven optimal control without relying on a model of the system. However, RL struggles to provide hard guarantees on the behavior of the resulting control scheme. In contrast, Nonlinear Model Predictive Control (NMPC) and Economic NMPC (ENMPC) are standard tools for the closed-loop optimal control of complex systems with constraints and limitations, and benefit from a rich theory for assessing their closed-loop behavior. Unfortunately, the performance of (E)NMPC hinges on the quality of the model underlying the control scheme. In this paper, we show that an (E)NMPC scheme can be tuned to deliver the optimal policy of the real system even when using a wrong model. This result also holds for real systems with stochastic dynamics. It follows that ENMPC can be used as a new type of function approximator within RL. Furthermore, we investigate our results in the context of ENMPC and formally connect them to the concept of dissipativity, which is central to ENMPC stability. Finally, we detail how these results can be used to deploy classic RL tools for tuning (E)NMPC schemes. We apply these tools to both a classical linear MPC setting and a standard nonlinear example from the ENMPC literature.
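
    The paper's core mechanism can be illustrated with a toy experiment. The sketch below is a loose illustration under assumed scalar dynamics, costs, and step sizes (none of it is the paper's code): semi-gradient Q-learning tunes the terminal-cost weight of a one-step parameterized "MPC" value function whose internal model is deliberately wrong, using transitions gathered from the true system.

        # Loose sketch; the dynamics, costs, and parameterization are assumptions.
        import numpy as np

        rng = np.random.default_rng(0)

        a_true, b_true = 0.9, 0.5        # true (unknown) scalar dynamics
        a_model, b_model = 0.7, 0.5      # deliberately wrong model inside the "MPC"
        gamma, alpha = 0.95, 1e-3        # discount factor, learning rate
        theta = np.array([1.0])          # tunable terminal-cost weight

        def stage_cost(s, a):
            return s**2 + 0.1 * a**2

        def q_value(s, a, th):
            # One-step MPC value: stage cost + discounted terminal cost,
            # evaluated under the wrong internal model.
            s_pred = a_model * s + b_model * a
            return stage_cost(s, a) + gamma * th[0] * s_pred**2

        def greedy_action(s, th):
            # Analytic minimizer of q_value over a (quadratic in a).
            return -(gamma * th[0] * a_model * b_model * s) / (0.1 + gamma * th[0] * b_model**2)

        s = 1.0
        for step in range(5000):
            a = greedy_action(s, theta) + 0.1 * rng.standard_normal()  # exploration
            s_next = a_true * s + b_true * a                           # true system
            # Temporal-difference error and semi-gradient update of theta.
            target = stage_cost(s, a) + gamma * q_value(s_next, greedy_action(s_next, theta), theta)
            td = target - q_value(s, a, theta)
            grad = gamma * (a_model * s + b_model * a)**2              # d q_value / d theta
            theta += alpha * td * grad
            theta[0] = max(theta[0], 1e-3)                             # keep terminal cost positive
            s = s_next if abs(s_next) < 10 else 1.0                    # reset if the state drifts

        print("tuned terminal weight:", theta[0])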

    On the Minimization of Maximum Transient Energy Growth.

    The problem of minimizing the maximum transient energy growth is considered. This problem is important in some fluid flow control problems and other classes of nonlinear systems. Conditions for the existence of static controllers that ensure strict dissipativity of the transient energy are established, and an explicit parametrization of all such controllers is provided. It is also shown that, by means of a Q-parametrization, the problem of minimizing the maximum transient energy growth can be posed as a convex optimization problem that can be solved via a Ritz approximation of the free parameter. By considering the transient energy growth at an appropriate sequence of discrete time points, the minimal maximum transient energy growth problem can be posed as a semidefinite program. The theoretical developments are demonstrated on a numerical example.
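
    To make the minimized quantity concrete, the short sketch below (the system matrix is an assumption for illustration) evaluates the maximum transient energy growth of a stable but non-normal LTI system by sampling the squared spectral norm of the matrix exponential at a sequence of discrete time points, in the spirit of the discretization used above.

        # Illustrative only; A is an assumed stable, non-normal system matrix.
        import numpy as np
        from scipy.linalg import expm

        A = np.array([[-0.1, 1.0],
                      [ 0.0, -0.2]])   # stable eigenvalues, but non-normal

        ts = np.linspace(0.0, 50.0, 501)
        # Energy growth at time t over all unit-energy initial states: ||exp(A t)||_2^2
        growth = [np.linalg.norm(expm(A * t), 2) ** 2 for t in ts]
        print("max transient energy growth on the grid:", max(growth))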

    A general dissipativity constraint for feedback system design, with emphasis on MPC

    A ‘General Dissipativity Constraint’ (GDC) is introduced to facilitate the design of stable feedback systems. A primary application is to MPC controllers when it is preferred to avoid the use of ‘stabilising ingredients’ such as terminal constraint sets or long prediction horizons. Some very general convergence results are proved under mild conditions. The use of quadratic functions, replacing GDC by a ‘Quadratic Dissipation Constraint’ (QDC), is introduced to allow implementation using linear matrix inequalities. The use of QDC is illustrated for several scenarios: state feedback for a linear time-invariant system, MPC of a linear system, MPC of an input-affine system, and MPC with persistent disturbances. The stability guaranteed by GDC is weaker than Lyapunov stability, being ‘Lagrange stability plus convergence’. Input-to-state stability is obtained if the control law is continuous in the state. An example involving an open-loop unstable helicopter illustrates the efficacy of the approach in practice.
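
    The LMI machinery that a quadratic dissipation constraint relies on can be sketched in its simplest form. The example below is a standard discrete-time state-feedback synthesis with assumed system data, shown as a stand-in for the QDC formulation rather than the paper's own: with Q = P^{-1} and Y = K Q, the decrease condition (A + B K)^T P (A + B K) - P < 0 becomes, via a Schur complement, the LMI [[Q, (A Q + B Y)^T], [A Q + B Y, Q]] > 0, which is linear in (Q, Y).

        # Standard stabilizing-state-feedback LMI; system data are assumptions.
        import numpy as np
        import cvxpy as cp

        A = np.array([[1.1, 0.5],
                      [0.0, 0.9]])
        B = np.array([[0.0],
                      [1.0]])
        n, m = B.shape[0], B.shape[1]

        Q = cp.Variable((n, n), symmetric=True)   # Q = P^{-1}
        Y = cp.Variable((m, n))                   # Y = K Q
        eps = 1e-6
        lmi = cp.bmat([[Q, (A @ Q + B @ Y).T],
                       [A @ Q + B @ Y, Q]])
        prob = cp.Problem(cp.Minimize(0),
                          [Q >> eps * np.eye(n), lmi >> eps * np.eye(2 * n)])
        prob.solve(solver=cp.SCS)

        K = Y.value @ np.linalg.inv(Q.value)      # recovered gain, u = K x
        print("closed-loop spectral radius:",
              max(abs(np.linalg.eigvals(A + B @ K))))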

    Dissipative Imitation Learning for Discrete Dynamic Output Feedback Control with Sparse Data Sets

    Imitation learning enables the synthesis of controllers for complex objectives and highly uncertain plant models. However, methods that provide stability guarantees for imitation-learned controllers often rely on large amounts of data and/or known plant models. In this paper, we explore an input-output (IO) stability approach to dissipative imitation learning, which achieves stability with sparse data sets and little knowledge of the plant model. A closed-loop stable dynamic output-feedback controller is learned using expert data, a coarse IO plant model, and a new constraint that enforces dissipativity on the learned controller. While the learning objective is nonconvex, iterative convex overbounding (ICO) and projected gradient descent (PGD) are explored as methods to learn the controller. This new imitation learning method is applied to two unknown plants and compared to a traditionally learned dynamic output-feedback controller and a neural network controller. With little knowledge of the plant model and a small data set, the dissipativity-constrained learned controller achieves closed-loop stability and successfully mimics the behavior of the expert controller, while the other methods often fail to maintain stability and achieve good performance.
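
    The projected-gradient idea can be sketched in a toy setting. The paper learns dynamic output-feedback controllers under a dissipativity constraint; the sketch below substitutes a static state-feedback gain and a crude spectral-radius check as a stand-in for the dissipativity projection, with all plant and expert data assumed.

        # Toy stand-in for dissipativity-constrained PGD; all data are assumptions.
        import numpy as np

        rng = np.random.default_rng(1)
        A = np.array([[0.95, 0.10],
                      [0.00, 0.90]])              # assumed stable plant
        B = np.array([[0.0],
                      [1.0]])

        K_expert = np.array([[-0.8, -0.6]])       # expert gain, used only to generate data
        X = rng.standard_normal((20, 2))          # sparse data set: 20 expert states
        U = X @ K_expert.T                        # expert actions u = K_expert x

        K = np.zeros((1, 2))
        lr = 0.05
        for it in range(2000):
            grad = (X @ K.T - U).T @ X / len(X)   # gradient of 0.5 * mean ||K x - u||^2
            K -= lr * grad
            # Crude "projection": shrink K until A + B K is Schur stable again.
            while max(abs(np.linalg.eigvals(A + B @ K))) >= 1.0:
                K *= 0.9

        print("learned gain:", K)
        print("closed loop stable:", max(abs(np.linalg.eigvals(A + B @ K))) < 1.0)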

    Relaxed dissipativity assumptions and a simplified algorithm for multiobjective MPC

    We consider nonlinear model predictive control (MPC) with multiple competing cost functions. In each step of the scheme, a multiobjective optimal control problem with a nonlinear system and terminal conditions is solved. We propose an algorithm and give performance guarantees for the resulting MPC closed-loop system. In doing so, we significantly simplify the assumptions made in the literature so far by assuming strict dissipativity and the existence of a compatible terminal cost for only one of the competing objective functions. We give conditions ensuring asymptotic stability of the closed loop and, moreover, obtain performance estimates for all cost criteria. Numerical simulations on various instances illustrate our findings. The proposed algorithm requires the selection of an efficient solution in each iteration, so we examine several selection rules and their impact on the results.
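
    One step of such a scheme can be sketched on a toy problem (assumed scalar dynamics, two quadratic objectives, and a simple min-max selection rule; this is not the paper's algorithm): several weighted-sum scalarizations approximate the efficient set, a selection rule picks one efficient solution, and only its first input is applied.

        # Toy multiobjective MPC step; problem data and selection rule are assumptions.
        import numpy as np
        from scipy.optimize import minimize

        a, b, N = 0.95, 0.5, 10            # scalar dynamics x+ = a x + b u, horizon N

        def costs(x0, u):
            xs = [x0]
            for uk in u:                   # roll the model out over the horizon
                xs.append(a * xs[-1] + b * uk)
            xs = np.array(xs)
            return np.sum(xs**2), np.sum(u**2)   # (regulation cost, control effort)

        def mpc_step(x0, weights=(0.1, 0.5, 0.9)):
            candidates = []
            for w in weights:              # one scalarized optimal control problem per weight
                res = minimize(lambda u: w * costs(x0, u)[0] + (1 - w) * costs(x0, u)[1],
                               np.zeros(N))
                candidates.append((costs(x0, res.x), res.x))
            # Selection rule (one of many possible): minimize the worse of the two costs.
            _, u_star = min(candidates, key=lambda c: max(c[0]))
            return u_star[0]               # apply only the first control move

        x = 2.0
        for k in range(5):
            u0 = mpc_step(x)
            x = a * x + b * u0
            print(f"k={k}  u={u0:+.3f}  x={x:+.3f}")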