
    Data-driven Economic NMPC using Reinforcement Learning

    Reinforcement Learning (RL) is a powerful tool to perform data-driven optimal control without relying on a model of the system. However, RL struggles to provide hard guarantees on the behavior of the resulting control scheme. In contrast, Nonlinear Model Predictive Control (NMPC) and Economic NMPC (ENMPC) are standard tools for the closed-loop optimal control of complex systems with constraints and limitations, and benefit from a rich theory to assess their closed-loop behavior. Unfortunately, the performance of (E)NMPC hinges on the quality of the model underlying the control scheme. In this paper, we show that an (E)NMPC scheme can be tuned to deliver the optimal policy of the real system even when using a wrong model. This result also holds for real systems having stochastic dynamics. This entails that ENMPC can be used as a new type of function approximator within RL. Furthermore, we investigate our results in the context of ENMPC and formally connect them to the concept of dissipativity, which is central to ENMPC stability. Finally, we detail how these results can be used to deploy classic RL tools for tuning (E)NMPC schemes. We apply these tools to both a classical linear MPC setting and a standard nonlinear example from the ENMPC literature.
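
As a rough illustration of what deploying a classic RL tool to tune an (E)NMPC scheme can look like, the block below writes out a Q-learning-style update for an MPC scheme with tunable parameters. All symbols are assumptions introduced here and are not taken from the abstract: $Q_\theta(s,a)$ denotes the optimal cost of the parameterized MPC problem with the first input fixed to $a$, $V_\theta(s)$ the optimal cost with the first input free, $L$ the stage cost observed on the real system, $s^{+}$ the observed next state, $\gamma$ a discount factor, and $\alpha$ a step size.

```latex
\begin{aligned}
V_\theta(s) &= \min_{a}\, Q_\theta(s,a), \qquad
\pi_\theta(s) = \arg\min_{a}\, Q_\theta(s,a),\\
\delta &= L(s,a) + \gamma\, V_\theta(s^{+}) - Q_\theta(s,a),\\
\theta &\leftarrow \theta + \alpha\, \delta\, \nabla_\theta Q_\theta(s,a).
\end{aligned}
```

In this sketch the MPC scheme plays the role that a neural network or other parametric function would play in standard Q-learning. The gradient $\nabla_\theta Q_\theta$ would typically come from parametric sensitivities of the MPC problem at its solution.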

    Reinforcement Learning Based on Real-Time Iteration NMPC

    Reinforcement Learning (RL) has shown a remarkable ability to learn optimal policies from data without any prior knowledge of the process. The main drawback of RL is that it is typically very difficult to guarantee stability and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an advanced model-based control technique which does guarantee safety and stability, but only yields optimality for the nominal model. Therefore, it has been recently proposed to use NMPC as a function approximator within RL. While the ability of this approach to yield good performance has been demonstrated, the main drawback hindering its applicability is the computational burden of NMPC, which has to be solved to full convergence. In practice, however, computationally efficient algorithms such as the Real-Time Iteration (RTI) scheme are deployed in order to return an approximate NMPC solution in a very short time. In this paper we bridge this gap by extending the existing theoretical framework to also cover RL based on RTI NMPC. We demonstrate the effectiveness of this new RL approach with a nontrivial example modeling a challenging nonlinear system subject to stochastic perturbations with the objective of optimizing an economic cost. Comment: accepted for the IFAC World Congress 202

    A Parallel Decomposition Scheme for Solving Long-Horizon Optimal Control Problems

    We present a temporal decomposition scheme for solving long-horizon optimal control problems. In the proposed scheme, the time domain is decomposed into a set of subdomains with partially overlapping regions. Subproblems associated with the subdomains are solved in parallel to obtain local primal-dual trajectories that are assembled to obtain the global trajectories. We provide a sufficient condition that guarantees convergence of the proposed scheme. This condition states that the effect of perturbations on the boundary conditions (i.e., initial state and terminal dual/adjoint variable) should decay asymptotically as one moves away from the boundaries. This condition also reveals that the scheme converges if the size of the overlap is sufficiently large and that the convergence rate improves with the size of the overlap. We prove that linear quadratic problems satisfy the asymptotic decay condition, and we discuss numerical strategies to determine if the condition holds in more general cases. We draw upon a non-convex optimal control problem to illustrate the performance of the proposed scheme.
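
The overlapping-window idea can be sketched on a toy problem. The example below is not the scheme from the paper: it is a simplified, primal-only stand-in (a restricted additive Schwarz-style iteration) applied to a quadratic trajectory-smoothing problem chosen here for illustration. The problem, the function names (`build_system`, `overlapping_sweep`, `solve_decomposed`) and all parameter values are assumptions. Each time window is extended by `overlap` steps on both sides, solved with boundary values frozen at the previous iterate, and only the non-overlapping core of each local solution is kept.

```python
import numpy as np


def build_system(d, rho):
    """Optimality system A x = d for the toy problem (an assumption):
    min_x  sum_k (x_k - d_k)^2 + rho * sum_k (x_{k+1} - x_k)^2."""
    n = len(d)
    A = np.zeros((n, n))
    for k in range(n):
        A[k, k] = 1.0
        if k > 0:
            A[k, k] += rho
            A[k, k - 1] = -rho
        if k < n - 1:
            A[k, k] += rho
            A[k, k + 1] = -rho
    return A


def overlapping_sweep(A, d, x_old, n_windows, overlap):
    """One sweep: every window uses only x_old, so the local solves
    are independent and could run in parallel."""
    n = len(d)
    edges = np.linspace(0, n, n_windows + 1).astype(int)
    x_new = x_old.copy()
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Extend the core window [lo, hi) by the overlap on both sides.
        a, b = max(lo - overlap, 0), min(hi + overlap, n)
        idx = np.arange(a, b)
        outside = np.ones(n, dtype=bool)
        outside[idx] = False
        # Values outside the extended window are frozen at the previous iterate.
        rhs = d[idx] - A[idx][:, outside] @ x_old[outside]
        x_loc = np.linalg.solve(A[idx][:, idx], rhs)
        # Keep only the non-overlapping core of the local solution.
        x_new[lo:hi] = x_loc[lo - a:hi - a]
    return x_new


def solve_decomposed(d, rho=1.0, n_windows=4, overlap=5, iters=30):
    """Repeat the overlapping sweep from a zero initial guess."""
    A = build_system(d, rho)
    x = np.zeros(len(d))
    for _ in range(iters):
        x = overlapping_sweep(A, d, x, n_windows, overlap)
    return x
```

On this toy problem, the influence of a boundary value decays geometrically with distance from the boundary, which loosely mirrors the decay condition described in the abstract. Comparing the error after a fixed number of sweeps for different `overlap` values is a simple way to see the effect of the overlap size.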

    Periodic optimal control, dissipativity and MPC

    Recent research has established the importance of dissipativity for proving stability of economic MPC in the case of a steady state. In many cases, though, steady state operation is not economically optimal and periodic operation of the system yields a better performance. In this paper, we propose three different ways of extending the notion of dissipativity for periodic systems and illustrate them with three examples.

    Airborne Wind Energy Based on Dual Airfoils

    The airborne wind energy (AWE) paradigm proposes to generate energy by flying a tethered airfoil across the wind flow at a high velocity. Although AWE enables flight at higher altitudes and in stronger wind layers, the extra drag generated by the tether motion imposes a significant limit on the overall system efficiency. To address this issue, two airfoils with a shared tether can reduce overall system drag. Although this technique may improve the efficiency of AWE systems, such improvement can only be achieved by properly balancing the system trajectories and parameters. This brief tackles that problem using optimal control. A generic procedure for modeling multiple-airfoil systems with equations of minimal complexity is proposed. A parametric study shows that at small and medium scales, dual-airfoil systems are significantly more efficient than single-airfoil systems, but they are less advantageous at very large scales.

    Periodic Optimal Control, Dissipativity and MPC

    Recent research has established the importance of (strict) dissipativity for proving stability of economic MPC in the case of an optimal steady state. In many cases, though, steady-state operation is not economically optimal and periodic operation of the system yields a better performance. In this technical note, we propose ways of extending the notion of (strict) dissipativity for periodic systems. We prove that optimal P-periodic operation and MPC stability directly follow, similarly to the steady-state case, which can be seen as a special case of the proposed framework. Finally, we illustrate the theoretical results with several simple examples.
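
For orientation, the steady-state notion that the abstract builds on is commonly stated as follows. The notation is an assumption introduced here and not taken from the abstract: a discrete-time system $x^{+} = f(x,u)$ with stage cost $\ell$, optimal steady state $(x_s, u_s)$, storage function $\lambda$, and a positive definite function $\alpha$. The periodic extensions proposed in the note are not reproduced.

```latex
\lambda\bigl(f(x,u)\bigr) - \lambda(x)
\;\le\; \ell(x,u) - \ell(x_s,u_s) - \alpha\bigl(\lVert x - x_s \rVert\bigr)
\quad \text{for all admissible } (x,u).
```

Dropping the $\alpha$ term gives plain (non-strict) dissipativity with respect to the same supply rate $\ell(x,u) - \ell(x_s,u_s)$.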