223 research outputs found

Data-driven Economic NMPC using Reinforcement Learning

Authors: Gros Sébastien, Zanon Mario
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/04/2019

Reinforcement Learning (RL) is a powerful tool to perform data-driven optimal control without relying on a model of the system. However, RL struggles to provide hard guarantees on the behavior of the resulting control scheme. In contrast, Nonlinear Model Predictive Control (NMPC) and Economic NMPC (ENMPC) are standard tools for the closed-loop optimal control of complex systems with constraints and limitations, and benefit from a rich theory to assess their closed-loop behavior. Unfortunately, the performance of (E)NMPC hinges on the quality of the model underlying the control scheme. In this paper, we show that an (E)NMPC scheme can be tuned to deliver the optimal policy of the real system even when using a wrong model. This result also holds for real systems having stochastic dynamics. This entails that ENMPC can be used as a new type of function approximator within RL. Furthermore, we investigate our results in the context of ENMPC and formally connect them to the concept of dissipativity, which is central to ENMPC stability. Finally, we detail how these results can be used to deploy classic RL tools for tuning (E)NMPC schemes. We apply these tools to both a classical linear MPC setting and a standard nonlinear example from the ENMPC literature.

arXiv.org e-Print Archive

Archivio della ricerca della Scuola IMT Alti Studi Lucca

Reinforcement Learning Based on Real-Time Iteration NMPC

Authors: Gros Sébastien, Kungurtsev Vyacheslav, Zanon Mario
Publication date: 01/01/2020

Reinforcement Learning (RL) has shown a remarkable ability to learn optimal policies from data without any prior knowledge of the process. The main drawback of RL is that it is typically very difficult to guarantee stability and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an advanced model-based control technique which does guarantee safety and stability, but only yields optimality for the nominal model. Therefore, it has recently been proposed to use NMPC as a function approximator within RL. While the ability of this approach to yield good performance has been demonstrated, the main drawback hindering its applicability is the computational burden of NMPC, which has to be solved to full convergence. In practice, however, computationally efficient algorithms such as the Real-Time Iteration (RTI) scheme are deployed in order to return an approximate NMPC solution in a very short time. In this paper we bridge this gap by extending the existing theoretical framework to also cover RL based on RTI NMPC. We demonstrate the effectiveness of this new RL approach with a nontrivial example modeling a challenging nonlinear system subject to stochastic perturbations, with the objective of optimizing an economic cost. Comment: accepted for the IFAC World Congress 202

arXiv.org e-Print Archive

Archivio della ricerca della Scuola IMT Alti Studi Lucca

Experimental Validation of Safe MPC for Autonomous Driving in Uncertain Environments

Author: Zanon Mario
Publication date: 01/01/2023

Archivio della ricerca della Scuola IMT Alti Studi Lucca

3D-reconstruction of connexin32 channels distribution in the myelin of Schwann cells by advanced optical microscopy

Author: Zanon Mario
Publication date: 08/04/2022

Connexin 32 (Cx32) is a 32 kDa protein of the connexin family that is expressed in the peripheral nervous system, where it localizes in the myelin sheath of Schwann cells. Mutations of Cx32 are the leading cause of the X-linked form of Charcot–Marie–Tooth disease (CMT1X), a peripheral neuropathy for which there is no cure. Alterations in the distribution and function of Cx32 channels are presumed to trigger the neuropathy, but the pathological mechanism is still unknown. In this thesis work we combined two-photon fluorescence microscopy with third harmonic generation of the myelin sheath of Schwann cells to analyze the distribution of Cx32 and render it in 3D with software we developed in MATLAB. STED microscopy tests were carried out with a view to obtaining high-resolution 3D images of the Cx32 distribution in nerve samples from CMT1X patients.

Padua Thesis and Dissertation Archive

A Gauss-Newton-Like Hessian Approximation for Economic NMPC

Author: Zanon Mario
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/10/2020

Economic Model Predictive Control (EMPC) has recently become popular because of its ability to control constrained nonlinear systems while explicitly optimizing a prescribed performance criterion. Large performance gains have been reported for many applications and closed-loop stability has recently been investigated. However, computational performance still remains an open issue and only a few contributions have proposed real-time algorithms tailored to EMPC. We take a step towards computationally cheap algorithms for EMPC by proposing a new positive-definite Hessian approximation which does not hinder fast convergence and is suitable for use within the real-time iteration (RTI) scheme. We provide two simulation examples to demonstrate the effectiveness of RTI-based EMPC relying on the proposed Hessian approximation.

arXiv.org e-Print Archive

Archivio della ricerca della Scuola IMT Alti Studi Lucca

Equivalence of Optimality Criteria for Markov Decision Process and Model Predictive Control

Author: Zanon Mario
Publication date: 01/01/9999

Archivio della ricerca della Scuola IMT Alti Studi Lucca

A Parallel Decomposition Scheme for Solving Long-Horizon Optimal Control Problems

Authors: Faulwasser Timm, Shin Sungho, Zanon Mario, Zavala Victor M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/09/2019

We present a temporal decomposition scheme for solving long-horizon optimal control problems. In the proposed scheme, the time domain is decomposed into a set of subdomains with partially overlapping regions. Subproblems associated with the subdomains are solved in parallel to obtain local primal-dual trajectories that are assembled to obtain the global trajectories. We provide a sufficient condition that guarantees convergence of the proposed scheme. This condition states that the effect of perturbations on the boundary conditions (i.e., initial state and terminal dual/adjoint variable) should decay asymptotically as one moves away from the boundaries. This condition also reveals that the scheme converges if the size of the overlap is sufficiently large and that the convergence rate improves with the size of the overlap. We prove that linear quadratic problems satisfy the asymptotic decay condition, and we discuss numerical strategies to determine if the condition holds in more general cases. We draw upon a non-convex optimal control problem to illustrate the performance of the proposed scheme.
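
The first step described in this abstract, splitting the time domain into subdomains with partially overlapping regions, can be pictured with a minimal Python sketch. It is an illustration only, not the authors' implementation: the function name `overlapping_subdomains` and its parameters `horizon`, `num_parts`, and `overlap` are hypothetical, and the sketch covers only the index partition, not the parallel subproblem solves or the convergence condition.

```python
from typing import List, Tuple


def overlapping_subdomains(
    horizon: int, num_parts: int, overlap: int
) -> List[Tuple[range, range]]:
    """Split time steps 0..horizon-1 into contiguous core blocks.

    Each core block is widened by `overlap` steps on both sides and
    clipped to the horizon. Returns (core, extended) pairs of ranges.
    All names here are illustrative assumptions.
    """
    if horizon <= 0 or overlap < 0:
        raise ValueError("horizon must be positive and overlap non-negative")
    if not 0 < num_parts <= horizon:
        raise ValueError("num_parts must be between 1 and horizon")

    # Distribute the steps as evenly as possible over the core blocks.
    base, extra = divmod(horizon, num_parts)
    parts = []
    start = 0
    for k in range(num_parts):
        end = start + base + (1 if k < extra else 0)
        core = range(start, end)
        # Widen the core block by the overlap, clipped to [0, horizon).
        extended = range(max(0, start - overlap), min(horizon, end + overlap))
        parts.append((core, extended))
        start = end
    return parts


if __name__ == "__main__":
    for core, extended in overlapping_subdomains(10, 3, 2):
        print(f"core {core.start}..{core.stop - 1}, "
              f"extended {extended.start}..{extended.stop - 1}")
```

In a scheme of this kind, each subproblem would be solved on its extended window and only the solution on the core block kept when assembling the global trajectory; a larger `overlap` corresponds to the larger overlap size the abstract associates with a better convergence rate.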

arXiv.org e-Print Archive

Crossref