129 research outputs found

    Learning Stable Koopman Models for Identification and Control of Dynamical Systems

    Learning models of dynamical systems from data is a widely studied problem in control theory and machine learning. One recent approach for modelling nonlinear systems considers the class of Koopman models, which embed the nonlinear dynamics in a higher-dimensional linear subspace. Learning a Koopman embedding would allow for the analysis and control of nonlinear systems using tools from linear systems theory. Many recent methods have been proposed for data-driven learning of such Koopman embeddings, but most of these methods do not consider the stability of the Koopman model. Stability is an important and desirable property for models of dynamical systems. Unstable models tend to be non-robust to input perturbations and can produce unbounded outputs, which are both undesirable when the model is used for prediction and control. In addition, recent work has shown that stability guarantees may act as a regularizer for model fitting. As such, a natural direction would be to construct Koopman models with inherent stability guarantees. Two new classes of Koopman models are proposed that bridge the gap between Koopman-based methods and learning stable nonlinear models. The first model class is guaranteed to be stable, while the second is guaranteed to be stabilizable with an explicit stabilizing controller that renders the model stable in closed-loop. Furthermore, these models are unconstrained in their parameter sets, thereby enabling efficient optimization via gradient-based methods. Theoretical connections between the stability of Koopman models and forms of nonlinear stability such as contraction are established. To demonstrate the effect of the stability guarantees, the stable Koopman model is applied to a system identification problem, while the stabilizable model is applied to an imitation learning problem. Experimental results show empirically that the proposed models achieve better performance than prior methods without stability guarantees.
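
    As a rough illustration of what fitting a Koopman model and checking its stability can look like, the sketch below lifts a toy nonlinear system into three observables, fits a linear map by ordinary least squares, and inspects the spectral radius. This is a generic least-squares fit, not the stable or stabilizable parameterization proposed in the abstract above; the toy system, the observables in `lift`, and the constants `LAM` and `MU` are all assumptions chosen for illustration.

```python
import numpy as np

LAM, MU = 0.9, 0.5  # assumed toy parameters, both inside the unit circle

def step(x):
    """Toy nonlinear map that admits an exact three-dimensional linear lifting."""
    return np.array([LAM * x[0], MU * x[1] + (LAM**2 - MU) * x[0] ** 2])

def lift(x):
    """Hypothetical observables: the state plus the monomial x1^2."""
    return np.array([x[0], x[1], x[0] ** 2])

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))      # sampled states
Z = np.array([lift(x) for x in X])             # lifted states z_t
Z_next = np.array([lift(step(x)) for x in X])  # lifted successors z_{t+1}

# Least-squares fit of z_{t+1} ~= A z_t (rows are samples, so solve Z A^T = Z_next).
A_T, *_ = np.linalg.lstsq(Z, Z_next, rcond=None)
A = A_T.T

# Discrete-time linear stability check: spectral radius strictly below one.
rho = max(abs(np.linalg.eigvals(A)))
print("spectral radius:", rho, "stable:", rho < 1.0)
```

    A plain least-squares fit like this one gives no stability guarantee on noisy data, so the eigenvalue check has to be done after the fact. That is the gap a model class that is stable by construction is meant to close.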

    On the equivalence of contraction and Koopman approaches for nonlinear stability and control

    In this paper we prove new connections between two frameworks for analysis and control of nonlinear systems: the Koopman operator framework and contraction analysis. Each method, in different ways, provides exact and global analyses of nonlinear systems by way of linear systems theory. The main results of this paper show equivalence between contraction and Koopman approaches for a wide class of stability analysis and control design problems. In particular, stability or stabilizability in the Koopman framework implies the existence of a contraction metric (resp. control contraction metric) for the nonlinear system. Further, in certain cases the converse holds: contraction implies the existence of a set of observables with which stability can be verified via the Koopman framework. We provide results for the cases of autonomous and time-varying systems, as well as orbital stability of limit cycles. Furthermore, the converse claims are based on a novel relation between the Koopman method and construction of a Kazantzis-Kravaris-Luenberger observer. We also provide a byproduct of the main results, that is, a new method to learn contraction metrics from trajectory data via linear system identification.

    Neural Stochastic Contraction Metrics for Learning-based Control and Estimation

    We present Neural Stochastic Contraction Metrics (NSCM), a new design framework for provably-stable learning-based control and estimation for a class of stochastic nonlinear systems. It uses a spectrally-normalized deep neural network to construct a contraction metric and its differential Lyapunov function, sampled via simplified convex optimization in the stochastic setting. Spectral normalization constrains the state-derivatives of the metric to be Lipschitz continuous, thereby ensuring exponential boundedness of the mean squared distance of system trajectories under stochastic disturbances. The trained NSCM model allows autonomous systems to approximate optimal stable control and estimation policies in real-time, and outperforms existing nonlinear control and estimation techniques including the state-dependent Riccati equation, iterative LQR, EKF, and the deterministic NCM, as shown in simulation results.
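
    The Lipschitz effect of spectral normalization mentioned above can be seen on a single linear map. The sketch below is a minimal illustration, not the NSCM training procedure: it rescales one weight matrix so that its largest singular value equals a chosen bound `c`. The function name `spectral_normalize` and the random matrix are assumptions, and the norm is computed exactly here, whereas deep-learning implementations commonly estimate it by power iteration.

```python
import numpy as np

def spectral_normalize(W, c=1.0):
    """Rescale W so that its largest singular value equals c."""
    sigma = np.linalg.norm(W, 2)  # spectral norm = largest singular value
    return W * (c / sigma)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))           # stand-in for one layer's weights
W_sn = spectral_normalize(W, c=1.0)

# With spectral norm 1, the linear map x -> W_sn @ x is 1-Lipschitz.
x, y = rng.normal(size=3), rng.normal(size=3)
lhs = np.linalg.norm(W_sn @ x - W_sn @ y)
rhs = np.linalg.norm(x - y)
print(lhs <= rhs)
```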

    Learning over All Stabilizing Nonlinear Controllers for a Partially-Observed Linear System

    This paper proposes a nonlinear policy architecture for control of partially-observed linear dynamical systems providing built-in closed-loop stability guarantees. The policy is based on a nonlinear version of the Youla parameterization, and augments a known stabilizing linear controller with a nonlinear operator from a recently developed class of dynamic neural network models called the recurrent equilibrium network (REN). We prove that RENs are universal approximators of contracting and Lipschitz nonlinear systems, and subsequently show that the proposed Youla-REN architecture is a universal approximator of stabilizing nonlinear controllers. The REN architecture simplifies learning since unconstrained optimization can be applied, and we consider both a model-based case where exact gradients are available and reinforcement learning using random search with zeroth-order oracles. In simulation examples, our method converges faster to better controllers and is more scalable than existing methods, while guaranteeing stability during learning transients.

    The Power of Linear Controllers in LQR Control

    The Linear Quadratic Regulator (LQR) framework considers the problem of regulating a linear dynamical system perturbed by environmental noise. We compute the policy regret between three distinct control policies: i) the optimal online policy, whose linear structure is given by the Riccati equations; ii) the optimal offline linear policy, which is the best linear state feedback policy given the noise sequence; and iii) the optimal offline policy, which selects the globally optimal control actions given the noise sequence. We fully characterize the optimal offline policy and show that it has a recursive form in terms of the optimal online policy and future disturbances. We also show that the cost of the optimal offline linear policy converges to the cost of the optimal online policy as the time horizon grows large, and consequently the optimal offline linear policy incurs linear regret relative to the optimal offline policy, even in the optimistic setting where the noise is drawn i.i.d. from a known distribution. Although we focus on the setting where the noise is stochastic, our results also imply new lower bounds on the policy regret achievable when the noise is chosen by an adaptive adversary.
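
    For readers unfamiliar with how the Riccati equations give a linear policy, the sketch below computes a stationary discrete-time LQR gain by iterating the Riccati recursion to a fixed point. It assumes the infinite-horizon setting with policy u = -K x; the double-integrator system and the cost matrices are illustrative assumptions and are not taken from the abstract above.

```python
import numpy as np

# Assumed example system: a discrete-time double integrator.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)          # state cost
R = np.array([[1.0]])  # input cost

def lqr_gain(A, B, Q, R, iters=500):
    """Iterate the Riccati recursion to a fixed point P and return (K, P)."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    # Gain associated with the converged cost-to-go matrix P.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

K, P = lqr_gain(A, B, Q, R)
closed_loop = A - B @ K  # dynamics under the linear policy u = -K x
rho = max(abs(np.linalg.eigvals(closed_loop)))
print("gain:", K, "closed-loop spectral radius:", rho)
```

    The gain depends only on the system and cost matrices, not on the realized noise, which is why the online policy cannot exploit knowledge of future disturbances the way the offline policies in the abstract can.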