    Adaptive deep learning for high-dimensional Hamilton-Jacobi-Bellman equations

    The article of record as published may be found at http://dx.doi.org/10.1137/19M1288802. Computing optimal feedback controls for nonlinear systems generally requires solving Hamilton-Jacobi-Bellman (HJB) equations, which are notoriously difficult when the state dimension is large. Existing strategies for high-dimensional problems often rely on specific, restrictive problem structures or are valid only locally around some nominal trajectory. In this paper, we propose a data-driven method to approximate semiglobal solutions to HJB equations for general high-dimensional nonlinear systems and compute candidate optimal feedback controls in real time. To accomplish this, we model solutions to HJB equations with neural networks (NNs) trained on data generated without discretizing the state space. Training is made more effective and data-efficient by leveraging the known physics of the problem and using the partially trained NN to aid in adaptive data generation. We demonstrate the effectiveness of our method by learning solutions to HJB equations corresponding to the attitude control of a six-dimensional nonlinear rigid body and nonlinear systems of dimension up to 30 arising from the stabilization of a Burgers'-type partial differential equation. The trained NNs are then used for real-time feedback control of these systems. The work of the first and second authors was partially supported with funding from the Defense Advanced Research Projects Agency (DARPA) grant FA8650-18-1-7842.
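For orientation, the HJB equation that the abstract above refers to can be written for a generic finite-horizon problem as sketched below. The symbols (dynamics f, running cost L, terminal cost F, value function V, horizon T) are assumptions chosen for illustration; the paper's own formulation and notation may differ.

```latex
% Generic optimal control problem (illustrative notation, not the paper's):
%   minimize  J(u) = F(x(T)) + \int_0^T L(x(t), u(t)) \, dt
%   subject to  \dot{x}(t) = f(x(t), u(t)), \quad x(0) = x_0 .
%
% The value function V(t, x) satisfies the HJB equation
\begin{aligned}
  -\frac{\partial V}{\partial t}(t, x)
    &= \min_{u} \Big\{ L(x, u)
       + \nabla_x V(t, x)^{\top} f(x, u) \Big\}, \\
  V(T, x) &= F(x),
\end{aligned}
% and the optimal feedback control is recovered pointwise as
u^{*}(t, x) = \arg\min_{u} \Big\{ L(x, u)
       + \nabla_x V(t, x)^{\top} f(x, u) \Big\}.
```

Because V is a function of the full state x, grid-based solvers scale exponentially with the state dimension, which is the difficulty the abstract describes for large state dimensions.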

    QRnet: optimal regulator design with LQR-augmented neural networks

    In this paper we propose a new computational method for designing optimal regulators for high-dimensional nonlinear systems. The proposed approach leverages physics-informed machine learning to solve high-dimensional Hamilton-Jacobi-Bellman equations arising in optimal feedback control. Concretely, we augment linear quadratic regulators with neural networks to handle nonlinearities. We train the augmented models on data generated without discretizing the state space, enabling application to high-dimensional problems. We use the proposed method to design a candidate optimal regulator for an unstable Burgers' equation, and through this example, demonstrate improved robustness and accuracy compared to existing neural network formulations. Comment: Added IEEE accepted manuscript with copyright notic

    Adaptive Deep Learning for High-Dimensional Hamilton-Jacobi-Bellman Equations

    Computing optimal feedback controls for nonlinear systems generally requires solving Hamilton-Jacobi-Bellman (HJB) equations, which are notoriously difficult when the state dimension is large. Existing strategies for high-dimensional problems often rely on specific, restrictive problem structures, or are valid only locally around some nominal trajectory. In this paper, we propose a data-driven method to approximate semi-global solutions to HJB equations for general high-dimensional nonlinear systems and compute candidate optimal feedback controls in real time. To accomplish this, we model solutions to HJB equations with neural networks (NNs) trained on data generated without discretizing the state space. Training is made more effective and data-efficient by leveraging the known physics of the problem and using the partially trained NN to aid in adaptive data generation. We demonstrate the effectiveness of our method by learning solutions to HJB equations corresponding to the attitude control of a six-dimensional nonlinear rigid body, and nonlinear systems of dimension up to 30 arising from the stabilization of a Burgers'-type partial differential equation. The trained NNs are then used for real-time feedback control of these systems. Comment: Added section on validation error computation. Updated convergence test formula and associated result

    Learning Control Policies of Hodgkin-Huxley Neuronal Dynamics

    We present a neural network approach for closed-loop deep brain stimulation (DBS). We cast the problem of finding an optimal neurostimulation strategy as a control problem. In this setting, control policies aim to optimize therapeutic outcomes by tailoring the parameters of a DBS system, typically via electrical stimulation, in real time based on the patient's ongoing neuronal activity. We approximate the value function offline using a neural network to enable generating controls (stimuli) in real time via the feedback form. The neuronal activity is characterized by a nonlinear, stiff system of differential equations as dictated by the Hodgkin-Huxley model. Our training process leverages the relationship between Pontryagin's maximum principle and Hamilton-Jacobi-Bellman equations to update the value function estimates simultaneously. Our numerical experiments illustrate the accuracy of our approach for out-of-distribution samples and the robustness to moderate shocks and disturbances in the system. Comment: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 12 page

    Finite-horizon optimal control of linear and a class of nonlinear systems

    Traditionally, optimal control of dynamical systems with known system dynamics is obtained in a backward-in-time and offline manner by using either the Riccati or the Hamilton-Jacobi-Bellman (HJB) equation. In contrast, in this dissertation, finite-horizon optimal regulation has been investigated for both linear and nonlinear systems in a forward-in-time manner when system dynamics are uncertain. Value and policy iterations are not used; instead, the value function (or Q-function for linear systems) and control input are updated once per sampling interval, consistent with standard adaptive control. First, the optimal adaptive control of linear discrete-time systems with unknown system dynamics is presented in Paper I by using Q-learning and the Bellman equation while satisfying the terminal constraint. A novel update law that uses history information of the cost-to-go is derived. Paper II considers the design of the linear quadratic regulator in the presence of state and input quantization. Quantization errors are eliminated via a dynamic quantizer design and the parameter update law is redesigned from Paper I. Furthermore, an optimal adaptive state feedback controller is developed in Paper III for general nonlinear discrete-time systems in affine form without knowledge of the system dynamics. In Paper IV, an NN-based observer is proposed to reconstruct the state vector and identify the dynamics so that the control scheme from Paper III is extended to output feedback. Finally, the optimal regulation of quantized nonlinear systems with input constraints is considered in Paper V by introducing a non-quadratic cost functional. Closed-loop stability is demonstrated for all the controller designs developed in this dissertation by using Lyapunov analysis, while all the proposed schemes function in an online and forward-in-time manner so that they are practically viable. --Abstract, page iv
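The "backward-in-time and offline" Riccati approach that the abstract above contrasts itself with can be illustrated by a minimal sketch for a scalar discrete-time system with known dynamics. The function names and the numerical values (a, b, q, r, horizon) are hypothetical and chosen only for illustration; this is the textbook finite-horizon LQR recursion, not the dissertation's forward-in-time adaptive scheme.

```python
def riccati_backward(a, b, q, r, q_terminal, horizon):
    """Finite-horizon LQR for the scalar system x[k+1] = a*x[k] + b*u[k].

    Cost: sum_k (q*x[k]**2 + r*u[k]**2) + q_terminal*x[N]**2.
    Returns the feedback gains K[0..N-1] and cost-to-go weights P[0..N].
    """
    p = [0.0] * (horizon + 1)
    gains = [0.0] * horizon
    p[horizon] = q_terminal  # terminal condition, imposed at the final time
    # Sweep backward in time: P[k] depends on P[k+1].
    for k in range(horizon - 1, -1, -1):
        gains[k] = (a * b * p[k + 1]) / (r + b * b * p[k + 1])
        p[k] = q + a * a * p[k + 1] - a * b * p[k + 1] * gains[k]
    return gains, p


def simulate(a, b, gains, x0):
    """Apply u[k] = -K[k]*x[k] forward in time; return states and controls."""
    xs, us = [x0], []
    for gain in gains:
        u = -gain * xs[-1]
        us.append(u)
        xs.append(a * xs[-1] + b * u)
    return xs, us


if __name__ == "__main__":
    # Hypothetical open-loop unstable plant (|a| > 1).
    gains, p = riccati_backward(a=1.2, b=1.0, q=1.0, r=1.0,
                                q_terminal=1.0, horizon=20)
    xs, us = simulate(1.2, 1.0, gains, x0=1.0)
    print("first gain:", gains[0])
    print("final state:", xs[-1])
```

The gains must be computed for the whole horizon before the first control is applied, and the recursion needs a and b explicitly. Both requirements are what a forward-in-time scheme for uncertain dynamics aims to avoid.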

    Approximate dynamic programming based solutions for fixed-final-time optimal control and optimal switching

    Optimal solutions with neural networks (NN) based on an approximate dynamic programming (ADP) framework for new classes of engineering and non-engineering problems and associated difficulties and challenges are investigated in this dissertation. In the enclosed eight papers, the ADP framework is utilized for solving fixed-final-time problems (also called terminal control problems) and problems of a switching nature. An ADP-based algorithm is proposed in Paper 1 for solving fixed-final-time problems with a soft terminal constraint, in which a single neural network with a single set of weights is utilized. Paper 2 investigates fixed-final-time problems with hard terminal constraints. The optimality analysis of the ADP-based algorithm for fixed-final-time problems is the subject of Paper 3, in which it is shown that the proposed algorithm leads to the global optimal solution provided certain conditions hold. Afterwards, the developments in Papers 1 to 3 are used to tackle a more challenging class of problems, namely, optimal control of switching systems. This class of problems is divided into problems with fixed mode sequence (Papers 4 and 5) and problems with free mode sequence (Papers 6 and 7). Each of these two classes is further divided into problems with autonomous subsystems (Papers 4 and 6) and problems with controlled subsystems (Papers 5 and 7). Different ADP-based algorithms are developed and proofs of convergence of the proposed iterative algorithms are presented. Moreover, an extension to the developments is provided for online learning of the optimal switching solution for problems with modeling uncertainty in Paper 8. Each of the theoretical developments is numerically analyzed using different real-world or benchmark problems. --Abstract, page v