
310 research outputs found

Adaptive deep learning for high-dimensional Hamilton-Jacobi-Bellman equations

Author: Gong Qi
Kang Wei
Nakamura-Zimmerer Tenavi
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2021

The article of record as published may be found at http://dx.doi.org/10.1137/19M1288802

Computing optimal feedback controls for nonlinear systems generally requires solving Hamilton-Jacobi-Bellman (HJB) equations, which are notoriously difficult when the state dimension is large. Existing strategies for high-dimensional problems often rely on specific, restrictive problem structures or are valid only locally around some nominal trajectory. In this paper, we propose a data-driven method to approximate semiglobal solutions to HJB equations for general high-dimensional nonlinear systems and compute candidate optimal feedback controls in real time. To accomplish this, we model solutions to HJB equations with neural networks (NNs) trained on data generated without discretizing the state space. Training is made more effective and data-efficient by leveraging the known physics of the problem and using the partially trained NN to aid in adaptive data generation. We demonstrate the effectiveness of our method by learning solutions to HJB equations corresponding to the attitude control of a six-dimensional nonlinear rigid body and nonlinear systems of dimension up to 30 arising from the stabilization of a Burgers'-type partial differential equation. The trained NNs are then used for real-time feedback control of these systems.

Defense Advanced Research Projects Agency (DARPA)

The work of the first and second authors was partially supported with funding from the Defense Advanced Research Projects Agency (DARPA) grant FA8650-18-1-7842

Calhoun, Institutional Archive of the Naval Postgraduate School

QRnet: optimal regulator design with LQR-augmented neural networks

Author: Gong Qi
Kang Wei
Nakamura-Zimmerer Tenavi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/11/2020

In this paper we propose a new computational method for designing optimal regulators for high-dimensional nonlinear systems. The proposed approach leverages physics-informed machine learning to solve high-dimensional Hamilton-Jacobi-Bellman equations arising in optimal feedback control. Concretely, we augment linear quadratic regulators with neural networks to handle nonlinearities. We train the augmented models on data generated without discretizing the state space, enabling application to high-dimensional problems. We use the proposed method to design a candidate optimal regulator for an unstable Burgers' equation, and through this example, demonstrate improved robustness and accuracy compared to existing neural network formulations.

Comment: Added IEEE accepted manuscript with copyright notice

arXiv.org e-Print Archive

eScholarship - University of California

Calhoun, Institutional Archive of the Naval Postgraduate School

Adaptive Deep Learning for High-Dimensional Hamilton-Jacobi-Bellman Equations

Author: Gong Qi
Kang Wei
Nakamura-Zimmerer Tenavi
Publication date: 01/01/2021

Computing optimal feedback controls for nonlinear systems generally requires solving Hamilton-Jacobi-Bellman (HJB) equations, which are notoriously difficult when the state dimension is large. Existing strategies for high-dimensional problems often rely on specific, restrictive problem structures, or are valid only locally around some nominal trajectory. In this paper, we propose a data-driven method to approximate semi-global solutions to HJB equations for general high-dimensional nonlinear systems and compute candidate optimal feedback controls in real time. To accomplish this, we model solutions to HJB equations with neural networks (NNs) trained on data generated without discretizing the state space. Training is made more effective and data-efficient by leveraging the known physics of the problem and using the partially trained NN to aid in adaptive data generation. We demonstrate the effectiveness of our method by learning solutions to HJB equations corresponding to the attitude control of a six-dimensional nonlinear rigid body, and nonlinear systems of dimension up to 30 arising from the stabilization of a Burgers'-type partial differential equation. The trained NNs are then used for real-time feedback control of these systems.

Comment: Added section on validation error computation. Updated convergence test formula and associated results

arXiv.org e-Print Archive

eScholarship - University of California

Calhoun, Institutional Archive of the Naval Postgraduate School


Algorithms of data generation for deep learning and feedback design: A survey

Author: Fahroo Fariba
Gong Qi
Kang Wei
Nakamura-Zimmerer Tenavi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

17 USC 105 interim-entered record; under review.

The article of record as published may be found at https://doi.org/10.1016/j.physd.2021.132955

Recent research reveals that deep learning is an effective way of solving high-dimensional Hamilton-Jacobi-Bellman equations. The resulting feedback control law in the form of a neural network is computationally efficient for real-time applications of optimal control. A critical part of this design method is to generate data for training the neural network and validating its accuracy. In this paper, we provide a survey of existing algorithms that can be used to generate data. All the algorithms surveyed in this paper are causality-free, i.e., the solution at a point is computed without using the value of the function at any other points. An illustrative example is given for the optimal feedback design using supervised learning in which the data is generated using causality-free algorithms.

U.S. Naval Research Laboratory

eScholarship - University of California

Calhoun, Institutional Archive of the Naval Postgraduate School

Learning Control Policies of Hodgkin-Huxley Neuronal Dynamics

Author: Madondo Malvern
Ruthotto Lars
Verma Deepanshu
Yong Nicholas Au
Publication date: 13/11/2023

We present a neural network approach for closed-loop deep brain stimulation (DBS). We cast the problem of finding an optimal neurostimulation strategy as a control problem. In this setting, control policies aim to optimize therapeutic outcomes by tailoring the parameters of a DBS system, typically via electrical stimulation, in real time based on the patient's ongoing neuronal activity. We approximate the value function offline using a neural network to enable generating controls (stimuli) in real time via the feedback form. The neuronal activity is characterized by a nonlinear, stiff system of differential equations as dictated by the Hodgkin-Huxley model. Our training process leverages the relationship between Pontryagin's maximum principle and Hamilton-Jacobi-Bellman equations to update the value function estimates simultaneously. Our numerical experiments illustrate the accuracy of our approach for out-of-distribution samples and the robustness to moderate shocks and disturbances in the system.

Comment: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 12 pages

arXiv.org e-Print Archive

Finite-horizon optimal control of linear and a class of nonlinear systems

Author: Zhao Qiming
Publication venue: Scholars' Mine
Publication date: 01/01/2013

Traditionally, optimal control of dynamical systems with known system dynamics is obtained in a backward-in-time and offline manner by using either the Riccati or the Hamilton-Jacobi-Bellman (HJB) equation. In contrast, in this dissertation, finite-horizon optimal regulation has been investigated for both linear and nonlinear systems in a forward-in-time manner when system dynamics are uncertain. Value and policy iterations are not used, while the value function (or Q-function for linear systems) and control input are updated once per sampling interval, consistent with standard adaptive control. First, the optimal adaptive control of linear discrete-time systems with unknown system dynamics is presented in Paper I by using Q-learning and the Bellman equation while satisfying the terminal constraint. A novel update law that uses history information of the cost-to-go is derived. Paper II considers the design of the linear quadratic regulator in the presence of state and input quantization. Quantization errors are eliminated via a dynamic quantizer design, and the parameter update law is redesigned from Paper I. Furthermore, an optimal adaptive state feedback controller is developed in Paper III for general nonlinear discrete-time systems in affine form without knowledge of the system dynamics. In Paper IV, an NN-based observer is proposed to reconstruct the state vector and identify the dynamics so that the control scheme from Paper III is extended to output feedback. Finally, the optimal regulation of quantized nonlinear systems with input constraint is considered in Paper V by introducing a non-quadratic cost functional. Closed-loop stability is demonstrated for all the controller designs developed in this dissertation by using Lyapunov analysis, while all the proposed schemes function in an online and forward-in-time manner so that they are practically viable. --Abstract, page iv

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Approximate dynamic programming based solutions for fixed-final-time optimal control and optimal switching

Author: Heydari Ali
Publication venue: Scholars' Mine
Publication date: 01/01/2013

Optimal solutions with neural networks (NN) based on an approximate dynamic programming (ADP) framework for new classes of engineering and non-engineering problems, and the associated difficulties and challenges, are investigated in this dissertation. In the enclosed eight papers, the ADP framework is utilized for solving fixed-final-time problems (also called terminal control problems) and problems of a switching nature. An ADP-based algorithm is proposed in Paper 1 for solving fixed-final-time problems with a soft terminal constraint, in which a single neural network with a single set of weights is utilized. Paper 2 investigates fixed-final-time problems with hard terminal constraints. The optimality analysis of the ADP-based algorithm for fixed-final-time problems is the subject of Paper 3, in which it is shown that the proposed algorithm leads to the global optimal solution provided certain conditions hold. Afterwards, the developments in Papers 1 to 3 are used to tackle a more challenging class of problems, namely, optimal control of switching systems. This class of problems is divided into problems with fixed mode sequence (Papers 4 and 5) and problems with free mode sequence (Papers 6 and 7). Each of these two classes is further divided into problems with autonomous subsystems (Papers 4 and 6) and problems with controlled subsystems (Papers 5 and 7). Different ADP-based algorithms are developed and proofs of convergence of the proposed iterative algorithms are presented. Moreover, an extension to the developments is provided for online learning of the optimal switching solution for problems with modeling uncertainty in Paper 8. Each of the theoretical developments is numerically analyzed using different real-world or benchmark problems. --Abstract, page v

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine