Convergence of the Deep BSDE Method for Coupled FBSDEs
The recently proposed numerical algorithm, the deep BSDE method, has shown
remarkable performance in solving high-dimensional forward-backward stochastic
differential equations (FBSDEs) and parabolic partial differential equations
(PDEs). This article lays a theoretical foundation for the deep BSDE method in
the general case of coupled FBSDEs. In particular, an a posteriori error
estimate of the solution is provided, and it is proved that the error
converges to zero given the universal approximation capability of neural
networks. Numerical results are presented to demonstrate the accuracy of the
analyzed algorithm in solving high-dimensional coupled FBSDEs.
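For orientation, the scheme being analyzed can be summarized as follows. The deep BSDE method discretizes the FBSDE on a grid 0 = t_0 < ... < t_N = T and treats the unknown initial value and the integrand Z as trainable objects; in standard notation (an illustrative summary, not the paper's exact formulation):

```latex
X_{t_{n+1}} \approx X_{t_n} + b(t_n, X_{t_n}, Y_{t_n})\,\Delta t_n
                            + \sigma(t_n, X_{t_n}, Y_{t_n})\,\Delta W_n , \\
Y_{t_{n+1}} \approx Y_{t_n} - f(t_n, X_{t_n}, Y_{t_n}, Z_{t_n})\,\Delta t_n
                            + Z_{t_n}^{\top}\,\Delta W_n ,
```

with $Y_{t_0}$ and each $Z_{t_n}$ parameterized by neural networks and all parameters trained jointly to minimize the terminal mismatch $\mathbb{E}\,|g(X_{t_N}) - Y_{t_N}|^2$. The a posteriori error estimate controls the distance to the true solution in terms of this training objective.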
DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics
Recent developments in many-body potential energy representation via deep
learning have brought new hope to addressing the accuracy-versus-efficiency
dilemma in molecular simulations. Here we describe DeePMD-kit, a package
written in Python/C++ that is designed to minimize the effort required to
build deep learning-based representations of potential energies and force
fields and to perform molecular dynamics. Potential applications of DeePMD-kit span
from finite molecules to extended systems and from metallic systems to
chemically bonded systems. DeePMD-kit is interfaced with TensorFlow, one of the
most popular deep learning frameworks, making the training process highly
automatic and efficient. On the other end, DeePMD-kit is interfaced with
high-performance classical molecular dynamics and quantum (path-integral)
molecular dynamics packages, i.e., LAMMPS and i-PI, respectively. Thus,
upon training, the potential energy and force field models can be used to
perform efficient molecular simulations for different purposes. As an example
of the many potential applications of the package, we use DeePMD-kit to learn
the interatomic potential energy and forces of a water model using data
obtained from density functional theory. We demonstrate that the resulting
molecular dynamics model accurately reproduces the structural information
contained in the original model.
Solving high-dimensional partial differential equations using deep learning
Developing algorithms for solving high-dimensional partial differential
equations (PDEs) has been an exceedingly difficult task for a long time, due to
the notoriously difficult problem known as the "curse of dimensionality". This
paper introduces a deep learning-based approach that can handle general
high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using
backward stochastic differential equations and the gradient of the unknown
solution is approximated by neural networks, very much in the spirit of deep
reinforcement learning with the gradient acting as the policy function.
Numerical results on examples including the nonlinear Black-Scholes equation,
the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that
the proposed algorithm is quite effective in high dimensions, in terms of both
accuracy and cost. This opens up new possibilities in economics, finance,
operational research, and physics, by considering all participating agents,
assets, resources, or particles together at the same time, instead of making ad
hoc assumptions on their inter-relationships. (13 pages, 6 figures)
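The reformulation rests on the nonlinear Feynman-Kac correspondence between semilinear parabolic PDEs and BSDEs; in standard notation (stated here for orientation, not quoted from the paper), the terminal-value problem

```latex
\frac{\partial u}{\partial t}
  + \frac{1}{2}\operatorname{Tr}\!\big(\sigma\sigma^{\top}\,\nabla_x^2 u\big)
  + \mu \cdot \nabla_x u
  + f\big(t, x, u, \sigma^{\top}\nabla_x u\big) = 0,
\qquad u(T, x) = g(x),
```

is linked, via the diffusion $dX_t = \mu\,dt + \sigma\,dW_t$, to the BSDE satisfied by $Y_t = u(t, X_t)$ and $Z_t = \sigma^{\top}\nabla_x u(t, X_t)$. It is this gradient term $\sigma^{\top}\nabla_x u$ that the neural networks approximate, playing the role of the policy function.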
Solving Many-Electron Schrödinger Equation Using Deep Neural Networks
We introduce a new family of trial wave-functions based on deep neural
networks to solve the many-electron Schrödinger equation. The Pauli exclusion
principle is dealt with explicitly to ensure that the trial wave-functions are
physical. The optimal trial wave-function is obtained through variational Monte
Carlo and the computational cost scales quadratically with the number of
electrons. The algorithm does not make use of any prior knowledge such as
atomic orbitals. Yet it is able to accurately represent the ground states of
the tested systems, including He, H2, Be, B, LiH, and a chain of 10 hydrogen
atoms. This opens up new possibilities for solving the large-scale
many-electron Schrödinger equation.
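The variational Monte Carlo loop at the heart of this approach can be illustrated on a toy problem. The sketch below is my own illustration, not the paper's network: it optimizes a one-parameter Gaussian trial wavefunction psi_alpha(x) = exp(-alpha x^2) for the 1D harmonic oscillator (hbar = m = omega = 1), whose exact ground state has alpha = 0.5 and energy 0.5.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_energy(x, alpha):
    # E_L = -(1/2) psi''/psi + (1/2) x^2 for psi = exp(-alpha x^2)
    return alpha - 2.0 * alpha**2 * x**2 + 0.5 * x**2

def sample(alpha, n=20000, step=1.0):
    # Metropolis sampling of |psi|^2 = exp(-2 alpha x^2)
    xs = np.empty(n)
    x = 0.0
    for i in range(n):
        xp = x + rng.uniform(-step, step)
        if rng.random() < np.exp(-2.0 * alpha * (xp**2 - x**2)):
            x = xp
        xs[i] = x
    return xs[n // 10:]              # discard burn-in

alpha = 1.2                          # deliberately poor starting guess
for it in range(30):
    xs = sample(alpha)
    el = local_energy(xs, alpha)
    dlnpsi = -xs**2                  # d log(psi) / d alpha
    # standard VMC gradient estimator for dE/d alpha
    grad = 2.0 * (np.mean(el * dlnpsi) - np.mean(el) * np.mean(dlnpsi))
    alpha -= 0.5 * grad

energy = np.mean(local_energy(sample(alpha), alpha))
print(alpha, energy)                 # both approach 0.5
```

The variance of the local energy vanishes at the exact ground state, which is why the gradient estimate becomes very quiet near the optimum; the same zero-variance principle underlies neural-network trial wavefunctions, only with far more parameters.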
A Mean-Field Optimal Control Formulation of Deep Learning
Recent work linking deep neural networks and dynamical systems opened up new
avenues to analyze deep learning. In particular, it is observed that new
insights can be obtained by recasting deep learning as an optimal control
problem on difference or differential equations. However, the mathematical
aspects of such a formulation have not been systematically explored. This paper
introduces the mathematical formulation of the population risk minimization
problem in deep learning as a mean-field optimal control problem. Mirroring the
development of classical optimal control, we state and prove optimality
conditions of both the Hamilton-Jacobi-Bellman type and the Pontryagin type.
These mean-field results reflect the probabilistic nature of the learning
problem. In addition, by appealing to the mean-field Pontryagin's maximum
principle, we establish some quantitative relationships between population and
empirical learning problems. This serves to establish a mathematical foundation
for investigating the algorithmic and theoretical connections between optimal
control and deep learning. (44 pages)
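In outline, the mean-field formulation treats population risk minimization as the control problem (illustrative notation, simplified from the paper's setting):

```latex
\inf_{\theta}\; J(\theta) =
\mathbb{E}_{(x_0, y_0) \sim \mu}
\left[ \Phi\big(x_T, y_0\big) + \int_0^T L\big(x_t, \theta_t\big)\,dt \right],
\qquad \dot{x}_t = f(x_t, \theta_t),
```

where the trajectory $x_t$ plays the role of the activations, the control $\theta_t$ the role of the trainable weights, and the expectation over the data distribution $\mu$ is what makes the problem mean-field: a single control is shared by the entire population of initial conditions $(x_0, y_0)$.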
Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations
We propose a new algorithm for solving parabolic partial differential
equations (PDEs) and backward stochastic differential equations (BSDEs) in high
dimension, by making an analogy between the BSDE and reinforcement learning
with the gradient of the solution playing the role of the policy function, and
the loss function given by the error between the prescribed terminal condition
and the solution of the BSDE. The policy function is then approximated by a
neural network, as is done in deep reinforcement learning. Numerical results
using TensorFlow illustrate the efficiency and accuracy of the proposed
algorithm for several 100-dimensional nonlinear PDEs from physics and finance,
such as the Allen-Cahn equation, the Hamilton-Jacobi-Bellman equation, and a
nonlinear pricing model for financial derivatives. (39 pages, 15 figures)
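The algorithm can be checked on a case with a known closed-form answer. The sketch below is my own simplification, with a linear parameterization in place of a deep network and plain NumPy in place of TensorFlow: it solves the d-dimensional heat equation u_t + (1/2) Delta u = 0 with terminal condition g(x) = |x|^2, for which u(0, 0) = d*T and the gradient is Z_t = 2 X_t, so a single scalar a multiplying X_t suffices as the "policy".

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, N, batch = 10, 1.0, 20, 512
dt = T / N
sqdt = np.sqrt(dt)

# trainable parameters: y0 ~ u(0, x0), and a with Z_n modeled as a * X_n
y0, a = 0.0, 0.0
lr_y0, lr_a = 0.1, 0.05

for it in range(2000):
    X = np.zeros((batch, d))                 # forward SDE: dX = dW
    Y = np.full(batch, y0)
    Ssum = np.zeros(batch)                   # accumulates sum_n X_n . dW_n
    for n in range(N):
        dW = sqdt * rng.standard_normal((batch, d))
        S = np.sum(X * dW, axis=1)
        Y = Y + a * S                        # driver f = 0 for the heat equation
        Ssum += S
        X = X + dW
    resid = Y - np.sum(X * X, axis=1)        # mismatch with g(X_T) = |X_T|^2
    # gradients of the mean squared terminal loss (computed by hand here,
    # since Y depends linearly on the two parameters)
    y0 -= lr_y0 * 2.0 * np.mean(resid)
    a  -= lr_a  * 2.0 * np.mean(resid * Ssum)

print(y0, a)   # y0 approaches d*T = 10, a approaches 2 (since grad u = 2x)
```

In the actual method the per-step maps x -> Z are deep networks trained by backpropagation through the whole discretized path; the toy above keeps only the structural skeleton: simulate forward, accumulate Y, penalize the terminal mismatch.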
PGDOT -- Perturbed Gradient Descent Adapted with Occupation Time
This paper develops further the idea of perturbed gradient descent (PGD), by
adapting perturbation with the history of states via the notion of occupation
time. The proposed algorithm, perturbed gradient descent adapted with
occupation time (PGDOT), is shown to converge at least as fast as the PGD
algorithm and is guaranteed to avoid getting stuck at saddle points. The
analysis is corroborated by empirical studies, in which a mini-batch version of
PGDOT is shown to outperform alternatives such as mini-batch gradient descent,
Adam, AMSGrad, and RMSProp in training multilayer perceptrons (MLPs). In
particular, the mini-batch PGDOT manages to escape saddle points whereas these
alternatives fail. (15 pages, 7 figures, 1 table)
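The paper's precise occupation-time adaptation is not reproduced here, but the baseline mechanism - perturbed gradient descent that injects noise when the gradient is small, so the iterate escapes strict saddle points - can be sketched on the toy objective f(x, y) = x^2/2 - y^2/2 + y^4/4, which has a saddle at the origin and minima at (0, +1) and (0, -1). Plain gradient descent started on the x-axis stalls at the saddle; the perturbed variant escapes. The `occupation` counter below is an illustrative stand-in for the occupation-time idea (time already spent in a low-gradient region enlarges the perturbation), not the algorithm from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def grad(p):
    # gradient of f(x, y) = x**2/2 - y**2/2 + y**4/4:
    # strict saddle at (0, 0), minima at (0, +1) and (0, -1)
    x, y = p
    return np.array([x, -y + y**3])

def pgdot_sketch(p0, lr=0.1, radius=1e-2, tol=1e-3, steps=1000):
    p = np.array(p0, dtype=float)
    occupation = 0                  # iterations spent in low-gradient regions
    for _ in range(steps):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            occupation += 1
            # illustrative occupation-time scaling of the perturbation
            p = p + radius * (1.0 + 0.1 * occupation) * rng.standard_normal(2)
        else:
            p = p - lr * g
    for _ in range(200):            # polish with plain GD once a basin is found
        p = p - lr * grad(p)
    return p

p = pgdot_sketch([1.0, 0.0])        # plain GD from this start stalls at (0, 0)
print(p)                            # near (0, +1) or (0, -1), a true minimum
```

Started at (1, 0), unperturbed gradient descent keeps y = 0 forever and converges to the saddle; the noise breaks that symmetry, after which the negative curvature in y carries the iterate into one of the two basins.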
Deep Potential: a general representation of a many-body potential energy surface
We present a simple, yet general, end-to-end deep neural network
representation of the potential energy surface for atomic and molecular
systems. This methodology, which we call Deep Potential, is "first-principle"
based, in the sense that no ad hoc approximations or empirical fitting
functions are required. The neural network structure naturally respects the
underlying symmetries of the systems. When tested on a wide variety of
examples, Deep Potential is able to reproduce the original model, whether
empirical or quantum mechanics based, within chemical accuracy. The
computational cost of this new model is not substantially larger than that of
empirical force fields. In addition, the method has promising scalability
properties. This brings us one step closer to being able to carry out molecular
simulations with accuracy comparable to that of quantum mechanics models and
computational cost comparable to that of empirical potentials.
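One of the key points - that the network structure respects the underlying symmetries - can be illustrated with a deliberately minimal example (my own toy construction, far simpler than the actual Deep Potential descriptors): map an atomic configuration to the sorted list of inverse pairwise distances, which is invariant under permuting identical atoms and under rigid motions, then feed it to a small fixed network.

```python
import numpy as np

rng = np.random.default_rng(2)

def descriptor(positions):
    """Sorted inverse pairwise distances: invariant to permuting identical
    atoms and to rigid translations/rotations of the configuration."""
    n = len(positions)
    inv = [1.0 / np.linalg.norm(positions[i] - positions[j])
           for i in range(n) for j in range(i + 1, n)]
    return np.sort(inv)

# a tiny fixed random MLP standing in for a trained energy network
n_atoms = 5
n_feat = n_atoms * (n_atoms - 1) // 2
W1 = rng.standard_normal((n_feat, 16))
W2 = rng.standard_normal(16)

def energy(positions):
    h = np.tanh(descriptor(positions) @ W1)
    return float(h @ W2)

pos = rng.standard_normal((n_atoms, 3))
perm = rng.permutation(n_atoms)
print(np.isclose(energy(pos), energy(pos[perm])))   # True: permutation invariant
```

Because the symmetry is built into the descriptor rather than learned from data, no amount of training can produce a model that assigns different energies to physically identical configurations; Deep Potential achieves the same guarantee with much richer, learnable descriptors.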
Recurrent Neural Networks for Stochastic Control Problems with Delay
Stochastic control problems with delay are challenging due to the
path-dependent nature of the system and its intrinsically high dimensionality. In
this paper, we propose and systematically study deep neural networks-based
algorithms to solve stochastic control problems with delay features.
Specifically, we employ neural networks for sequence modeling (\emph{e.g.},
recurrent neural networks such as long short-term memory) to parameterize the
policy and optimize the objective function. The proposed algorithms are tested
on three benchmark examples: a linear-quadratic problem, optimal consumption
with fixed finite delay, and portfolio optimization with complete memory.
In particular, we observe that the architecture of recurrent neural networks
naturally captures the path-dependent feature with much flexibility, and yields
better performance, with more efficient and stable training, than feedforward
networks. The superiority is even more evident in the case of portfolio
optimization with complete memory, which features infinite delay.
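A generic problem of the class treated here can be written as (illustrative notation; the benchmark examples are special cases of this shape):

```latex
\sup_{u}\;
\mathbb{E}\left[ \int_0^T f\big(t, X_t, X_{t-\delta}, u_t\big)\,dt
                 + g\big(X_T\big) \right],
\qquad
dX_t = b\big(t, X_t, X_{t-\delta}, u_t\big)\,dt
     + \sigma\big(t, X_t, X_{t-\delta}, u_t\big)\,dW_t,
```

where the delay $\delta > 0$ makes the state non-Markovian: the optimal control $u_t$ depends on the path segment $X_{[t-\delta,\,t]}$, which is precisely the kind of information an RNN's hidden state is designed to summarize.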
Deep Fictitious Play for Finding Markovian Nash Equilibrium in Multi-Agent Games
We propose a deep neural network-based algorithm to identify the Markovian
Nash equilibrium of general large N-player stochastic differential games.
Following the idea of fictitious play, we recast the N-player game into N
decoupled decision problems (one for each player) and solve them iteratively.
The individual decision problem is characterized by a semilinear
Hamilton-Jacobi-Bellman equation, which we solve using the recently
developed deep BSDE method. The resulting algorithm can solve large N-player
games for which conventional numerical methods would suffer from the curse of
dimensionality. Multiple numerical examples involving identical or
heterogeneous agents, with risk-neutral or risk-sensitive objectives, are
tested to validate the accuracy of the proposed algorithm in large group games.
Even for a fifty-player game with the presence of common noise, the proposed
algorithm still finds the approximate Nash equilibrium accurately, which, to
the best of our knowledge, is difficult to achieve by other numerical algorithms.
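The fictitious-play idea driving the algorithm - each player best-responds to the empirical average of the opponents' past play, iteratively - can be shown in its classical discrete form (a matrix game rather than a stochastic differential game; my own illustration). In matching pennies, the empirical frequencies converge to the unique mixed Nash equilibrium (1/2, 1/2).

```python
import numpy as np

# matching pennies: the row player wants to match, the column player to mismatch
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])      # payoff to the row player (zero-sum)

counts_row = np.ones(2)          # empirical counts of each pure action
counts_col = np.ones(2)

for _ in range(20000):
    # each player best-responds to the opponent's empirical mixed strategy
    col_freq = counts_col / counts_col.sum()
    row_freq = counts_row / counts_row.sum()
    best_row = int(np.argmax(A @ col_freq))     # row maximizes its payoff
    best_col = int(np.argmax(-(row_freq @ A)))  # column minimizes row's payoff
    counts_row[best_row] += 1
    counts_col[best_col] += 1

print(counts_row / counts_row.sum())   # approx [0.5, 0.5], the mixed Nash
print(counts_col / counts_col.sum())
```

In the continuous-state setting of the paper, the "best response" step is itself a stochastic control problem, solved with the deep BSDE method rather than an argmax over two actions; the outer iteration structure is the same.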