Convergence of the Deep BSDE Method for Coupled FBSDEs
The recently proposed numerical algorithm, the deep BSDE method, has shown
remarkable performance in solving high-dimensional forward-backward stochastic
differential equations (FBSDEs) and parabolic partial differential equations
(PDEs). This article lays a theoretical foundation for the deep BSDE method in
the general case of coupled FBSDEs. In particular, an a posteriori error
estimate of the solution is provided, and it is proved that the error
converges to zero given the universal approximation capability of neural
networks. Numerical results are presented to demonstrate the accuracy of the
analyzed algorithm in solving high-dimensional coupled FBSDEs.
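For orientation, the scheme being analyzed can be summarized as follows. The deep BSDE method discretizes the FBSDE on a grid 0 = t_0 < ... < t_N = T and treats the unknown initial value and the integrand Z as trainable objects; in standard notation (an illustrative summary, not the paper's exact formulation):

```latex
X_{t_{n+1}} \approx X_{t_n} + b(t_n, X_{t_n}, Y_{t_n})\,\Delta t_n
                            + \sigma(t_n, X_{t_n}, Y_{t_n})\,\Delta W_n , \\
Y_{t_{n+1}} \approx Y_{t_n} - f(t_n, X_{t_n}, Y_{t_n}, Z_{t_n})\,\Delta t_n
                            + Z_{t_n}^{\top}\,\Delta W_n ,
```

with $Y_{t_0}$ and each $Z_{t_n}$ parameterized by neural networks and all parameters trained jointly to minimize the terminal mismatch $\mathbb{E}\,|g(X_{t_N}) - Y_{t_N}|^2$. The a posteriori error estimate controls the distance to the true solution in terms of this training objective.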
DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics
Recent developments in many-body potential energy representation via deep
learning have brought new hope to addressing the accuracy-versus-efficiency
dilemma in molecular simulations. Here we describe DeePMD-kit, a package
written in Python/C++ that is designed to minimize the effort required to
build deep learning-based representations of potential energies and force
fields and to perform molecular dynamics. Potential applications of DeePMD-kit span
from finite molecules to extended systems and from metallic systems to
chemically bonded systems. DeePMD-kit is interfaced with TensorFlow, one of the
most popular deep learning frameworks, making the training process highly
automatic and efficient. On the other end, DeePMD-kit is interfaced with
high-performance classical molecular dynamics and quantum (path-integral)
molecular dynamics packages, i.e., LAMMPS and i-PI, respectively. Thus,
upon training, the potential energy and force field models can be used to
perform efficient molecular simulations for different purposes. As an example
of the many potential applications of the package, we use DeePMD-kit to learn
the interatomic potential energy and forces of a water model using data
obtained from density functional theory. We demonstrate that the resulting
molecular dynamics model accurately reproduces the structural information
contained in the original model.
Solving high-dimensional partial differential equations using deep learning
Developing algorithms for solving high-dimensional partial differential
equations (PDEs) has been an exceedingly difficult task for a long time, due to
the notoriously difficult problem known as the "curse of dimensionality". This
paper introduces a deep learning-based approach that can handle general
high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using
backward stochastic differential equations and the gradient of the unknown
solution is approximated by neural networks, very much in the spirit of deep
reinforcement learning with the gradient acting as the policy function.
Numerical results on examples including the nonlinear Black-Scholes equation,
the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that
the proposed algorithm is quite effective in high dimensions, in terms of both
accuracy and cost. This opens up new possibilities in economics, finance,
operational research, and physics, by considering all participating agents,
assets, resources, or particles together at the same time, instead of making ad
hoc assumptions on their inter-relationships. (13 pages, 6 figures)
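The reformulation rests on the nonlinear Feynman-Kac correspondence between semilinear parabolic PDEs and BSDEs; in standard notation (stated here for orientation, not quoted from the paper), the terminal-value problem

```latex
\frac{\partial u}{\partial t}
  + \frac{1}{2}\operatorname{Tr}\!\big(\sigma\sigma^{\top}\,\nabla_x^2 u\big)
  + \mu \cdot \nabla_x u
  + f\big(t, x, u, \sigma^{\top}\nabla_x u\big) = 0,
\qquad u(T, x) = g(x),
```

is linked, via the diffusion $dX_t = \mu\,dt + \sigma\,dW_t$, to the BSDE satisfied by $Y_t = u(t, X_t)$ and $Z_t = \sigma^{\top}\nabla_x u(t, X_t)$. It is this gradient term $\sigma^{\top}\nabla_x u$ that the neural networks approximate, playing the role of the policy function.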
Solving Many-Electron Schrödinger Equation Using Deep Neural Networks
We introduce a new family of trial wave-functions based on deep neural
networks to solve the many-electron Schrödinger equation. The Pauli exclusion
principle is dealt with explicitly to ensure that the trial wave-functions are
physical. The optimal trial wave-function is obtained through variational Monte
Carlo and the computational cost scales quadratically with the number of
electrons. The algorithm does not make use of any prior knowledge such as
atomic orbitals. Yet it is able to accurately represent the ground states of
the tested systems, including He, H2, Be, B, LiH, and a chain of 10 hydrogen
atoms. This opens up new possibilities for solving the large-scale
many-electron Schrödinger equation.
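The variational Monte Carlo loop at the heart of this approach can be illustrated on a toy problem. The sketch below is my own illustration, not the paper's network: it optimizes a one-parameter Gaussian trial wavefunction psi_alpha(x) = exp(-alpha x^2) for the 1D harmonic oscillator (hbar = m = omega = 1), whose exact ground state has alpha = 0.5 and energy 0.5.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_energy(x, alpha):
    # E_L = -(1/2) psi''/psi + (1/2) x^2 for psi = exp(-alpha x^2)
    return alpha - 2.0 * alpha**2 * x**2 + 0.5 * x**2

def sample(alpha, n=20000, step=1.0):
    # Metropolis sampling of |psi|^2 = exp(-2 alpha x^2)
    xs = np.empty(n)
    x = 0.0
    for i in range(n):
        xp = x + rng.uniform(-step, step)
        if rng.random() < np.exp(-2.0 * alpha * (xp**2 - x**2)):
            x = xp
        xs[i] = x
    return xs[n // 10:]              # discard burn-in

alpha = 1.2                          # deliberately poor starting guess
for it in range(30):
    xs = sample(alpha)
    el = local_energy(xs, alpha)
    dlnpsi = -xs**2                  # d log(psi) / d alpha
    # standard VMC gradient estimator for dE/d alpha
    grad = 2.0 * (np.mean(el * dlnpsi) - np.mean(el) * np.mean(dlnpsi))
    alpha -= 0.5 * grad

energy = np.mean(local_energy(sample(alpha), alpha))
print(alpha, energy)                 # both approach 0.5
```

The variance of the local energy vanishes at the exact ground state, which is why the gradient estimate becomes very quiet near the optimum; the same zero-variance principle underlies neural-network trial wavefunctions, only with far more parameters.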
A Mean-Field Optimal Control Formulation of Deep Learning
Recent work linking deep neural networks and dynamical systems opened up new
avenues to analyze deep learning. In particular, it is observed that new
insights can be obtained by recasting deep learning as an optimal control
problem on difference or differential equations. However, the mathematical
aspects of such a formulation have not been systematically explored. This paper
introduces the mathematical formulation of the population risk minimization
problem in deep learning as a mean-field optimal control problem. Mirroring the
development of classical optimal control, we state and prove optimality
conditions of both the Hamilton-Jacobi-Bellman type and the Pontryagin type.
These mean-field results reflect the probabilistic nature of the learning
problem. In addition, by appealing to the mean-field Pontryagin's maximum
principle, we establish some quantitative relationships between population and
empirical learning problems. This serves to establish a mathematical foundation
for investigating the algorithmic and theoretical connections between optimal
control and deep learning. (44 pages)
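In outline, the mean-field formulation treats population risk minimization as the control problem (illustrative notation, simplified from the paper's setting):

```latex
\inf_{\theta}\; J(\theta) =
\mathbb{E}_{(x_0, y_0) \sim \mu}
\left[ \Phi\big(x_T, y_0\big) + \int_0^T L\big(x_t, \theta_t\big)\,dt \right],
\qquad \dot{x}_t = f(x_t, \theta_t),
```

where the trajectory $x_t$ plays the role of the activations, the control $\theta_t$ the role of the trainable weights, and the expectation over the data distribution $\mu$ is what makes the problem mean-field: a single control is shared by the entire population of initial conditions $(x_0, y_0)$.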
Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations
We propose a new algorithm for solving parabolic partial differential
equations (PDEs) and backward stochastic differential equations (BSDEs) in high
dimension, by making an analogy between the BSDE and reinforcement learning
with the gradient of the solution playing the role of the policy function, and
the loss function given by the error between the prescribed terminal condition
and the solution of the BSDE. The policy function is then approximated by a
neural network, as is done in deep reinforcement learning. Numerical results
using TensorFlow illustrate the efficiency and accuracy of the proposed
algorithm for several 100-dimensional nonlinear PDEs from physics and finance,
such as the Allen-Cahn equation, the Hamilton-Jacobi-Bellman equation, and a
nonlinear pricing model for financial derivatives. (39 pages, 15 figures)
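The algorithm can be checked on a case with a known closed-form answer. The sketch below is my own simplification, with a linear parameterization in place of a deep network and plain NumPy in place of TensorFlow: it solves the d-dimensional heat equation u_t + (1/2) Delta u = 0 with terminal condition g(x) = |x|^2, for which u(0, 0) = d*T and the gradient is Z_t = 2 X_t, so a single scalar a multiplying X_t suffices as the "policy".

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, N, batch = 10, 1.0, 20, 512
dt = T / N
sqdt = np.sqrt(dt)

# trainable parameters: y0 ~ u(0, x0), and a with Z_n modeled as a * X_n
y0, a = 0.0, 0.0
lr_y0, lr_a = 0.1, 0.05

for it in range(2000):
    X = np.zeros((batch, d))                 # forward SDE: dX = dW
    Y = np.full(batch, y0)
    Ssum = np.zeros(batch)                   # accumulates sum_n X_n . dW_n
    for n in range(N):
        dW = sqdt * rng.standard_normal((batch, d))
        S = np.sum(X * dW, axis=1)
        Y = Y + a * S                        # driver f = 0 for the heat equation
        Ssum += S
        X = X + dW
    resid = Y - np.sum(X * X, axis=1)        # mismatch with g(X_T) = |X_T|^2
    # gradients of the mean squared terminal loss (computed by hand here,
    # since Y depends linearly on the two parameters)
    y0 -= lr_y0 * 2.0 * np.mean(resid)
    a  -= lr_a  * 2.0 * np.mean(resid * Ssum)

print(y0, a)   # y0 approaches d*T = 10, a approaches 2 (since grad u = 2x)
```

In the actual method the per-step maps x -> Z are deep networks trained by backpropagation through the whole discretized path; the toy above keeps only the structural skeleton: simulate forward, accumulate Y, penalize the terminal mismatch.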
PGDOT -- Perturbed Gradient Descent Adapted with Occupation Time
This paper develops further the idea of perturbed gradient descent (PGD), by
adapting perturbation with the history of states via the notion of occupation
time. The proposed algorithm, perturbed gradient descent adapted with
occupation time (PGDOT), is shown to converge at least as fast as the PGD
algorithm and is guaranteed to avoid getting stuck at saddle points. The
analysis is corroborated by empirical studies, in which a mini-batch version of
PGDOT is shown to outperform alternatives such as mini-batch gradient descent,
Adam, AMSGrad, and RMSProp in training multilayer perceptrons (MLPs). In
particular, the mini-batch PGDOT manages to escape saddle points whereas these
alternatives fail. (15 pages, 7 figures, 1 table)
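The paper's precise occupation-time adaptation is not reproduced here, but the baseline mechanism - perturbed gradient descent that injects noise when the gradient is small, so the iterate escapes strict saddle points - can be sketched on the toy objective f(x, y) = x^2/2 - y^2/2 + y^4/4, which has a saddle at the origin and minima at (0, +1) and (0, -1). Plain gradient descent started on the x-axis stalls at the saddle; the perturbed variant escapes. The `occupation` counter below is an illustrative stand-in for the occupation-time idea (time already spent in a low-gradient region enlarges the perturbation), not the algorithm from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def grad(p):
    # gradient of f(x, y) = x**2/2 - y**2/2 + y**4/4:
    # strict saddle at (0, 0), minima at (0, +1) and (0, -1)
    x, y = p
    return np.array([x, -y + y**3])

def pgdot_sketch(p0, lr=0.1, radius=1e-2, tol=1e-3, steps=1000):
    p = np.array(p0, dtype=float)
    occupation = 0                  # iterations spent in low-gradient regions
    for _ in range(steps):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            occupation += 1
            # illustrative occupation-time scaling of the perturbation
            p = p + radius * (1.0 + 0.1 * occupation) * rng.standard_normal(2)
        else:
            p = p - lr * g
    for _ in range(200):            # polish with plain GD once a basin is found
        p = p - lr * grad(p)
    return p

p = pgdot_sketch([1.0, 0.0])        # plain GD from this start stalls at (0, 0)
print(p)                            # near (0, +1) or (0, -1), a true minimum
```

Started at (1, 0), unperturbed gradient descent keeps y = 0 forever and converges to the saddle; the noise breaks that symmetry, after which the negative curvature in y carries the iterate into one of the two basins.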
Deep Potential: a general representation of a many-body potential energy surface
We present a simple, yet general, end-to-end deep neural network
representation of the potential energy surface for atomic and molecular
systems. This methodology, which we call Deep Potential, is "first-principle"
based, in the sense that no ad hoc approximations or empirical fitting
functions are required. The neural network structure naturally respects the
underlying symmetries of the systems. When tested on a wide variety of
examples, Deep Potential is able to reproduce the original model, whether
empirical or quantum mechanics based, within chemical accuracy. The
computational cost of this new model is not substantially larger than that of
empirical force fields. In addition, the method has promising scalability
properties. This brings us one step closer to being able to carry out molecular
simulations with accuracy comparable to that of quantum mechanics models and
computational cost comparable to that of empirical potentials.
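One of the key points - that the network structure respects the underlying symmetries - can be illustrated with a deliberately minimal example (my own toy construction, far simpler than the actual Deep Potential descriptors): map an atomic configuration to the sorted list of inverse pairwise distances, which is invariant under permuting identical atoms and under rigid motions, then feed it to a small fixed network.

```python
import numpy as np

rng = np.random.default_rng(2)

def descriptor(positions):
    """Sorted inverse pairwise distances: invariant to permuting identical
    atoms and to rigid translations/rotations of the configuration."""
    n = len(positions)
    inv = [1.0 / np.linalg.norm(positions[i] - positions[j])
           for i in range(n) for j in range(i + 1, n)]
    return np.sort(inv)

# a tiny fixed random MLP standing in for a trained energy network
n_atoms = 5
n_feat = n_atoms * (n_atoms - 1) // 2
W1 = rng.standard_normal((n_feat, 16))
W2 = rng.standard_normal(16)

def energy(positions):
    h = np.tanh(descriptor(positions) @ W1)
    return float(h @ W2)

pos = rng.standard_normal((n_atoms, 3))
perm = rng.permutation(n_atoms)
print(np.isclose(energy(pos), energy(pos[perm])))   # True: permutation invariant
```

Because the symmetry is built into the descriptor rather than learned from data, no amount of training can produce a model that assigns different energies to physically identical configurations; Deep Potential achieves the same guarantee with much richer, learnable descriptors.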
Recurrent Neural Networks for Stochastic Control Problems with Delay
Stochastic control problems with delay are challenging due to the
path-dependent nature of the system and its intrinsically high dimensionality. In
this paper, we propose and systematically study deep neural networks-based
algorithms to solve stochastic control problems with delay features.
Specifically, we employ neural networks for sequence modeling (\emph{e.g.},
recurrent neural networks such as long short-term memory) to parameterize the
policy and optimize the objective function. The proposed algorithms are tested
on three benchmark examples: a linear-quadratic problem, optimal consumption
with fixed finite delay, and portfolio optimization with complete memory.
In particular, we observe that the architecture of recurrent neural networks
naturally captures the path-dependent feature with much flexibility, and yields
better performance, with more efficient and stable training, than feedforward
networks. The superiority is even more evident in the case of portfolio
optimization with complete memory, which features infinite delay.
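A generic problem of the class treated here can be written as (illustrative notation; the benchmark examples are special cases of this shape):

```latex
\sup_{u}\;
\mathbb{E}\left[ \int_0^T f\big(t, X_t, X_{t-\delta}, u_t\big)\,dt
                 + g\big(X_T\big) \right],
\qquad
dX_t = b\big(t, X_t, X_{t-\delta}, u_t\big)\,dt
     + \sigma\big(t, X_t, X_{t-\delta}, u_t\big)\,dW_t,
```

where the delay $\delta > 0$ makes the state non-Markovian: the optimal control $u_t$ depends on the path segment $X_{[t-\delta,\,t]}$, which is precisely the kind of information an RNN's hidden state is designed to summarize.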
Deep Fictitious Play for Finding Markovian Nash Equilibrium in Multi-Agent Games
We propose a deep neural network-based algorithm to identify the Markovian
Nash equilibrium of general large N-player stochastic differential games.
Following the idea of fictitious play, we recast the N-player game into N
decoupled decision problems (one for each player) and solve them iteratively.
The individual decision problem is characterized by a semilinear
Hamilton-Jacobi-Bellman equation, which we solve using the recently
developed deep BSDE method. The resulting algorithm can solve large N-player
games for which conventional numerical methods would suffer from the curse of
dimensionality. Multiple numerical examples involving identical or
heterogeneous agents, with risk-neutral or risk-sensitive objectives, are
tested to validate the accuracy of the proposed algorithm in large group games.
Even for a fifty-player game with the presence of common noise, the proposed
algorithm still finds the approximate Nash equilibrium accurately, which, to
the best of our knowledge, is difficult to achieve by other numerical algorithms.
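The fictitious-play idea driving the algorithm - each player best-responds to the empirical average of the opponents' past play, iteratively - can be shown in its classical discrete form (a matrix game rather than a stochastic differential game; my own illustration). In matching pennies, the empirical frequencies converge to the unique mixed Nash equilibrium (1/2, 1/2).

```python
import numpy as np

# matching pennies: the row player wants to match, the column player to mismatch
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])      # payoff to the row player (zero-sum)

counts_row = np.ones(2)          # empirical counts of each pure action
counts_col = np.ones(2)

for _ in range(20000):
    # each player best-responds to the opponent's empirical mixed strategy
    col_freq = counts_col / counts_col.sum()
    row_freq = counts_row / counts_row.sum()
    best_row = int(np.argmax(A @ col_freq))     # row maximizes its payoff
    best_col = int(np.argmax(-(row_freq @ A)))  # column minimizes row's payoff
    counts_row[best_row] += 1
    counts_col[best_col] += 1

print(counts_row / counts_row.sum())   # approx [0.5, 0.5], the mixed Nash
print(counts_col / counts_col.sum())
```

In the continuous-state setting of the paper, the "best response" step is itself a stochastic control problem, solved with the deep BSDE method rather than an argmax over two actions; the outer iteration structure is the same.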