Parareal with a physics-informed neural network as coarse propagator
Parallel-in-time algorithms provide an additional layer of concurrency for
the numerical integration of models based on time-dependent differential
equations. Methods like Parareal, which parallelize across multiple time steps,
rely on a computationally cheap and coarse integrator to propagate information
forward in time, while a parallelizable expensive fine propagator provides
accuracy. Typically, the coarse method is a numerical integrator using lower
resolution, reduced order or a simplified model. Our paper proposes to use a
physics-informed neural network (PINN) instead. We demonstrate for the
Black-Scholes equation, a partial differential equation from computational
finance, that Parareal with a PINN coarse propagator provides better speedup
than a numerical coarse propagator. Training and evaluating a neural network
are both tasks whose computing patterns are well suited for GPUs. By contrast,
mesh-based algorithms with their low computational intensity struggle to
perform well. We show that moving the coarse propagator PINN to a GPU while
running the numerical fine propagator on the CPU further improves Parareal's
single-node performance. This suggests that integrating machine learning
techniques into parallel-in-time integration methods and exploiting their
differences in computing patterns might offer a way to better utilize
heterogeneous architectures.
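As a concrete reference point for the setup this abstract describes, here is a minimal Parareal sketch for a scalar test ODE. The coarse propagator G is a single Euler step standing in for the paper's PINN; the test problem, fine propagator, and step counts are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def f(u):
    return -u  # illustrative test problem du/dt = -u

def G(u, dt):
    # coarse propagator: one explicit Euler step
    # (the paper replaces this with a PINN evaluation)
    return u + dt * f(u)

def F(u, dt, m=100):
    # fine propagator: m smaller Euler steps
    h = dt / m
    for _ in range(m):
        u = u + h * f(u)
    return u

def parareal(u0, T, n_slices, n_iters):
    dt = T / n_slices
    U = np.zeros(n_slices + 1)
    U[0] = u0
    for n in range(n_slices):          # initial guess from coarse sweep
        U[n + 1] = G(U[n], dt)
    for _ in range(n_iters):
        Fk = [F(U[n], dt) for n in range(n_slices)]  # parallelizable
        Gk = [G(U[n], dt) for n in range(n_slices)]
        for n in range(n_slices):                    # serial correction
            U[n + 1] = G(U[n], dt) + Fk[n] - Gk[n]
    return U

print(parareal(1.0, 1.0, n_slices=10, n_iters=3)[-1], np.exp(-1.0))
```

The serial correction line is the standard Parareal update U_{k+1}^{n+1} = G(U_{k+1}^n) + F(U_k^n) - G(U_k^n); only G changes when a PINN is swapped in.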
Learning data-driven discretizations for partial differential equations
The numerical solution of partial differential equations (PDEs) is
challenging because of the need to resolve spatiotemporal features over wide
length and timescales. Often, it is computationally intractable to resolve the
finest features in the solution. The only recourse is to use approximate
coarse-grained representations, which aim to accurately represent
long-wavelength dynamics while properly accounting for unresolved small scale
physics. Deriving such coarse-grained equations is notoriously difficult, and
often ad hoc. Here we introduce data-driven discretization, a
method for learning optimized approximations to PDEs based on actual solutions
to the known underlying equations. Our approach uses neural networks to
estimate spatial derivatives, which are optimized end-to-end to best satisfy
the equations on a low resolution grid. The resulting numerical methods are
remarkably accurate, allowing us to integrate in time a collection of nonlinear
equations in one spatial dimension at resolutions 4-8x coarser than is possible
with standard finite difference methods.
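A hedged sketch of the central idea, learned stencil coefficients for spatial derivatives, written in PyTorch. The stencil width, layer sizes, and the training target mentioned in the comments are illustrative assumptions rather than the authors' exact architecture.

```python
import torch
import torch.nn as nn

class LearnedDerivative(nn.Module):
    """Maps a local window of coarse-grid solution values to stencil
    coefficients, then combines them linearly to estimate a derivative."""
    def __init__(self, stencil=5, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(stencil, hidden), nn.ReLU(),
            nn.Linear(hidden, stencil),  # one coefficient per stencil point
        )

    def forward(self, patches, dx):
        # patches: (batch, stencil) windows of the coarse solution
        coeffs = self.net(patches)
        return (coeffs * patches).sum(dim=-1) / dx

model = LearnedDerivative()
patches = torch.randn(8, 5)
print(model(patches, dx=0.1).shape)  # torch.Size([8])

# Training would minimize the residual of the known PDE on the coarse
# grid (end-to-end through the time integrator), e.g. for Burgers'
# equation u_t + u u_x = nu * u_xx.
```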
Stable Architectures for Deep Neural Networks
Deep neural networks have become invaluable tools for supervised machine
learning, e.g., classification of text or images. While often offering superior
results over traditional techniques and successfully expressing complicated
patterns in data, deep architectures are known to be challenging to design and
train such that they generalize well to new data. Important issues with deep
architectures are numerical instabilities in derivative-based learning
algorithms commonly called exploding or vanishing gradients. In this paper we
propose new forward propagation techniques inspired by systems of Ordinary
Differential Equations (ODE) that overcome this challenge and lead to
well-posed learning problems for arbitrarily deep networks.
The backbone of our approach is our interpretation of deep learning as a
parameter estimation problem of nonlinear dynamical systems. Given this
formulation, we analyze stability and well-posedness of deep learning and use
this new understanding to develop new network architectures. We relate the
exploding and vanishing gradient phenomenon to the stability of the discrete
ODE and present several strategies for stabilizing deep learning for very deep
networks. While our new architectures restrict the solution space, several
numerical experiments show their competitiveness with state-of-the-art
networks.
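One stabilization strategy from this line of work can be sketched as a residual layer whose effective weight matrix is forced to be antisymmetric, so the underlying ODE has (nearly) purely imaginary eigenvalues and states neither explode nor decay. The step size, damping term, and dimensions below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AntisymmetricLayer(nn.Module):
    def __init__(self, dim, h=0.1, gamma=1e-2):
        super().__init__()
        self.K = nn.Parameter(torch.randn(dim, dim) * 0.1)
        self.b = nn.Parameter(torch.zeros(dim))
        self.h, self.gamma = h, gamma

    def forward(self, y):
        # K - K^T is antisymmetric; -gamma*I adds slight damping
        W = self.K - self.K.T - self.gamma * torch.eye(self.K.shape[0])
        return y + self.h * torch.tanh(y @ W.T + self.b)  # forward Euler step

layer = AntisymmetricLayer(dim=4)
print(layer(torch.randn(2, 4)).shape)  # torch.Size([2, 4])
```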
Learning in Modal Space: Solving Time-Dependent Stochastic PDEs Using Physics-Informed Neural Networks
One of the open problems in scientific computing is the long-time integration
of nonlinear stochastic partial differential equations (SPDEs). We address this
problem by taking advantage of recent advances in scientific machine learning
and the dynamically orthogonal (DO) and bi-orthogonal (BO) methods for
representing stochastic processes. Specifically, we propose two new
Physics-Informed Neural Networks (PINNs) for solving time-dependent SPDEs,
namely the NN-DO/BO methods, which incorporate the DO/BO constraints into the
loss function with an implicit form instead of generating explicit expressions
for the temporal derivatives of the DO/BO modes. Hence, the proposed methods
overcome some of the drawbacks of the original DO/BO methods: we do not need
the assumption that the covariance matrix of the random coefficients is
invertible as in the original DO method, and we can remove the assumption of no
eigenvalue crossing as in the original BO method. Moreover, the NN-DO/BO
methods can be used to solve time-dependent stochastic inverse problems with
the same formulation and computational complexity as for forward problems. We
demonstrate the capability of the proposed methods via several numerical
examples: (1) a linear stochastic advection equation with a deterministic initial
condition, where the original DO/BO methods would fail; (2) long-time integration
of the stochastic Burgers' equation with many eigenvalue crossings during the
whole time evolution, where the original BO method fails; and (3) a nonlinear
reaction-diffusion equation, for which we consider both the forward and the inverse problem,
including noisy initial data, to investigate the flexibility of the NN-DO/BO
methods in handling inverse and mixed type problems. Taken together, these
simulation results demonstrate that the NN-DO/BO methods can be employed to
effectively quantify uncertainty propagation in a wide range of physical
problems.
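To make the "constraints in the loss, in implicit form" idea concrete, here is a hedged toy in PyTorch: an orthonormality penalty on network-predicted modes that could be added to a PINN loss instead of deriving explicit evolution equations for the modes. The tensor shapes and weighting are illustrative and are not the NN-DO/BO formulation itself.

```python
import torch

def orthonormality_penalty(modes, dx):
    # modes: (n_modes, n_grid) spatial basis predicted by the network
    gram = modes @ modes.T * dx       # discrete inner products <u_i, u_j>
    eye = torch.eye(modes.shape[0])
    return ((gram - eye) ** 2).sum()  # implicit orthonormality constraint

modes = torch.randn(3, 64, requires_grad=True)
print(orthonormality_penalty(modes, dx=1.0 / 64))  # added to the PINN loss
```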
Lie Transform-based Neural Networks for Dynamics Simulation and Learning
In the article, we discuss the architecture of the polynomial neural network
that corresponds to the matrix representation of Lie transform. The matrix form
of Lie transform is an approximation of the general solution of the nonlinear
system of ordinary differential equations. The proposed architecture can be
trained with small data sets, extrapolate predictions outside the training
data, and provide a possibility for interpretation. We provide a theoretical
explanation of the proposed architecture, as well as demonstrate it in several
applications. We present the results of modeling and identification for both
simple and well-known dynamical systems, and more complicated examples from
price dynamics, chemistry, and accelerator physics. From a practical point of
view, we describe the training of a Lie transform-based neural network with a
small data set containing only 10 data points. We also demonstrate an
interpretation of the fitted neural network by converting it to a system of
differential equations.
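A minimal sketch of a second-order polynomial (matrix Lie map) layer of the kind the abstract describes: the next state is a learned linear combination of the state and its quadratic monomials, x_{k+1} = W1 x_k + W2 (x_k ⊗ x_k). The truncation order, initialization, and dimensions are illustrative assumptions.

```python
import numpy as np

def monomials2(x):
    # [x, kron(x, x)]: the state and its second-order monomials
    return np.concatenate([x, np.kron(x, x)])

class LieMapLayer:
    def __init__(self, dim, rng=np.random.default_rng(0)):
        n_feat = dim + dim * dim
        self.W = rng.normal(scale=0.01, size=(dim, n_feat))
        self.W[:, :dim] += np.eye(dim)  # start near the identity map

    def __call__(self, x):
        return self.W @ monomials2(x)

layer = LieMapLayer(dim=2)
print(layer(np.array([0.1, -0.2])))  # one step; training fits W to trajectories
```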
Solution of Definite Integrals using Functional Link Artificial Neural Networks
This paper discusses a new method to solve definite integrals using
artificial neural networks. The objective is to build a neural network that
serves as a novel alternative to established numerical methods and, with the
help of a learning algorithm, solves definite integrals by
minimising a well-constructed error function. Compared with existing numerical
methods, the proposed algorithm is effective and precise, and it is well suited
for applications that require the integration of higher-order polynomials. The
observations are recorded and illustrated in tabular and graphical form.
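One common way to realize "definite integrals by minimising an error function" is to fit a network F so that its derivative matches the integrand, then read off F(b) - F(a). The sketch below takes that route in PyTorch as a hedged stand-in; it is not the paper's functional-link architecture.

```python
import torch

f = lambda x: 3 * x ** 2          # integrand; exact antiderivative is x^3
F = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                        torch.nn.Linear(32, 1))
opt = torch.optim.Adam(F.parameters(), lr=1e-2)

for _ in range(2000):
    x = torch.rand(128, 1, requires_grad=True)           # sample [0, 1]
    dF = torch.autograd.grad(F(x).sum(), x, create_graph=True)[0]
    loss = ((dF - f(x)) ** 2).mean()                     # match F' to f
    opt.zero_grad(); loss.backward(); opt.step()

a, b = torch.zeros(1, 1), torch.ones(1, 1)
print((F(b) - F(a)).item())  # approaches 1.0, the integral of 3x^2 on [0, 1]
```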
Orders-of-magnitude speedup in atmospheric chemistry modeling through neural network-based emulation
Chemical transport models (CTMs), which simulate air pollution transport,
transformation, and removal, are computationally expensive, largely because of
the computational intensity of the chemical mechanisms: systems of coupled
differential equations representing atmospheric chemistry. Here we investigate
the potential for machine learning to reproduce the behavior of a chemical
mechanism, yet with reduced computational expense. We create a 17-layer
residual multi-target regression neural network to emulate the Carbon Bond
Mechanism Z (CBM-Z) gas-phase chemical mechanism. We train the network to match
CBM-Z predictions of changes in concentrations of 77 chemical species after one
hour, given a range of chemical and meteorological input conditions, which it
is able to do with root-mean-square error (RMSE) of less than 1.97 ppb (median
RMSE = 0.02 ppb), while achieving a 250x computational speedup. An additional
17x speedup (total 4250x speedup) is achieved by running the neural network on
a graphics-processing unit (GPU). The neural network is able to reproduce the
emergent behavior of the chemical system over diurnal cycles using Euler
integration, but additional work is needed to constrain the propagation of
errors as simulation time progresses.
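The emulator's overall shape can be sketched as a deep residual multi-target regression network mapping input conditions to one-hour concentration changes for 77 species. The depth and output count follow the abstract; the width and block design are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, width):
        super().__init__()
        self.fc = nn.Linear(width, width)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.act(self.fc(x))  # skip connection per layer

def make_emulator(n_inputs, n_species=77, width=256, depth=17):
    layers = [nn.Linear(n_inputs, width)]
    layers += [ResidualBlock(width) for _ in range(depth)]
    layers += [nn.Linear(width, n_species)]  # predicted 1-hour deltas
    return nn.Sequential(*layers)

model = make_emulator(n_inputs=10)
print(model(torch.randn(4, 10)).shape)  # torch.Size([4, 77])
```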
Pricing options and computing implied volatilities using neural networks
This paper proposes a data-driven approach, by means of an Artificial Neural
Network (ANN), to value financial options and to calculate implied volatilities
with the aim of accelerating the corresponding numerical methods. With ANNs
being universal function approximators, this method trains an optimized ANN on
a data set generated by a sophisticated financial model, and runs the trained
ANN as an agent of the original solver in a fast and efficient way. We test
this approach on three different types of solvers, including the analytic
solution for the Black-Scholes equation, the COS method for the Heston
stochastic volatility model and Brent's iterative root-finding method for the
calculation of implied volatilities. The numerical results show that the ANN
solver can reduce the computing time significantly.
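The data-generation half of this approach is easy to make concrete: label randomly sampled inputs with the analytic Black-Scholes price and fit a regression ANN to the resulting map. The sketch below does the labeling step; the parameter ranges are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def black_scholes_call(S, K, T, r, sigma):
    # analytic price of a European call under Black-Scholes
    d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

rng = np.random.default_rng(0)
n = 10_000
X = np.column_stack([
    rng.uniform(0.5, 1.5, n),   # moneyness S/K (strike normalized to 1)
    rng.uniform(0.05, 2.0, n),  # maturity T
    rng.uniform(0.0, 0.1, n),   # rate r
    rng.uniform(0.05, 0.5, n),  # volatility sigma
])
y = black_scholes_call(X[:, 0], 1.0, X[:, 1], X[:, 2], X[:, 3])
# X, y now form the training set for a standard regression ANN,
# which then stands in for the solver at inference time.
```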
ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs
Residual neural networks can be viewed as the forward Euler discretization of
an Ordinary Differential Equation (ODE) with a unit time step. This has
recently motivated researchers to explore other discretization approaches and
train ODE based networks. However, an important challenge of neural ODEs is
their prohibitive memory cost during gradient backpropagation. Recently, a
method proposed in [8] claimed that this memory overhead can be reduced from
O(L N_t), where N_t is the number of time steps, down to O(L), where L is the
depth of the network, by solving the forward ODE backwards in time. However, we will
show that this approach may lead to several problems: (i) it may be numerically
unstable for ReLU/non-ReLU activations and general convolution operators, and
(ii) the proposed optimize-then-discretize approach may lead to divergent
training due to inconsistent gradients for small time step sizes. We discuss
the underlying problems, and to address them we propose ANODE, an Adjoint-based
Neural ODE framework which avoids the numerical instability problems
noted above and provides unconditionally accurate gradients. ANODE has a
memory footprint of O(L) + O(N_t), with the same computational cost as
the reverse ODE solve. We furthermore discuss a memory-efficient algorithm which
can further reduce this footprint at the cost of additional computation.
We show results on the CIFAR-10/100 datasets using ResNet and SqueezeNext
neural networks.
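A hedged sketch of the checkpointing idea behind the O(L) + O(N_t) footprint: store only the state at each time step during the forward pass, then rebuild the computation graph one step at a time in the backward pass and chain the adjoint through it (discretize-then-optimize). The dynamics function and step counts are illustrative assumptions.

```python
import torch

def f(z, theta):
    return torch.tanh(theta * z)  # stand-in for a network layer

def forward_checkpointed(z0, theta, n_t, h=0.1):
    zs = [z0]
    with torch.no_grad():         # no graph kept: only O(N_t) states stored
        for _ in range(n_t):
            zs.append(zs[-1] + h * f(zs[-1], theta))
    return zs

def backward_pass(zs, theta, grad_out, h=0.1):
    grad_theta = torch.zeros_like(theta)
    a = grad_out                  # adjoint dL/dz at the final time
    for z in reversed(zs[:-1]):
        z = z.detach().requires_grad_(True)
        z_next = z + h * f(z, theta)   # recompute one step with a graph
        gz, gt = torch.autograd.grad(z_next, (z, theta), grad_outputs=a)
        a, grad_theta = gz, grad_theta + gt
    return a, grad_theta          # gradients w.r.t. z0 and theta

theta = torch.tensor(0.5, requires_grad=True)
zs = forward_checkpointed(torch.tensor([1.0]), theta, n_t=10)
print(backward_pass(zs, theta, torch.ones(1)))
```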
Neural Network Gradient Hamiltonian Monte Carlo
Hamiltonian Monte Carlo is a widely used algorithm for sampling from
posterior distributions of complex Bayesian models. It can efficiently explore
high-dimensional parameter spaces guided by simulated Hamiltonian flows.
However, the algorithm requires repeated gradient calculations, and these
computations become increasingly burdensome as data sets scale. We present a
method to substantially reduce the computation burden by using a neural network
to approximate the gradient. First, we prove that the proposed method still
maintains convergence to the true distribution even though the approximated
gradient no longer comes from a Hamiltonian system. Second, we conduct
experiments on synthetic examples and real data sets to validate the proposed method.
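A minimal sketch of the mechanism: standard HMC leapfrog dynamics with the exact gradient replaced by a cheap surrogate (a stand-in for the trained network), while the accept/reject step still evaluates the exact log-posterior, which is one way the correct target can be preserved. The target, surrogate, and tuning below are illustrative assumptions.

```python
import numpy as np

def log_post(q):
    return -0.5 * np.sum(q ** 2)  # toy target: standard normal

def surrogate_grad(q):
    return -q                     # stand-in for the neural network gradient

def hmc_step(q, eps=0.1, L=20, rng=np.random.default_rng(0)):
    p = rng.normal(size=q.shape)
    q_new, p_new = q.copy(), p.copy()
    p_new += 0.5 * eps * surrogate_grad(q_new)     # leapfrog with surrogate
    for _ in range(L - 1):
        q_new += eps * p_new
        p_new += eps * surrogate_grad(q_new)
    q_new += eps * p_new
    p_new += 0.5 * eps * surrogate_grad(q_new)
    # accept/reject with the exact log-posterior
    log_alpha = (log_post(q_new) - 0.5 * p_new @ p_new) \
              - (log_post(q) - 0.5 * p @ p)
    return q_new if np.log(rng.uniform()) < log_alpha else q

print(hmc_step(np.array([1.0, -0.5])))
```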