Parareal with a physics-informed neural network as coarse propagator
Parallel-in-time algorithms provide an additional layer of concurrency for
the numerical integration of models based on time-dependent differential
equations. Methods like Parareal, which parallelize across multiple time steps,
rely on a computationally cheap and coarse integrator to propagate information
forward in time, while a parallelizable expensive fine propagator provides
accuracy. Typically, the coarse method is a numerical integrator using lower
resolution, reduced order or a simplified model. Our paper proposes to use a
physics-informed neural network (PINN) instead. We demonstrate for the
Black-Scholes equation, a partial differential equation from computational
finance, that Parareal with a PINN coarse propagator provides better speedup
than a numerical coarse propagator. Training and evaluating a neural network
are both tasks whose computing patterns are well suited for GPUs. By contrast,
mesh-based algorithms with their low computational intensity struggle to
perform well. We show that moving the coarse propagator PINN to a GPU while
running the numerical fine propagator on the CPU further improves Parareal's
single-node performance. This suggests that integrating machine learning
techniques into parallel-in-time integration methods and exploiting their
differences in computing patterns might offer a way to better utilize
heterogeneous architectures.
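As a concrete reference point for the setup this abstract describes, here is a minimal Parareal sketch for a scalar test ODE. The coarse propagator G is a single Euler step standing in for the paper's PINN; the test problem, fine propagator, and step counts are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def f(u):
    return -u  # illustrative test problem du/dt = -u

def G(u, dt):
    # coarse propagator: one explicit Euler step
    # (the paper replaces this with a PINN evaluation)
    return u + dt * f(u)

def F(u, dt, m=100):
    # fine propagator: m smaller Euler steps
    h = dt / m
    for _ in range(m):
        u = u + h * f(u)
    return u

def parareal(u0, T, n_slices, n_iters):
    dt = T / n_slices
    U = np.zeros(n_slices + 1)
    U[0] = u0
    for n in range(n_slices):          # initial guess from coarse sweep
        U[n + 1] = G(U[n], dt)
    for _ in range(n_iters):
        Fk = [F(U[n], dt) for n in range(n_slices)]  # parallelizable
        Gk = [G(U[n], dt) for n in range(n_slices)]
        for n in range(n_slices):                    # serial correction
            U[n + 1] = G(U[n], dt) + Fk[n] - Gk[n]
    return U

print(parareal(1.0, 1.0, n_slices=10, n_iters=3)[-1], np.exp(-1.0))
```

The serial correction line is the standard Parareal update U_{k+1}^{n+1} = G(U_{k+1}^n) + F(U_k^n) - G(U_k^n); only G changes when a PINN is swapped in.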
Learning data-driven discretizations for partial differential equations
The numerical solution of partial differential equations (PDEs) is
challenging because of the need to resolve spatiotemporal features over wide
length and timescales. Often, it is computationally intractable to resolve the
finest features in the solution. The only recourse is to use approximate
coarse-grained representations, which aim to accurately represent
long-wavelength dynamics while properly accounting for unresolved small scale
physics. Deriving such coarse-grained equations is notoriously difficult, and
often ad hoc. Here we introduce data-driven discretization, a
method for learning optimized approximations to PDEs based on actual solutions
to the known underlying equations. Our approach uses neural networks to
estimate spatial derivatives, which are optimized end-to-end to best satisfy
the equations on a low resolution grid. The resulting numerical methods are
remarkably accurate, allowing us to integrate in time a collection of nonlinear
equations in one spatial dimension at resolutions 4-8x coarser than is possible
with standard finite difference methods.
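A hedged sketch of the central idea, learned stencil coefficients for spatial derivatives, written in PyTorch. The stencil width, layer sizes, and the training target mentioned in the comments are illustrative assumptions rather than the authors' exact architecture.

```python
import torch
import torch.nn as nn

class LearnedDerivative(nn.Module):
    """Maps a local window of coarse-grid solution values to stencil
    coefficients, then combines them linearly to estimate a derivative."""
    def __init__(self, stencil=5, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(stencil, hidden), nn.ReLU(),
            nn.Linear(hidden, stencil),  # one coefficient per stencil point
        )

    def forward(self, patches, dx):
        # patches: (batch, stencil) windows of the coarse solution
        coeffs = self.net(patches)
        return (coeffs * patches).sum(dim=-1) / dx

model = LearnedDerivative()
patches = torch.randn(8, 5)
print(model(patches, dx=0.1).shape)  # torch.Size([8])

# Training would minimize the residual of the known PDE on the coarse
# grid (end-to-end through the time integrator), e.g. for Burgers'
# equation u_t + u u_x = nu * u_xx.
```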
Stable Architectures for Deep Neural Networks
Deep neural networks have become invaluable tools for supervised machine
learning, e.g., classification of text or images. While often offering superior
results over traditional techniques and successfully expressing complicated
patterns in data, deep architectures are known to be challenging to design and
train such that they generalize well to new data. Important issues with deep
architectures are numerical instabilities in derivative-based learning
algorithms commonly called exploding or vanishing gradients. In this paper we
propose new forward propagation techniques inspired by systems of Ordinary
Differential Equations (ODE) that overcome this challenge and lead to
well-posed learning problems for arbitrarily deep networks.
The backbone of our approach is our interpretation of deep learning as a
parameter estimation problem of nonlinear dynamical systems. Given this
formulation, we analyze stability and well-posedness of deep learning and use
this new understanding to develop new network architectures. We relate the
exploding and vanishing gradient phenomenon to the stability of the discrete
ODE and present several strategies for stabilizing deep learning for very deep
networks. While our new architectures restrict the solution space, several
numerical experiments show their competitiveness with state-of-the-art
networks.
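One stabilization strategy from this line of work can be sketched as a residual layer whose effective weight matrix is forced to be antisymmetric, so the underlying ODE has (nearly) purely imaginary eigenvalues and states neither explode nor decay. The step size, damping term, and dimensions below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AntisymmetricLayer(nn.Module):
    def __init__(self, dim, h=0.1, gamma=1e-2):
        super().__init__()
        self.K = nn.Parameter(torch.randn(dim, dim) * 0.1)
        self.b = nn.Parameter(torch.zeros(dim))
        self.h, self.gamma = h, gamma

    def forward(self, y):
        # K - K^T is antisymmetric; -gamma*I adds slight damping
        W = self.K - self.K.T - self.gamma * torch.eye(self.K.shape[0])
        return y + self.h * torch.tanh(y @ W.T + self.b)  # forward Euler step

layer = AntisymmetricLayer(dim=4)
print(layer(torch.randn(2, 4)).shape)  # torch.Size([2, 4])
```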
Learning in Modal Space: Solving Time-Dependent Stochastic PDEs Using Physics-Informed Neural Networks
One of the open problems in scientific computing is the long-time integration
of nonlinear stochastic partial differential equations (SPDEs). We address this
problem by taking advantage of recent advances in scientific machine learning
and the dynamically orthogonal (DO) and bi-orthogonal (BO) methods for
representing stochastic processes. Specifically, we propose two new
Physics-Informed Neural Networks (PINNs) for solving time-dependent SPDEs,
namely the NN-DO/BO methods, which incorporate the DO/BO constraints into the
loss function with an implicit form instead of generating explicit expressions
for the temporal derivatives of the DO/BO modes. Hence, the proposed methods
overcome some of the drawbacks of the original DO/BO methods: we do not need
the assumption that the covariance matrix of the random coefficients is
invertible as in the original DO method, and we can remove the assumption of no
eigenvalue crossing as in the original BO method. Moreover, the NN-DO/BO
methods can be used to solve time-dependent stochastic inverse problems with
the same formulation and computational complexity as for forward problems. We
demonstrate the capability of the proposed methods via several numerical
examples: (1) a linear stochastic advection equation with a deterministic initial
condition, where the original DO/BO methods would fail; (2) long-time integration
of the stochastic Burgers' equation with many eigenvalue crossings during the
whole time evolution, where the original BO method fails; and (3) a nonlinear
reaction-diffusion equation, for which we consider both the forward and the inverse problem,
including noisy initial data, to investigate the flexibility of the NN-DO/BO
methods in handling inverse and mixed type problems. Taken together, these
simulation results demonstrate that the NN-DO/BO methods can be employed to
effectively quantify uncertainty propagation in a wide range of physical
problems.
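To make the "constraints in the loss, in implicit form" idea concrete, here is a hedged toy in PyTorch: an orthonormality penalty on network-predicted modes that could be added to a PINN loss instead of deriving explicit evolution equations for the modes. The tensor shapes and weighting are illustrative and are not the NN-DO/BO formulation itself.

```python
import torch

def orthonormality_penalty(modes, dx):
    # modes: (n_modes, n_grid) spatial basis predicted by the network
    gram = modes @ modes.T * dx       # discrete inner products <u_i, u_j>
    eye = torch.eye(modes.shape[0])
    return ((gram - eye) ** 2).sum()  # implicit orthonormality constraint

modes = torch.randn(3, 64, requires_grad=True)
print(orthonormality_penalty(modes, dx=1.0 / 64))  # added to the PINN loss
```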
Lie Transform-based Neural Networks for Dynamics Simulation and Learning
In the article, we discuss the architecture of the polynomial neural network
that corresponds to the matrix representation of Lie transform. The matrix form
of Lie transform is an approximation of the general solution of the nonlinear
system of ordinary differential equations. The proposed architecture can be
trained with small data sets, extrapolate predictions outside the training
data, and provide a possibility for interpretation. We provide a theoretical
explanation of the proposed architecture, as well as demonstrate it in several
applications. We present the results of modeling and identification for both
simple and well-known dynamical systems, and more complicated examples from
price dynamics, chemistry, and accelerator physics. From a practical point of
view, we describe the training of a Lie transform-based neural network with a
small data set containing only 10 data points. We also demonstrate an
interpretation of the fitted neural network by converting it to a system of
differential equations.
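A minimal sketch of a second-order polynomial (matrix Lie map) layer of the kind the abstract describes: the next state is a learned linear combination of the state and its quadratic monomials, x_{k+1} = W1 x_k + W2 (x_k ⊗ x_k). The truncation order, initialization, and dimensions are illustrative assumptions.

```python
import numpy as np

def monomials2(x):
    # [x, kron(x, x)]: the state and its second-order monomials
    return np.concatenate([x, np.kron(x, x)])

class LieMapLayer:
    def __init__(self, dim, rng=np.random.default_rng(0)):
        n_feat = dim + dim * dim
        self.W = rng.normal(scale=0.01, size=(dim, n_feat))
        self.W[:, :dim] += np.eye(dim)  # start near the identity map

    def __call__(self, x):
        return self.W @ monomials2(x)

layer = LieMapLayer(dim=2)
print(layer(np.array([0.1, -0.2])))  # one step; training fits W to trajectories
```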
Solution of Definite Integrals using Functional Link Artificial Neural Networks
This paper discusses a new method to solve definite integrals using
artificial neural networks. The objective is to build a neural network that
serves as a novel alternative to established numerical methods and, with the
help of a learning algorithm, solves definite integrals by
minimising a well-constructed error function. Compared with existing numerical
methods, the proposed algorithm is effective and precise, and it is well suited
for applications that require the integration of higher-order polynomials. The
observations are recorded and illustrated in tabular and graphical form.
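One common way to realize "definite integrals by minimising an error function" is to fit a network F so that its derivative matches the integrand, then read off F(b) - F(a). The sketch below takes that route in PyTorch as a hedged stand-in; it is not the paper's functional-link architecture.

```python
import torch

f = lambda x: 3 * x ** 2          # integrand; exact antiderivative is x^3
F = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                        torch.nn.Linear(32, 1))
opt = torch.optim.Adam(F.parameters(), lr=1e-2)

for _ in range(2000):
    x = torch.rand(128, 1, requires_grad=True)           # sample [0, 1]
    dF = torch.autograd.grad(F(x).sum(), x, create_graph=True)[0]
    loss = ((dF - f(x)) ** 2).mean()                     # match F' to f
    opt.zero_grad(); loss.backward(); opt.step()

a, b = torch.zeros(1, 1), torch.ones(1, 1)
print((F(b) - F(a)).item())  # approaches 1.0, the integral of 3x^2 on [0, 1]
```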
Orders-of-magnitude speedup in atmospheric chemistry modeling through neural network-based emulation
Chemical transport models (CTMs), which simulate air pollution transport,
transformation, and removal, are computationally expensive, largely because of
the computational intensity of the chemical mechanisms: systems of coupled
differential equations representing atmospheric chemistry. Here we investigate
the potential for machine learning to reproduce the behavior of a chemical
mechanism, yet with reduced computational expense. We create a 17-layer
residual multi-target regression neural network to emulate the Carbon Bond
Mechanism Z (CBM-Z) gas-phase chemical mechanism. We train the network to match
CBM-Z predictions of changes in concentrations of 77 chemical species after one
hour, given a range of chemical and meteorological input conditions, which it
is able to do with root-mean-square error (RMSE) of less than 1.97 ppb (median
RMSE = 0.02 ppb), while achieving a 250x computational speedup. An additional
17x speedup (total 4250x speedup) is achieved by running the neural network on
a graphics-processing unit (GPU). The neural network is able to reproduce the
emergent behavior of the chemical system over diurnal cycles using Euler
integration, but additional work is needed to constrain the propagation of
errors as simulation time progresses.
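The emulator's overall shape can be sketched as a deep residual multi-target regression network mapping input conditions to one-hour concentration changes for 77 species. The depth and output count follow the abstract; the width and block design are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, width):
        super().__init__()
        self.fc = nn.Linear(width, width)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.act(self.fc(x))  # skip connection per layer

def make_emulator(n_inputs, n_species=77, width=256, depth=17):
    layers = [nn.Linear(n_inputs, width)]
    layers += [ResidualBlock(width) for _ in range(depth)]
    layers += [nn.Linear(width, n_species)]  # predicted 1-hour deltas
    return nn.Sequential(*layers)

model = make_emulator(n_inputs=10)
print(model(torch.randn(4, 10)).shape)  # torch.Size([4, 77])
```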
Pricing options and computing implied volatilities using neural networks
This paper proposes a data-driven approach, by means of an Artificial Neural
Network (ANN), to value financial options and to calculate implied volatilities
with the aim of accelerating the corresponding numerical methods. With ANNs
being universal function approximators, this method trains an optimized ANN on
a data set generated by a sophisticated financial model, and runs the trained
ANN as an agent of the original solver in a fast and efficient way. We test
this approach on three different types of solvers, including the analytic
solution for the Black-Scholes equation, the COS method for the Heston
stochastic volatility model and Brent's iterative root-finding method for the
calculation of implied volatilities. The numerical results show that the ANN
solver can reduce the computing time significantly.
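The data-generation half of this approach is easy to make concrete: label randomly sampled inputs with the analytic Black-Scholes price and fit a regression ANN to the resulting map. The sketch below does the labeling step; the parameter ranges are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def black_scholes_call(S, K, T, r, sigma):
    # analytic price of a European call under Black-Scholes
    d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

rng = np.random.default_rng(0)
n = 10_000
X = np.column_stack([
    rng.uniform(0.5, 1.5, n),   # moneyness S/K (strike normalized to 1)
    rng.uniform(0.05, 2.0, n),  # maturity T
    rng.uniform(0.0, 0.1, n),   # rate r
    rng.uniform(0.05, 0.5, n),  # volatility sigma
])
y = black_scholes_call(X[:, 0], 1.0, X[:, 1], X[:, 2], X[:, 3])
# X, y now form the training set for a standard regression ANN,
# which then stands in for the solver at inference time.
```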
ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs
Residual neural networks can be viewed as the forward Euler discretization of
an Ordinary Differential Equation (ODE) with a unit time step. This has
recently motivated researchers to explore other discretization approaches and
train ODE based networks. However, an important challenge of neural ODEs is
their prohibitive memory cost during gradient backpropagation. Recently, a
method proposed in [8] claimed that this memory overhead can be reduced from
O(L N_t), where N_t is the number of time steps, down to O(L), where L is the
depth of the network, by solving the forward ODE backwards in time. However, we will
show that this approach may lead to several problems: (i) it may be numerically
unstable for ReLU/non-ReLU activations and general convolution operators, and
(ii) the proposed optimize-then-discretize approach may lead to divergent
training due to inconsistent gradients for small time step sizes. We discuss
the underlying problems, and to address them we propose ANODE, an Adjoint-based
Neural ODE framework which avoids the numerical instability problems
noted above and provides unconditionally accurate gradients. ANODE has a
memory footprint of O(L) + O(N_t), with the same computational cost as
the reverse ODE solve. We furthermore discuss a memory-efficient algorithm which
can further reduce this footprint at the cost of additional computation.
We show results on the CIFAR-10/100 datasets using ResNet and SqueezeNext
neural networks.
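A hedged sketch of the checkpointing idea behind the O(L) + O(N_t) footprint: store only the state at each time step during the forward pass, then rebuild the computation graph one step at a time in the backward pass and chain the adjoint through it (discretize-then-optimize). The dynamics function and step counts are illustrative assumptions.

```python
import torch

def f(z, theta):
    return torch.tanh(theta * z)  # stand-in for a network layer

def forward_checkpointed(z0, theta, n_t, h=0.1):
    zs = [z0]
    with torch.no_grad():         # no graph kept: only O(N_t) states stored
        for _ in range(n_t):
            zs.append(zs[-1] + h * f(zs[-1], theta))
    return zs

def backward_pass(zs, theta, grad_out, h=0.1):
    grad_theta = torch.zeros_like(theta)
    a = grad_out                  # adjoint dL/dz at the final time
    for z in reversed(zs[:-1]):
        z = z.detach().requires_grad_(True)
        z_next = z + h * f(z, theta)   # recompute one step with a graph
        gz, gt = torch.autograd.grad(z_next, (z, theta), grad_outputs=a)
        a, grad_theta = gz, grad_theta + gt
    return a, grad_theta          # gradients w.r.t. z0 and theta

theta = torch.tensor(0.5, requires_grad=True)
zs = forward_checkpointed(torch.tensor([1.0]), theta, n_t=10)
print(backward_pass(zs, theta, torch.ones(1)))
```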
Neural Network Gradient Hamiltonian Monte Carlo
Hamiltonian Monte Carlo is a widely used algorithm for sampling from
posterior distributions of complex Bayesian models. It can efficiently explore
high-dimensional parameter spaces guided by simulated Hamiltonian flows.
However, the algorithm requires repeated gradient calculations, and these
computations become increasingly burdensome as data sets scale. We present a
method to substantially reduce the computation burden by using a neural network
to approximate the gradient. First, we prove that the proposed method still
maintains convergence to the true distribution even though the approximated
gradient no longer comes from a Hamiltonian system. Second, we conduct
experiments on synthetic examples and real data sets to validate the proposed method.
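A minimal sketch of the mechanism: standard HMC leapfrog dynamics with the exact gradient replaced by a cheap surrogate (a stand-in for the trained network), while the accept/reject step still evaluates the exact log-posterior, which is one way the correct target can be preserved. The target, surrogate, and tuning below are illustrative assumptions.

```python
import numpy as np

def log_post(q):
    return -0.5 * np.sum(q ** 2)  # toy target: standard normal

def surrogate_grad(q):
    return -q                     # stand-in for the neural network gradient

def hmc_step(q, eps=0.1, L=20, rng=np.random.default_rng(0)):
    p = rng.normal(size=q.shape)
    q_new, p_new = q.copy(), p.copy()
    p_new += 0.5 * eps * surrogate_grad(q_new)     # leapfrog with surrogate
    for _ in range(L - 1):
        q_new += eps * p_new
        p_new += eps * surrogate_grad(q_new)
    q_new += eps * p_new
    p_new += 0.5 * eps * surrogate_grad(q_new)
    # accept/reject with the exact log-posterior
    log_alpha = (log_post(q_new) - 0.5 * p_new @ p_new) \
              - (log_post(q) - 0.5 * p @ p)
    return q_new if np.log(rng.uniform()) < log_alpha else q

print(hmc_step(np.array([1.0, -0.5])))
```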