284 research outputs found

A Combined MPI-CUDA Parallel Solution of Linear and Nonlinear Poisson-Boltzmann Equation

Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

Optimal convolution SOR acceleration of waveform relaxation with application to semiconductor device simulation

Author: Reichelt Mark

In this paper we describe a novel generalized SOR (successive overrelaxation) algorithm for accelerating the convergence of the dynamic iteration method known as waveform relaxation. A new convolution SOR algorithm is presented, along with a theorem for determining the optimal convolution SOR parameter. Both analytic and experimental results are given to demonstrate that the convergence of the convolution SOR algorithm is substantially faster than that of the more obvious frequency-independent waveform SOR algorithm. Finally, to demonstrate the general applicability of this new method, it is used to solve the differential-algebraic system generated by spatial discretization of the time-dependent semiconductor device equations.
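For readers unfamiliar with the base technique named in this abstract, the following is a minimal sketch of classical (algebraic) SOR applied to a small linear system. It is not the convolution SOR or waveform relaxation algorithm of the paper; the function name `sor_solve`, the test matrix, and the relaxation parameter value are all illustrative assumptions.

```python
def sor_solve(a, b, omega=1.5, tol=1e-10, max_iter=10_000):
    """Classical SOR for a x = b (hypothetical helper, not from the paper).

    Each sweep blends the previous value with the Gauss-Seidel update:
    x_i <- (1 - omega) * x_i + omega * (b_i - sum_{j != i} a_ij x_j) / a_ii
    """
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(n):
            # Uses already-updated entries x[j] for j < i (Gauss-Seidel ordering).
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gs_value = (b[i] - sigma) / a[i][i]
            new_value = (1.0 - omega) * x[i] + omega * gs_value
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            return x, iteration
    return x, max_iter


# Assumed test problem: symmetric positive definite, exact solution [1, 1, 1].
a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
solution, iterations = sor_solve(a, b, omega=1.1)
```

Setting `omega=1.0` reduces the sweep to plain Gauss-Seidel; for symmetric positive definite matrices the iteration converges for any `omega` strictly between 0 and 2.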

NASA Technical Reports Server

Distributing the Kalman Filter for Large-Scale Systems

Author: Khan Usman A.
Moura Jose M. F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/02/2008

This paper derives a distributed Kalman filter to estimate a sparsely connected, large-scale, $n$-dimensional dynamical system monitored by a network of $N$ sensors. Local Kalman filters are implemented on the ($n_l$-dimensional, where $n_l \ll n$) sub-systems that are obtained after spatially decomposing the large-scale system. The resulting sub-systems overlap, which along with an assimilation procedure on the local Kalman filters, preserve an $L$th order Gauss-Markovian structure of the centralized error processes. The information loss due to the $L$th order Gauss-Markovian approximation is controllable as it can be characterized by a divergence that decreases as $L \uparrow$. The order of the approximation, $L$, leads to a lower bound on the dimension of the sub-systems, hence, providing a criterion for sub-system selection. The assimilation procedure is carried out on the local error covariances with a distributed iterate collapse inversion (DICI) algorithm that we introduce. The DICI algorithm computes the (approximated) centralized Riccati and Lyapunov equations iteratively with only local communication and low-order computation. We fuse the observations that are common among the local Kalman filters using bipartite fusion graphs and consensus averaging algorithms. The proposed algorithm achieves full distribution of the Kalman filter that is coherent with the centralized Kalman filter with an $L$th order Gaussian-Markovian structure on the centralized error processes. Nowhere storage, communication, or computation of $n$-dimensional vectors and matrices is needed; only $n_l \ll n$ dimensional vectors and matrices are communicated or used in the computation at the sensors.
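The abstract mentions consensus averaging as the tool for fusing shared observations. As a rough illustration of that generic building block only (not the paper's DICI algorithm or its bipartite fusion graphs), the sketch below runs a standard linear consensus iteration on a small sensor graph. The function name `consensus_average`, the path-graph topology, the step size, and the sensor readings are all assumptions made for the example.

```python
def consensus_average(values, neighbors, step_size=0.3, n_rounds=200):
    """Generic linear consensus iteration (hypothetical helper).

    Each node repeatedly moves toward its neighbors' values:
    x_i <- x_i + step_size * sum_{j in N(i)} (x_j - x_i)
    Only neighbor-to-neighbor exchanges are used; no node sees the full vector.
    """
    x = list(values)
    for _ in range(n_rounds):
        # Build the next state from the current one (synchronous update).
        x = [
            x[i] + step_size * sum(x[j] - x[i] for j in neighbors[i])
            for i in range(len(x))
        ]
    return x


# Assumed topology: four sensors connected in a line, 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
readings = [1.0, 2.0, 3.0, 6.0]  # illustrative local measurements, mean 3.0
fused = consensus_average(readings, neighbors)
```

On a connected undirected graph, a step size below the reciprocal of the maximum node degree is a common sufficient choice for every node to converge to the network-wide mean.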

arXiv.org e-Print Archive

Crossref

Improving Performance of Iterative Methods by Lossy Checkponting

Author: Acosta J. Mora
Agullo E.
Balay S.
Barrett R.
Barrett R.
Bode B.
Calhoun J.
Heath M. T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/05/2018

Iterative methods are commonly used approaches to solve large, sparse linear systems, which are fundamental operations for many modern scientific simulations. When the large-scale iterative methods are running with a large number of ranks in parallel, they have to checkpoint the dynamic variables periodically in case of unavoidable fail-stop errors, requiring fast I/O systems and large storage space. To this end, significantly reducing the checkpointing overhead is critical to improving the overall performance of iterative methods. Our contribution is fourfold. (1) We propose a novel lossy checkpointing scheme that can significantly improve the checkpointing performance of iterative methods by leveraging lossy compressors. (2) We formulate a lossy checkpointing performance model and derive theoretically an upper bound for the extra number of iterations caused by the distortion of data in lossy checkpoints, in order to guarantee the performance improvement under the lossy checkpointing scheme. (3) We analyze the impact of lossy checkpointing (i.e., extra number of iterations caused by lossy checkpointing files) for multiple types of iterative methods. (4) We evaluate the lossy checkpointing scheme with optimal checkpointing intervals on a high-performance computing environment with 2,048 cores, using a well-known scientific computation package PETSc and a state-of-the-art checkpoint/restart toolkit. Experiments show that our optimized lossy checkpointing scheme can significantly reduce the fault tolerance overhead for iterative methods by 23%~70% compared with traditional checkpointing and 20%~58% compared with lossless-compressed checkpointing, in the presence of system failures. Comment: 14 pages, 10 figures, HPDC'1

arXiv.org e-Print Archive

Crossref

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

A self-gravity module for the PLUTO code

Author: Mandal Ankush
Mignone Andrea
Mukherjee Dipanjan
Publication date: 01/01/2023

We present a novel implementation of an iterative solver for the solution of the Poisson equation in the PLUTO code for astrophysical fluid dynamics. Our solver relies on a relaxation method in which convergence is sought as the steady-state solution of a parabolic equation, whose time-discretization is governed by the Runge-Kutta-Legendre (RKL) method. Our findings indicate that the RKL-based Poisson solver, which is both fully parallel and rapidly convergent, has the potential to serve as a practical alternative to conventional iterative solvers such as the Gauss-Seidel (GS) and successive over-relaxation (SOR) methods. Additionally, it can mitigate some of the drawbacks of these traditional techniques. We incorporate our algorithm into a multigrid solver to provide a simple and efficient gravity solver that can be used to obtain the gravitational potentials in self-gravitational hydrodynamics. We test our implementation against a broad range of standard self-gravitating astrophysical problems designed to examine different aspects of the code. We demonstrate that the results match excellently with the analytical predictions (when available), and the findings of similar previous studies. Comment: Submitted to ApJS. Comments are welcom
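The idea of reaching the Poisson solution as the steady state of a parabolic equation can be sketched in one dimension. The example below marches d(phi)/d(tau) = phi'' - f in pseudo-time with plain explicit Euler steps, which is a deliberately simpler stand-in for the Runge-Kutta-Legendre scheme named in the abstract. The function name `relax_poisson_1d`, the grid size, the step-size factor, and the test source term are assumptions chosen for illustration.

```python
def relax_poisson_1d(f, n_points=21, tol=1e-12, max_steps=50_000):
    """Solve phi'' = f on [0, 1] with phi(0) = phi(1) = 0 by pseudo-time relaxation.

    Explicit Euler in pseudo-time (NOT the RKL scheme of the paper):
    phi_i <- phi_i + dtau * ((phi_{i-1} - 2 phi_i + phi_{i+1}) / h^2 - f_i)
    """
    h = 1.0 / (n_points - 1)
    dtau = 0.4 * h * h  # below the explicit stability limit h^2 / 2
    phi = [0.0] * n_points  # boundary entries stay at zero
    for step in range(1, max_steps + 1):
        new_phi = phi[:]
        max_update = 0.0
        for i in range(1, n_points - 1):
            laplacian = (phi[i - 1] - 2.0 * phi[i] + phi[i + 1]) / (h * h)
            update = dtau * (laplacian - f(i * h))
            new_phi[i] = phi[i] + update
            max_update = max(max_update, abs(update))
        phi = new_phi
        if max_update < tol:  # steady state reached to within tol
            return phi, step
    return phi, max_steps


# Assumed test case: f = -2 has the exact solution phi(x) = x * (1 - x).
phi, steps = relax_poisson_1d(lambda x: -2.0)
max_error = max(
    abs(phi[i] - (i / 20) * (1.0 - i / 20)) for i in range(len(phi))
)
```

The restrictive step size `dtau` is the reason plain explicit relaxation needs many steps; super-time-stepping schemes such as RKL are designed to take much larger stable steps per cycle.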

arXiv.org e-Print Archive

Directory of Open Access Journals

Parallel realization of the finite difference method solution of the Poisson-Boltzmann equation

Author: Ayrjan Edik A.
Hayryan Shura
Hu Chin-Kun
Pokorný Imrich
Puzynin Igor V.
Publication venue: 'Masaryk University Press'
Publication date: 01/01/2001

Institute of Mathematics AS CR, v. v. i.

An Objective Analysis Technique for Constructing Three-Dimensional Urban-Scale Wind Fields

Author: Goodin William R.
McRae Gregory J.
Seinfeld John H.
Publication venue: 'American Meteorological Society'
Publication date: 01/01/1980

An objective analysis procedure for generating mass-consistent, urban-scale three-dimensional wind fields is presented together with a comparison against existing techniques. The algorithm employs terrain-following coordinates and variable vertical grid spacing. Initial estimates of the velocity field are developed by interpolating surface and upper-level wind measurements. A local terrain adjustment technique, involving solution of the Poisson equation, is used to establish the horizontal components of the surface field. Vertical velocities are developed from successive solutions of the continuity equation followed by an iterative procedure which reduces anomalous divergence in the complete field. Major advantages of the procedure are that it is computationally efficient and allows boundary values to adjust in response to changes in the interior flow. The method has been successfully tested using field measurements and problems with known analytic solutions.

Caltech Authors

Online Research Database In Technology