Optimal convolution SOR acceleration of waveform relaxation with application to semiconductor device simulation
In this paper we describe a novel generalized SOR (successive overrelaxation) algorithm for accelerating the convergence of the dynamic iteration method known as waveform relaxation. A new convolution SOR algorithm is presented, along with a theorem for determining the optimal convolution SOR parameter. Both analytic and experimental results are given to demonstrate that the convergence of the convolution SOR algorithm is substantially faster than that of the more obvious frequency-independent waveform SOR algorithm. Finally, to demonstrate the general applicability of this new method, it is used to solve the differential-algebraic system generated by spatial discretization of the time-dependent semiconductor device equations.
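For orientation, the classical SOR iteration that the convolution variant generalizes can be sketched as follows. This is only the textbook algebraic SOR for a linear system, not the waveform or convolution SOR of the paper; the function name `sor_solve`, the relaxation parameter value, and the test matrix are assumptions chosen for illustration.

```python
def sor_solve(A, b, omega=1.2, tol=1e-10, max_iter=10_000):
    """Classical SOR for A x = b, with A given as a list of rows.

    Illustrative only: omega is a fixed scalar here, whereas the paper's
    convolution SOR replaces it with a convolution kernel in time.
    """
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(n):
            # Gauss-Seidel value, using the newest available entries of x.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gs_value = (b[i] - sigma) / A[i][i]
            # Blend the old value and the Gauss-Seidel value with omega.
            new_value = (1.0 - omega) * x[i] + omega * gs_value
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            return x, iteration
    raise RuntimeError("SOR did not converge")


# Small symmetric positive definite system whose exact solution is (1, 2, 3).
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x, iterations = sor_solve(A, b)
```

Setting `omega=1.0` recovers Gauss-Seidel; for symmetric positive definite matrices the iteration converges for any `omega` strictly between 0 and 2.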
Distributing the Kalman Filter for Large-Scale Systems
This paper derives a distributed Kalman filter to estimate a sparsely
connected, large-scale, $n$-dimensional dynamical system monitored by a
network of sensors. Local Kalman filters are implemented on the
($n_l$-dimensional, where $n_l \ll n$) sub-systems that are obtained after
spatially decomposing the large-scale system. The resulting sub-systems
overlap, which, along with an assimilation procedure on the local Kalman
filters, preserves an $L$th order Gauss-Markovian structure of the centralized
error processes. The information loss due to the $L$th order Gauss-Markovian
approximation is controllable, as it can be characterized by a divergence that
decreases as $L$ increases. The order of the approximation, $L$, leads to a lower
bound on the dimension of the sub-systems, hence, providing a criterion for
sub-system selection. The assimilation procedure is carried out on the local
error covariances with a distributed iterate collapse inversion (DICI)
algorithm that we introduce. The DICI algorithm computes the (approximated)
centralized Riccati and Lyapunov equations iteratively with only local
communication and low-order computation. We fuse the observations that are
common among the local Kalman filters using bipartite fusion graphs and
consensus averaging algorithms. The proposed algorithm achieves full
distribution of the Kalman filter that is coherent with the centralized Kalman
filter with an $L$th order Gauss-Markovian structure on the centralized
error processes. Nowhere is storage, communication, or computation of
$n$-dimensional vectors and matrices needed; only $n_l$-dimensional
vectors and matrices, with $n_l \ll n$, are communicated or used in the computation at the
sensors.
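The consensus averaging step mentioned in the abstract can be illustrated with a minimal sketch in which each sensor repeatedly nudges its value toward those of its neighbours, using only local communication. The graph, the step size, and the function name `consensus_average` are assumptions for illustration; this is not the paper's bipartite fusion graph construction or its DICI algorithm.

```python
def consensus_average(values, neighbors, step=0.3, rounds=200):
    """Linear consensus: every node moves toward its neighbours' values.

    `neighbors[i]` lists the nodes that node i can talk to. With a
    symmetric graph and a small enough step, all values converge to the
    average of the initial values.
    """
    x = list(values)
    for _ in range(rounds):
        # Synchronous update: the new list is built from the old one.
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical path graph of four sensors: 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
observations = [1.0, 2.0, 3.0, 6.0]  # average is 3.0
fused = consensus_average(observations, neighbors)
```

No node ever sees the full vector of observations, yet each ends up holding (approximately) the global average, which is the property a distributed fusion step relies on.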
Improving Performance of Iterative Methods by Lossy Checkpointing
Iterative methods are commonly used approaches to solve large, sparse linear
systems, which are fundamental operations for many modern scientific
simulations. When the large-scale iterative methods are running with a large
number of ranks in parallel, they have to checkpoint the dynamic variables
periodically in case of unavoidable fail-stop errors, requiring fast I/O
systems and large storage space. To this end, significantly reducing the
checkpointing overhead is critical to improving the overall performance of
iterative methods. Our contribution is fourfold. (1) We propose a novel lossy
checkpointing scheme that can significantly improve the checkpointing
performance of iterative methods by leveraging lossy compressors. (2) We
formulate a lossy checkpointing performance model and derive theoretically an
upper bound for the extra number of iterations caused by the distortion of data
in lossy checkpoints, in order to guarantee the performance improvement under
the lossy checkpointing scheme. (3) We analyze the impact of lossy
checkpointing (i.e., extra number of iterations caused by lossy checkpointing
files) for multiple types of iterative methods. (4) We evaluate the lossy
checkpointing scheme with optimal checkpointing intervals on a high-performance
computing environment with 2,048 cores, using a well-known scientific
computation package PETSc and a state-of-the-art checkpoint/restart toolkit.
Experiments show that our optimized lossy checkpointing scheme can
significantly reduce the fault tolerance overhead for iterative methods by
23% to 70% compared with traditional checkpointing and by 20% to 58% compared with
lossless-compressed checkpointing, in the presence of system failures. Comment: 14 pages, 10 figures, HPDC'1
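The idea of "extra iterations caused by lossy checkpoints" can be made concrete with a toy experiment: run a Jacobi solver, save the iterate once exactly and once with reduced precision, then restart from each copy and compare the iteration counts. Everything here is an assumption for illustration: rounding to two decimals stands in for a real lossy compressor, the 3x3 system is hypothetical, and none of this uses PETSc or the paper's performance model.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep for A x = b."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def residual_norm(A, b, x):
    """Maximum-norm residual of A x = b."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))


def iterate_until_converged(A, b, x, tol=1e-10, max_iter=10_000):
    """Run Jacobi from x until the residual drops below tol."""
    iterations = 0
    while residual_norm(A, b, x) > tol:
        if iterations >= max_iter:
            raise RuntimeError("Jacobi did not converge")
        x = jacobi_step(A, b, x)
        iterations += 1
    return x, iterations


def lossy_checkpoint(x, decimals=2):
    """Stand-in for a lossy compressor: keep only a few decimals."""
    return [round(v, decimals) for v in x]


A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]

# Run 10 iterations, then "fail" and restart from a checkpoint.
x = [0.0, 0.0, 0.0]
for _ in range(10):
    x = jacobi_step(A, b, x)

x_exact, its_exact = iterate_until_converged(A, b, list(x))
x_lossy, its_lossy = iterate_until_converged(A, b, lossy_checkpoint(x))
extra_iterations = its_lossy - its_exact
```

Both restarts reach the same tolerance; the lossy restart simply starts from a slightly perturbed iterate, and `extra_iterations` measures the price paid for the smaller checkpoint.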
A self-gravity module for the PLUTO code
We present a novel implementation of an iterative solver for the solution of
the Poisson equation in the PLUTO code for astrophysical fluid dynamics. Our
solver relies on a relaxation method in which convergence is sought as the
steady-state solution of a parabolic equation, whose time-discretization is
governed by the Runge-Kutta-Legendre (RKL) method. Our findings
indicate that the RKL-based Poisson solver, which is both fully parallel and
rapidly convergent, has the potential to serve as a practical alternative to
conventional iterative solvers such as the Gauss-Seidel (GS) and
successive over-relaxation (SOR) methods. Additionally, it can
mitigate some of the drawbacks of these traditional techniques. We incorporate
our algorithm into a multigrid solver to provide a simple and efficient gravity
solver that can be used to obtain the gravitational potentials in
self-gravitational hydrodynamics. We test our implementation against a broad
range of standard self-gravitating astrophysical problems designed to examine
different aspects of the code. We demonstrate that the results agree
very well with the analytical predictions (when available) and with the findings
of similar previous studies. Comment: Submitted to ApJS. Comments are welcome.
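The relaxation idea described above, reaching the Poisson solution as the steady state of a parabolic equation, can be sketched in one dimension. This sketch uses plain forward-Euler pseudo-time stepping rather than the RKL scheme of the paper, and the grid size, step factor, and function name `relax_poisson_1d` are assumptions for illustration.

```python
def relax_poisson_1d(f, n_interior, tol=1e-12, max_steps=200_000):
    """Solve u'' = f on (0, 1) with u(0) = u(1) = 0 by marching
    du/dtau = u'' - f to steady state with forward Euler.

    `f` holds the source term at every grid point, boundaries included.
    """
    h = 1.0 / (n_interior + 1)
    dtau = 0.4 * h * h  # below the explicit stability limit h^2 / 2
    u = [0.0] * (n_interior + 2)
    for step in range(1, max_steps + 1):
        new_u = u[:]
        max_change = 0.0
        for i in range(1, n_interior + 1):
            laplacian = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
            new_u[i] = u[i] + dtau * (laplacian - f[i])
            max_change = max(max_change, abs(new_u[i] - u[i]))
        u = new_u
        if max_change < tol:
            return u, step
    raise RuntimeError("relaxation did not reach steady state")


# Source f = -2 gives the exact solution u(x) = x (1 - x).
n = 15
source = [-2.0] * (n + 2)
potential, steps = relax_poisson_1d(source, n)
grid = [i / (n + 1) for i in range(n + 2)]
max_error = max(abs(p - xg * (1.0 - xg)) for p, xg in zip(potential, grid))
```

The small stable step is what makes plain explicit relaxation slow; super-time-stepping schemes such as RKL are designed to take much larger effective pseudo-time steps per stage.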
An Objective Analysis Technique for Constructing Three-Dimensional Urban-Scale Wind Fields
An objective analysis procedure for generating mass-consistent, urban-scale three-dimensional wind fields is presented together with a comparison against existing techniques. The algorithm employs terrain-following coordinates and variable vertical grid spacing. Initial estimates of the velocity field are developed by interpolating surface and upper-level wind measurements. A local terrain adjustment technique, involving solution of the Poisson equation, is used to establish the horizontal components of the surface field. Vertical velocities are developed from successive solutions of the continuity equation followed by an iterative procedure which reduces anomalous divergence in the complete field. Major advantages of the procedure are that it is computationally efficient and allows boundary values to adjust in response to changes in the interior flow. The method has been successfully tested using field measurements and problems with known analytic solutions.