Directed Transmission Method, A Fully Asynchronous Approach to Solve Sparse Linear Systems in Parallel
In this paper, we propose a new distributed algorithm, called the Directed
Transmission Method (DTM). DTM is a fully asynchronous and continuous-time
iterative algorithm for solving symmetric positive definite (SPD) sparse linear
systems. As an architecture-aware algorithm, DTM can run freely on all kinds of
heterogeneous parallel computers. We prove that DTM is convergent by making use
of the final-value theorem of the Laplace transform. Numerical experiments show
that DTM is stable and efficient.
Comment: v1: poster presented at SPAA'08; v2: full paper; v3: rename EVS to
GNBT; v4: reuse EVS. More info, see my web page at
http://weifei00.googlepages.co
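The abstract does not spell out DTM's update rule. As a point of reference, the synchronous fixed-point iteration that fully asynchronous schemes relax is exemplified by classical Jacobi; a minimal sketch for an SPD, diagonally dominant system (the matrix and data here are illustrative assumptions, not from the paper):

```python
import numpy as np

# Illustrative SPD, diagonally dominant system (not from the paper).
A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
b = np.array([1.0, 2.0, 3.0])

# Jacobi: x_{k+1} = D^{-1} (b - (A - D) x_k).
# Asynchronous methods perform the same local component updates
# without waiting for every component to finish iteration k first.
D = np.diag(A)
R = A - np.diag(D)
x = np.zeros_like(b)
for _ in range(200):
    x = (b - R @ x) / D

print(np.linalg.norm(A @ x - b))  # residual, essentially zero
```

For diagonally dominant SPD systems the iteration contracts at every step, which is why it also tolerates the relaxed orderings that asynchronous variants introduce.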
Improving Performance of Iterative Methods by Lossy Checkpointing
Iterative methods are commonly used approaches to solve large, sparse linear
systems, which are fundamental operations for many modern scientific
simulations. When large-scale iterative methods run with a large number of
ranks in parallel, they must checkpoint their dynamic variables periodically
to guard against unavoidable fail-stop errors, which requires fast I/O
systems and large storage space. Consequently, significantly reducing the
checkpointing overhead is critical to improving the overall performance of
iterative methods. Our contribution is fourfold. (1) We propose a novel lossy
checkpointing scheme that can significantly improve the checkpointing
performance of iterative methods by leveraging lossy compressors. (2) We
formulate a lossy checkpointing performance model and derive theoretically an
upper bound for the extra number of iterations caused by the distortion of data
in lossy checkpoints, in order to guarantee the performance improvement under
the lossy checkpointing scheme. (3) We analyze the impact of lossy
checkpointing (i.e., the extra iterations caused by restarting from lossy
checkpoint files) on multiple types of iterative methods. (4) We evaluate the
lossy checkpointing scheme with optimal checkpointing intervals in a
high-performance computing environment with 2,048 cores, using the well-known
scientific computation package PETSc and a state-of-the-art checkpoint/restart toolkit.
Experiments show that our optimized lossy checkpointing scheme can
significantly reduce the fault-tolerance overhead of iterative methods, by
23%-70% compared with traditional checkpointing and by 20%-58% compared with
lossless-compressed checkpointing, in the presence of system failures.
Comment: 14 pages, 10 figures, HPDC'1
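The paper derives its own checkpointing performance model; the classical first-order Young/Daly model already illustrates why cheaper (e.g. lossy-compressed) checkpoints reduce fault-tolerance overhead. A sketch with made-up costs (the MTBF and checkpoint times below are assumptions for illustration, not the paper's measurements):

```python
import math

def optimal_interval(ckpt_cost, mtbf):
    """Young's first-order optimal checkpoint interval: sqrt(2 * C * M)."""
    return math.sqrt(2.0 * ckpt_cost * mtbf)

def waste_fraction(ckpt_cost, mtbf):
    """First-order fraction of time lost to checkpointing plus recomputation
    at the optimal interval: C/tau + tau/(2M) = sqrt(2C/M)."""
    return math.sqrt(2.0 * ckpt_cost / mtbf)

mtbf = 6 * 3600.0     # assumed system MTBF: 6 hours
c_lossless = 120.0    # assumed lossless-compressed checkpoint cost (seconds)
c_lossy = 30.0        # assumed lossy-compressed checkpoint cost (seconds)

print(waste_fraction(c_lossless, mtbf))  # ~0.105
print(waste_fraction(c_lossy, mtbf))     # ~0.053
```

The lossy scheme must additionally bound the extra iterations incurred when restarting from distorted checkpoint data, which is exactly the term the paper's model adds on top of this classical trade-off.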
An odyssey into local refinement and multilevel preconditioning III: Implementation and numerical experiments
In this paper, we examine a number of additive and multiplicative multilevel iterative methods and preconditioners in the setting of two-dimensional local mesh refinement. While standard multilevel methods are effective for uniform refinement-based discretizations of elliptic equations, they tend to be less effective for the algebraic systems that arise from discretizations on locally refined meshes, losing their optimal behavior in both storage and computational complexity. Our primary focus here is on Bramble-Pasciak-Xu (BPX)-style additive and multiplicative multilevel preconditioners, on various stabilizations of the additive and multiplicative hierarchical basis (HB) method, and on their use in the local mesh refinement setting. In parts I and II of this trilogy, it was shown that both BPX and wavelet stabilizations of HB have uniformly bounded condition numbers on several classes of locally refined two- and three-dimensional meshes based on fairly standard (and easily implementable) red and red-green mesh refinement algorithms. In this third part of the trilogy, we describe in detail the implementation of these types of algorithms, including detailed discussions of the data structures and traversal algorithms we employ to obtain optimal storage and computational complexity in our implementations. We show how each of the algorithms can be implemented using standard data types, available in languages such as C and FORTRAN, so that the resulting algorithms have optimal (linear) storage requirements and the resulting multilevel method or preconditioner can be applied with optimal (linear) computational cost. We have successfully used these data structure ideas in both MATLAB and C implementations based on FEtk, an open-source finite element software package. We finish the paper with a sequence of numerical experiments illustrating the effectiveness of a number of BPX and stabilized HB variants for several examples requiring local refinement.
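As a concrete illustration of the additive BPX idea discussed above, the following sketch builds a one-dimensional BPX preconditioner from linear-interpolation prolongations and checks that it flattens the conditioning of the 1D Laplacian. The uniform 1D mesh, level scalings, and dense assembly are simplifying assumptions for illustration; the paper's setting is locally refined 2D meshes with optimal-complexity data structures.

```python
import numpy as np

def prolongation(nc):
    """Linear interpolation from a coarse 1D grid (nc interior nodes)
    to the next finer grid (2*nc + 1 interior nodes)."""
    nf = 2 * nc + 1
    P = np.zeros((nf, nc))
    for c in range(nc):
        P[2 * c + 1, c] = 1.0   # coarse node value carries over
        P[2 * c, c] = 0.5       # interpolate to left fine neighbor
        P[2 * c + 2, c] = 0.5   # interpolate to right fine neighbor
    return P

L = 6                    # number of levels; finest grid has 2^L - 1 nodes
n = 2 ** L - 1
h = 1.0 / 2 ** L
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h      # 1D FEM stiffness matrix

# Additive BPX: B = sum_j h_j * P_j P_j^T, where P_j interpolates
# level-j nodal values all the way up to the finest level.
B = h * np.eye(n)        # finest-level term (P_L = identity)
P_to_fine = np.eye(n)
for j in range(L - 1, 0, -1):
    P_to_fine = P_to_fine @ prolongation(2 ** j - 1)
    B += (2.0 ** -j) * (P_to_fine @ P_to_fine.T)

cond_A = np.linalg.cond(A)
eigs = np.real(np.linalg.eigvals(B @ A))
cond_BA = eigs.max() / eigs.min()
print(cond_A, cond_BA)   # cond(BA) stays small while cond(A) grows like 4^L
```

The uniformly bounded condition number of the preconditioned operator is what parts I and II establish rigorously for the locally refined case.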
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning, together with applications and future research directions.
Natural preconditioners for saddle point systems
The solution of quadratic or locally quadratic extremum problems subject to linear(ized) constraints gives rise to linear systems in saddle point form. This is true in both the continuous and the discrete setting, so saddle point systems arising from the discretization of partial differential equation problems, such as those describing electromagnetic phenomena or incompressible flow, lead to equations with this structure, as does, for example, the widely used sequential quadratic programming approach to nonlinear optimization.
This article concerns iterative solution methods for these problems and in particular shows how the problem formulation leads to natural preconditioners which guarantee rapid convergence of the relevant iterative methods. These preconditioners are related to the original extremum problem, and their effectiveness -- in terms of rapidity of convergence -- is established here via a proof of general bounds on the eigenvalues of the preconditioned saddle point matrix, on which iteration convergence depends.
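A classical result behind such natural block preconditioners, due to Murphy, Golub, and Wathen, is that preconditioning K = [A, B^T; B, 0] by the block diagonal of A and the exact Schur complement S = B A^{-1} B^T clusters the spectrum onto exactly three values, 1 and (1 ± √5)/2, so a Krylov method such as MINRES converges in at most three steps. A small numerical check (the dimensions and random data are arbitrary assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # SPD (1,1) block
B = rng.standard_normal((m, n))      # full-rank constraint block

# Saddle point matrix K and the "natural" block diagonal preconditioner P.
K = np.block([[A, B.T], [B, np.zeros((m, m))]])
S = B @ np.linalg.solve(A, B.T)      # exact Schur complement
P = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), S]])

eigs = np.linalg.eigvals(np.linalg.solve(P, K))
targets = np.array([1.0, (1 + np.sqrt(5)) / 2, (1 - np.sqrt(5)) / 2])
# Distance from each eigenvalue to the predicted three-point spectrum.
dist = np.min(np.abs(eigs[:, None] - targets[None, :]), axis=1)
print(dist.max())   # numerically zero
```

In practice S is replaced by a spectrally equivalent approximation drawn from the underlying extremum problem, which is exactly what makes the eigenvalue bounds in the article useful.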
Hierarchical Schur complement preconditioner for the stochastic Galerkin finite element methods
Use of the stochastic Galerkin finite element methods leads to large systems
of linear equations obtained by the discretization of tensor product solution
spaces along their spatial and stochastic dimensions. These systems are
typically solved iteratively by a Krylov subspace method. We propose a
preconditioner that takes advantage of the recursive hierarchy in the
structure of the global matrices. In particular, the matrices possess a
recursive hierarchical two-by-two structure, with one of the submatrices block
diagonal. Each of the diagonal blocks in this submatrix is closely related
to the deterministic mean-value problem, and the action of its inverse is
approximated in the implementation by inner loops of Krylov iterations. Thus our
hierarchical Schur complement preconditioner combines, on each level of the
hierarchical structure of the global matrix, the idea of the
Schur complement with a number of mutually independent inner Krylov
iterations and several matrix-vector multiplications for the off-diagonal
blocks. Neither the global matrix nor the preconditioner matrix needs to
be formed explicitly. The only ingredients are the stiffness
matrices from the truncated Karhunen-Loève expansion and a good
preconditioner for the mean-value deterministic problem. We provide a condition
number bound for a model elliptic problem, and the performance of the method is
illustrated by numerical experiments.
Comment: 15 pages, 2 figures, 9 tables (updated numerical experiments)
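The core mechanism that such preconditioners approximate recursively, the exact two-by-two block solve through the Schur complement, can be sketched generically (the matrix below is a random well-conditioned illustration, not a stochastic Galerkin system):

```python
import numpy as np

def schur_solve(K11, K12, K21, K22, b1, b2):
    """Solve [[K11, K12], [K21, K22]] [x1; x2] = [b1; b2]
    by block elimination through the Schur complement
    S = K22 - K21 K11^{-1} K12."""
    K11_inv_b1 = np.linalg.solve(K11, b1)
    K11_inv_K12 = np.linalg.solve(K11, K12)
    S = K22 - K21 @ K11_inv_K12
    x2 = np.linalg.solve(S, b2 - K21 @ K11_inv_b1)
    x1 = K11_inv_b1 - K11_inv_K12 @ x2
    return x1, x2

rng = np.random.default_rng(1)
n1, n2 = 6, 4
K11 = rng.standard_normal((n1, n1)) + n1 * np.eye(n1)  # well-conditioned block
K12 = rng.standard_normal((n1, n2))
K21 = rng.standard_normal((n2, n1))
K22 = rng.standard_normal((n2, n2)) + n2 * np.eye(n2)
b1, b2 = rng.standard_normal(n1), rng.standard_normal(n2)

x1, x2 = schur_solve(K11, K12, K21, K22, b1, b2)
K = np.block([[K11, K12], [K21, K22]])
res = np.linalg.norm(K @ np.concatenate([x1, x2]) - np.concatenate([b1, b2]))
print(res)   # residual, numerically zero
```

In the proposed preconditioner, the solves with the block-diagonal (1,1) part are themselves replaced by mutually independent inner Krylov iterations, so neither the Schur complement nor the global matrix is ever formed explicitly.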