6,670 research outputs found

    Directed Transmission Method, A Fully Asynchronous approach to Solve Sparse Linear Systems in Parallel

    In this paper, we propose a new distributed algorithm, called the Directed Transmission Method (DTM). DTM is a fully asynchronous, continuous-time iterative algorithm for solving symmetric positive definite (SPD) sparse linear systems. As an architecture-aware algorithm, DTM can run freely on all kinds of heterogeneous parallel computers. We prove that DTM is convergent by making use of the final-value theorem of the Laplace transform. Numerical experiments show that DTM is stable and efficient. Comment: v1: poster presented in SPAA'08; v2: full paper; v3: rename EVS to GNBT; v4: reuse EVS. More info, see my web page at http://weifei00.googlepages.co

    Improving Performance of Iterative Methods by Lossy Checkpointing

    Iterative methods are commonly used approaches to solve large, sparse linear systems, which are fundamental operations for many modern scientific simulations. When large-scale iterative methods run with a large number of ranks in parallel, they have to checkpoint the dynamic variables periodically in case of unavoidable fail-stop errors, requiring fast I/O systems and large storage space. To this end, significantly reducing the checkpointing overhead is critical to improving the overall performance of iterative methods. Our contribution is fourfold. (1) We propose a novel lossy checkpointing scheme that can significantly improve the checkpointing performance of iterative methods by leveraging lossy compressors. (2) We formulate a lossy checkpointing performance model and derive theoretically an upper bound for the extra number of iterations caused by the distortion of data in lossy checkpoints, in order to guarantee the performance improvement under the lossy checkpointing scheme. (3) We analyze the impact of lossy checkpointing (i.e., the extra number of iterations caused by lossy checkpointing files) for multiple types of iterative methods. (4) We evaluate the lossy checkpointing scheme with optimal checkpointing intervals on a high-performance computing environment with 2,048 cores, using the well-known scientific computation package PETSc and a state-of-the-art checkpoint/restart toolkit. Experiments show that our optimized lossy checkpointing scheme can significantly reduce the fault tolerance overhead for iterative methods by 23%~70% compared with traditional checkpointing and 20%~58% compared with lossless-compressed checkpointing, in the presence of system failures. Comment: 14 pages, 10 figures, HPDC'1
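The trade-off described in this abstract, a smaller checkpoint paid for with extra iterations after a restart, can be illustrated with a toy experiment. The sketch below is not the paper's scheme: it assumes a plain Jacobi iteration instead of a PETSc solver, and it stands in for an error-bounded lossy compressor with simple uniform quantization. The test matrix and the names `lossy_compress`, `restart_cost`, `checkpoint_at`, and `error_bound` are hypothetical, chosen only for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def jacobi(A, b, x0, tol=1e-10, max_iter=10_000):
    """Jacobi iteration from x0; returns (x, number of updates performed)."""
    x = list(x0)
    for k in range(max_iter):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        if max(abs(ri) for ri in r) < tol:
            return x, k
        x = [xi + ri / A[i][i] for i, (xi, ri) in enumerate(zip(x, r))]
    return x, max_iter


def lossy_compress(x, error_bound):
    """Stand-in for an error-bounded lossy compressor (assumption):
    uniform quantization with step 2 * error_bound, so each entry
    moves by at most error_bound."""
    step = 2.0 * error_bound
    return [round(v / step) * step for v in x]


def restart_cost(n=20, checkpoint_at=25, error_bound=1e-2):
    # Hypothetical test problem: tridiag(-1, 4, -1) x = 1, which is SPD.
    A = [[4.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
          for j in range(n)] for i in range(n)]
    b = [1.0] * n
    # Iterate up to the checkpoint, then pretend the job fails.
    state, _ = jacobi(A, b, [0.0] * n, max_iter=checkpoint_at)
    # Restart from the exact checkpoint.
    _, exact_iters = jacobi(A, b, state)
    # Restart from the lossy checkpoint.
    lossy_state = lossy_compress(state, error_bound)
    x, lossy_iters = jacobi(A, b, lossy_state)
    return exact_iters, lossy_iters, state, lossy_state, x


if __name__ == "__main__":
    exact_iters, lossy_iters, *_ = restart_cost()
    print("iterations after exact restart:", exact_iters)
    print("iterations after lossy restart:", lossy_iters)
    print("extra iterations caused by lossy checkpoint:", lossy_iters - exact_iters)
```

Both restarts reach the same tolerance; the difference in iteration counts is the kind of overhead that the abstract's performance model bounds from above.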

    An odyssey into local refinement and multilevel preconditioning III: Implementation and numerical experiments

    In this paper, we examine a number of additive and multiplicative multilevel iterative methods and preconditioners in the setting of two-dimensional local mesh refinement. While standard multilevel methods are effective for uniform refinement-based discretizations of elliptic equations, they tend to be less effective for algebraic systems, which arise from discretizations on locally refined meshes, losing their optimal behavior in both storage and computational complexity. Our primary focus here is on Bramble, Pasciak, and Xu (BPX)-style additive and multiplicative multilevel preconditioners, and on various stabilizations of the additive and multiplicative hierarchical basis (HB) method, and their use in the local mesh refinement setting. In parts I and II of this trilogy, it was shown that both BPX and wavelet stabilizations of HB have uniformly bounded condition numbers on several classes of locally refined two- and three-dimensional meshes based on fairly standard (and easily implementable) red and red-green mesh refinement algorithms. In this third part of the trilogy, we describe in detail the implementation of these types of algorithms, including detailed discussions of the data structures and traversal algorithms we employ for obtaining optimal storage and computational complexity in our implementations. We show how each of the algorithms can be implemented using standard data types, available in languages such as C and FORTRAN, so that the resulting algorithms have optimal (linear) storage requirements, and so that the resulting multilevel method or preconditioner can be applied with optimal (linear) computational costs. We have successfully used these data structure ideas for both MATLAB and C implementations using the FEtk, an open source finite element software package. We finish the paper with a sequence of numerical experiments illustrating the effectiveness of a number of BPX and stabilized HB variants for several examples requiring local refinement.

    Recent Advances in Graph Partitioning

    We survey recent trends in practical algorithms for balanced graph partitioning, together with applications and future research directions.

    Natural preconditioners for saddle point systems

    The solution of quadratic or locally quadratic extremum problems subject to linear(ized) constraints gives rise to linear systems in saddle point form. This is true whether in the continuous or discrete setting, so saddle point systems arising from discretization of partial differential equation problems such as those describing electromagnetic problems or incompressible flow lead to equations with this structure as does, for example, the widely used sequential quadratic programming approach to nonlinear optimization. This article concerns iterative solution methods for these problems and in particular shows how the problem formulation leads to natural preconditioners which guarantee rapid convergence of the relevant iterative methods. These preconditioners are related to the original extremum problem and their effectiveness -- in terms of rapidity of convergence -- is established here via a proof of general bounds on the eigenvalues of the preconditioned saddle point matrix on which iteration convergence depends.
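To make the link between preconditioning and eigenvalues concrete, here is a small sketch of a classical related case, not the article's own construction. For a saddle point matrix K = [[A, Bᵀ], [B, 0]] and the "ideal" block-diagonal preconditioner P = diag(A, S) with exact Schur complement S = B A⁻¹ Bᵀ, the preconditioned matrix T = P⁻¹K is known to satisfy (T − I)(T² − T − I) = 0, so its eigenvalues lie in {1, (1 ± √5)/2}. The tiny matrices and function names below are hypothetical, and exact rational arithmetic is used so the identity can be checked without rounding error.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def preconditioned_saddle_matrix():
    # Hypothetical tiny example: A = diag(2, 3), B = [1 1].
    a = [F(2), F(3)]
    b = [F(1), F(1)]
    # Exact Schur complement S = B A^{-1} B^T (a scalar here).
    s = sum(bi * bi / ai for ai, bi in zip(a, b))
    K = [[a[0], F(0), b[0]],
         [F(0), a[1], b[1]],
         [b[0], b[1], F(0)]]
    # P = diag(A, S) is diagonal in this example, so P^{-1} K scales rows.
    p_diag = [a[0], a[1], s]
    return [[K[i][j] / p_diag[i] for j in range(3)] for i in range(3)]


def minimal_polynomial_residual(T):
    """Return (T - I)(T^2 - T - I), which should be the zero matrix."""
    n = len(T)
    eye = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    T2 = matmul(T, T)
    first = [[T[i][j] - eye[i][j] for j in range(n)] for i in range(n)]
    second = [[T2[i][j] - T[i][j] - eye[i][j] for j in range(n)]
              for i in range(n)]
    return matmul(first, second)


if __name__ == "__main__":
    T = preconditioned_saddle_matrix()
    for row in minimal_polynomial_residual(T):
        print([str(v) for v in row])
```

With at most three distinct eigenvalues, a Krylov method such as MINRES would terminate in at most three iterations in exact arithmetic. Practical preconditioners replace A and S with cheaper approximations, which is where eigenvalue bounds of the kind the abstract mentions become the relevant tool.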

    Hierarchical Schur complement preconditioner for the stochastic Galerkin finite element methods

    Use of the stochastic Galerkin finite element methods leads to large systems of linear equations obtained by the discretization of tensor product solution spaces along their spatial and stochastic dimensions. These systems are typically solved iteratively by a Krylov subspace method. We propose a preconditioner which takes advantage of the recursive hierarchy in the structure of the global matrices. In particular, the matrices possess a recursive hierarchical two-by-two structure, with one of the submatrices block diagonal. Each one of the diagonal blocks in this submatrix is closely related to the deterministic mean-value problem, and in the implementation the action of its inverse is approximated by inner loops of Krylov iterations. Thus our hierarchical Schur complement preconditioner combines, on each level in the approximation of the hierarchical structure of the global matrix, the idea of the Schur complement with loops for a number of mutually independent inner Krylov iterations, and several matrix-vector multiplications for the off-diagonal blocks. Neither the global matrix nor the matrix of the preconditioner needs to be formed explicitly. The ingredients include only the number of stiffness matrices from the truncated Karhunen-Loève expansion and a good preconditioner for the mean-value deterministic problem. We provide a condition number bound for a model elliptic problem, and the performance of the method is illustrated by numerical experiments. Comment: 15 pages, 2 figures, 9 tables, (updated numerical experiments