460 research outputs found

    Parallel pivoting combined with parallel reduction

    Get PDF
    Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines the parallel reduction with a new parallel pivoting technique, control over generations of fill-ins and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds

    Performance Improvements of Common Sparse Numerical Linear Algebra Computations

    Get PDF
    Manufacturers of computer hardware are able to continuously sustain an unprecedented pace of progress in computing speed of their products, partially due to increased clock rates but also because of ever more complicated chip designs. With new processor families appearing every few years, it is increasingly harder to achieve high performance rates in sparse matrix computations. This research proposes new methods for sparse matrix factorizations and applies in an iterative code generalizations of known concepts from related disciplines. The proposed solutions and extensions are implemented in ways that tend to deliver efficiency while retaining ease of use of existing solutions. The implementations are thoroughly timed and analyzed using a commonly accepted set of test matrices. The tests were conducted on modern processors that seem to have gained an appreciable level of popularity and are fairly representative for a wider range of processor types that are available on the market now or in the near future. The new factorization technique formally introduced in the early chapters is later on proven to be quite competitive with state of the art software currently available. Although not totally superior in all cases (as probably no single approach could possibly be), the new factorization algorithm exhibits a few promising features. In addition, an all-embracing optimization effort is applied to an iterative algorithm that stands out for its robustness. This also gives satisfactory results on the tested computing platforms in terms of performance improvement. The same set of test matrices is used to enable an easy comparison between both investigated techniques, even though they are customarily treated separately in the literature. Possible extensions of the presented work are discussed. They range from easily conceivable merging with existing solutions to rather more evolved schemes dependent on hard to predict progress in theoretical and algorithmic research

    The design and use of a sparse direct solver for skew symmetric matrices

    Get PDF
    AbstractWe consider the LDLT factorization of sparse skew symmetric matrices. We see that the pivoting strategies are similar, but simpler, to those used in the factorization of sparse symmetric indefinite matrices, and we briefly describe the algorithms used in a forthcoming direct code based on multifrontal techniques for the factorization of real skew symmetric matrices. We show how this factorization can be very efficient for preconditioning matrices that have a large skew component

    LU Factorization of Sparse, Unsymmetric Jacobian Matrices on Multicomputers: Experience, Strategies, Performance

    Get PDF
    Efficient sparse linear algebra cannot be achieved as a straightforward extension of the dense case, even for concurrent implementations. This paper details a new, general-purpose unsymmetric sparse LU factorization code built on the philosophy of Harwell’s MA28, with variations. We apply this code in the framework of Jacobian-matrix factorizations, arising from Newton iterations in the solution of nonlinear systems of equations. Serious attention has been paid to the data-structure requirements, complexity issues and communication features of the algorithm. Key results include reduced communication pivoting for both the “analyze” A-mode and repeated B-mode factorizations, and effective general-purpose data distributions useful incrementally to trade-off process-column load balance in factorization against triangular solve performance. Future planned efforts are cited in conclusion

    Sparse matrix based power flow solver for real-time simulation of power system

    Get PDF
    Analyzing a massive number of Power Flow (PF) equations even on almost identical or similar network topology is a highly time-consuming process for large-scale power systems. The major computation time is hoarded by the iterative linear solving process to solve nonlinear equations until convergence is achieved. This is a paramount concern for any PF analysis methods. This thesis presents a sparse matrix-based power flow solver that is fast enough to be implemented in the real-time analysis of largescale power systems. It uses KLU, a sparse matrix solver, for PF analysis. It also implements parallel processing of CPU and GPU which enables the simultaneous computation of multiple blocks in the algorithm leading to faster execution. It runs 1000 times and 200 times faster than newton raphson method for DC and AC power system respectively. On average, it is around 10 times faster than MATPOWER for both AC and DC power system

    Sparse matrix based power flow solver for real-time simulation of power system

    Get PDF
    Analyzing a massive number of Power Flow (PF) equations even on almost identical or similar network topology is a highly time-consuming process for large-scale power systems. The major computation time is hoarded by the iterative linear solving process to solve nonlinear equations until convergence is achieved. This is a paramount concern for any PF analysis methods. This thesis presents a sparse matrix-based power flow solver that is fast enough to be implemented in the real-time analysis of largescale power systems. It uses KLU, a sparse matrix solver, for PF analysis. It also implements parallel processing of CPU and GPU which enables the simultaneous computation of multiple blocks in the algorithm leading to faster execution. It runs 1000 times and 200 times faster than newton raphson method for DC and AC power system respectively. On average, it is around 10 times faster than MATPOWER for both AC and DC power system

    Data Structures and Algorithms for Efficient Solution of Simultaneous Linear Equations from 3-D Ice Sheet Models

    Get PDF
    Two current software packages for solving large systems of sparse simultaneous l~neare equations are evaluated in terms of their applicability to solving systems of equations generated by the University of Maine Ice Sheet Model. SuperLU, the first package, has been developed by researchers at the University of California at Berkeley and the Lawrence Berkeley National Laboratory. UMFPACK, the second package, has been developed by T. A. Davis of the University of Florida who has ties with the U. C. Berkeley researchers as well as European researchers. Both packages are direct solvers that use LU factorization with forward and backward substitution. The University of Maine Ice Sheet Model uses the finite element method to solve partial differential equations that describe ice thickness, velocity,and temperature throughout glaciers as functions of position and t~me. The finite element method generates systems of linear equations having tens of thousands of variables and one hundred or so non-zero coefficients per equation. Matrices representing these systems of equations may be strictly banded or banded with right and lower borders. In order to efficiently Interface the software packages with the ice sheet model, a modified compressed column data structure and supporting routines were designed and written. The data structure interfaces directly with both software packages and allows the ice sheet model to access matrix coefficients by row and column number in roughly 100 nanoseconds while only storing non-zero entries of the matrix. No a priori knowledge of the matrix\u27s sparsity pattern is required. Both software packages were tested with matrices produced by the model and performance characteristics were measured arid compared with banded Gaussian elimination. When combined with high performance basic linear algebra subprograms (BLAS), the packages are as much as 5 to 7 times faster than banded Gaussian elimination. The BLAS produced by K. Goto of the University of Texas was used. Memory usage by the packages varted from slightly more than banded Gaussian elimination with UMFPACK, to as much as a 40% savings with SuperLU. In addition, the packages provide componentwise backward error measures and estimates of the matrix\u27s condition number. SuperLU is available for parallel computers as well as single processor computers. UMPACK is only for single processor computers. Both packages are also capable of efficiently solving the bordered matrix problem

    Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

    Get PDF
    Book of Abstracts of CSC14 edited by Bora UçarInternational audienceThe Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France on 21st to 23rd July, 2014. This two and a half day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. All three invited talks were focused on two interesting fields of research specifically: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multi-pole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program ''Investissements d'Avenir'' ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM

    The organization of circuit analysis on array architectures

    Get PDF
    • …
    corecore