1,177 research outputs found

A domain decomposing parallel sparse linear system solver

Author: Amestoy
Barrett
Benzi
Berry
Chen
Dongarra
Gravvanis
Karypis
Lawrie
Lawson
Li
Manguoglu
Murat Manguoglu
Polizzi
Sameh
Schenk
Publication venue: 'Elsevier BV'
Publication date: 26/08/2011

The solution of large sparse linear systems is often the most time-consuming part of many science and engineering applications. Computational fluid dynamics, circuit simulation, power network analysis, and material science are just a few examples of the application areas in which large sparse linear systems need to be solved effectively. In this paper we introduce a new parallel hybrid sparse linear system solver for distributed memory architectures that contains both direct and iterative components. We show that by using our solver one can alleviate the drawbacks of direct and iterative solvers, achieving better scalability than with direct solvers and more robustness than with classical preconditioned iterative solvers. Comparisons to well-known direct and iterative solvers on a parallel architecture are provided. Comment: To appear in Journal of Computational and Applied Mathematics

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)

Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis

Author: Cheshmi Kazem
Dehnavi Maryam Mehri
Kamil Shoaib
Strout Michelle Mills
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/05/2017

Sympiler is a domain-specific code generator that optimizes sparse matrix computations by decoupling the symbolic analysis phase from the numerical manipulation stage in sparse codes. The computation patterns in sparse numerical methods are guided by the input sparsity structure and the sparse algorithm itself. In many real-world simulations, the sparsity pattern changes little or not at all. Sympiler takes advantage of these properties to symbolically analyze sparse codes at compile time and to apply inspector-guided transformations that enable low-level transformations of sparse codes. As a result, the Sympiler-generated code outperforms highly optimized matrix factorization codes from commonly used specialized libraries, obtaining average speedups over Eigen and CHOLMOD of 3.8X and 1.5X respectively. Comment: 12 pages
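The split between symbolic analysis and numeric work described in this abstract can be illustrated with a sparse lower-triangular solve. The following is a minimal sketch and not Sympiler's generated code: the function names `symbolic_reach` and `numeric_solve`, the CSC layout with the diagonal stored first in each column, and the 4x4 example matrix are all assumptions made for illustration. The symbolic step looks only at sparsity patterns, so its result can be reused for as long as the pattern stays the same.

```python
def symbolic_reach(col_ptr, row_idx, rhs_pattern):
    """Symbolic phase: columns of L needed to solve L x = b, in dependency order.

    Uses only the sparsity pattern of L (CSC) and of b, never the values.
    """
    n = len(col_ptr) - 1
    visited = [False] * n
    postorder = []

    def dfs(j):
        visited[j] = True
        for p in range(col_ptr[j], col_ptr[j + 1]):
            i = row_idx[p]
            if not visited[i]:
                dfs(i)
        postorder.append(j)

    for j in rhs_pattern:
        if not visited[j]:
            dfs(j)
    # Reverse postorder is a topological order of the dependency graph.
    return postorder[::-1]


def numeric_solve(col_ptr, row_idx, values, b, order):
    """Numeric phase: forward substitution restricted to the precomputed columns.

    Assumes the diagonal entry is stored first in every column.
    """
    x = list(b)
    for j in order:
        x[j] /= values[col_ptr[j]]
        for p in range(col_ptr[j] + 1, col_ptr[j + 1]):
            x[row_idx[p]] -= values[p] * x[j]
    return x


# Hypothetical example: L = [[2,0,0,0],[0,1,0,0],[1,0,4,0],[0,0,2,5]] in CSC form.
col_ptr = [0, 2, 3, 5, 6]
row_idx = [0, 2, 1, 2, 3, 3]
values = [2.0, 1.0, 1.0, 4.0, 2.0, 5.0]
b = [4.0, 0.0, 0.0, 0.0]

order = symbolic_reach(col_ptr, row_idx, rhs_pattern=[0])  # computed once
x = numeric_solve(col_ptr, row_idx, values, b, order)      # repeated per new values
print(order)  # [0, 2, 3]  (column 1 is never touched)
print(x)      # [2.0, 0.0, -0.5, 0.2]
```

If the values of L or b change but their patterns do not, only `numeric_solve` needs to run again.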

arXiv.org e-Print Archive

Crossref

Fast iterative solution of reaction-diffusion control problems arising from chemical processes

Author: Ascher U.
Fletcher R.
Hinze M.
John W. Pearson
Martin Stoll
Pearson J. W.
Rees T.
Wathen A. J.
Publication venue: SIAM
Publication date: 01/01/2012

PDE-constrained optimization problems, and the development of preconditioned iterative methods for the efficient solution of the resulting matrix systems, form a field of numerical analysis that has recently attracted much attention. In this paper, we analyze and develop preconditioners for matrix systems that arise from the optimal control of reaction-diffusion equations, which themselves result from chemical processes. Important aspects of our solvers are saddle point theory, mass matrix representation and effective Schur complement approximation, as well as the outer (Newton) iteration to take account of the nonlinearity of the underlying PDEs.

CiteSeerX

Crossref

Oxford University Research Archive

Kent Academic Repository

MPG.PuRe

Adapting the interior point method for the solution of LPs on serial, coarse grain parallel and massively parallel computers

Author: Andersen J
Levkovitz R
Mitra G
Tamiz M
Publication venue: Brunel University
Publication date: 01/01/1990

In this paper we describe a unified scheme for implementing an interior point method (IPM) over a range of computer architectures. In the inner iteration of the IPM a search direction is computed using Newton's method. Computationally this involves solving a sparse symmetric positive definite (SSPD) system of equations. The choice of direct and indirect methods for the solution of this system, and the design of data structures to take advantage of serial, coarse grain parallel and massively parallel computer architectures, are considered in detail. We put forward arguments as to why integration of the system within a sparse simplex solver is important and outline how the system is designed to achieve this integration.

CiteSeerX

Brunel University Research Archive

Properties of approximate inverses and adaptive control concepts for preconditioning [online]

Author: Koschinski Claus
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/1999

KITopen

A Parallel Solver for Graph Laplacians

Author: Boman Erik G.
Brannick James
Kepner Jeremy
Napov Artem
Ruge John W.
Spielman Daniel A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/07/2018

Problems from graph drawing, spectral clustering, network flow and graph partitioning can all be expressed in terms of graph Laplacian matrices. There are a variety of practical approaches to solving these problems in serial. However, as problem sizes increase and single core speeds stagnate, parallelism is essential to solve such problems quickly. We present an unsmoothed aggregation multigrid method for solving graph Laplacians in a distributed memory setting. We introduce new parallel aggregation and low degree elimination algorithms targeted specifically at irregular degree graphs. These algorithms are expressed in terms of sparse matrix-vector products using generalized sum and product operations. This formulation is amenable to linear algebra using arbitrary distributions and allows us to operate on a 2D sparse matrix distribution, which is necessary for parallel scalability. Our solver outperforms the natural parallel extension of the current state of the art in an algorithmic comparison. We demonstrate scalability to 576 processes and graphs with up to 1.7 billion edges. Comment: PASC '18, Code: https://github.com/ligmg/ligm
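The phrase "sparse matrix-vector products using generalized sum and product operations" can be made concrete with a small sketch. This is not the paper's aggregation or elimination algorithm: the helper names `laplacian_rows` and `semiring_spmv`, the dict-of-rows storage, and the three-vertex path graph are assumptions chosen for illustration. The same product routine is run once with ordinary (+, ×) on the graph Laplacian and once with (max, ×) on the adjacency matrix, where each vertex picks up the largest score among its neighbours.

```python
import operator


def laplacian_rows(n, edges):
    """Build the graph Laplacian L = D - A as a list of {column: value} rows."""
    rows = [dict() for _ in range(n)]
    for u, v, w in edges:
        rows[u][v] = rows[u].get(v, 0.0) - w
        rows[v][u] = rows[v].get(u, 0.0) - w
        rows[u][u] = rows[u].get(u, 0.0) + w
        rows[v][v] = rows[v].get(v, 0.0) + w
    return rows


def semiring_spmv(rows, x, add, mul, zero):
    """y[i] = add-reduction over j of mul(A[i][j], x[j]), starting from zero."""
    y = []
    for row in rows:
        acc = zero
        for j, a in row.items():
            acc = add(acc, mul(a, x[j]))
        y.append(acc)
    return y


# Hypothetical example: path graph 0 - 1 - 2 with unit edge weights.
edges = [(0, 1, 1.0), (1, 2, 1.0)]
lap = laplacian_rows(3, edges)
adj = [{1: 1.0}, {0: 1.0, 2: 1.0}, {1: 1.0}]

# Ordinary arithmetic: the constant vector lies in the Laplacian's null space.
null_check = semiring_spmv(lap, [1.0, 1.0, 1.0], operator.add, operator.mul, 0.0)

# (max, x) semiring: largest neighbour score for every vertex.
scores = [3.0, 1.0, 2.0]
best = semiring_spmv(adj, scores, max, operator.mul, float("-inf"))

print(null_check)  # [0.0, 0.0, 0.0]
print(best)        # [1.0, 3.0, 1.0]
```

Keeping the reduction and the product as parameters means one distributed product kernel could serve several graph operations, which is the design point the abstract alludes to.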

arXiv.org e-Print Archive

Crossref

A Novel Partitioning Method for Accelerating the Block Cimmino Algorithm

Author: Aykanat Cevdet
Manguoglu Murat
Torun F. Sukru
Publication venue
Publication date: 01/01/2018

We propose a novel block-row partitioning method in order to improve the convergence rate of the block Cimmino algorithm for solving general sparse linear systems of equations. The convergence rate of the block Cimmino algorithm depends on the orthogonality among the block rows obtained by the partitioning method. The proposed method takes numerical orthogonality among block rows into account by proposing a row inner-product graph model of the coefficient matrix. In the graph partitioning formulation defined on this graph model, the partitioning objective of minimizing the cutsize directly corresponds to minimizing the sum of inter-block inner products between block rows, thus leading to an improvement in the eigenvalue spectrum of the iteration matrix. This in turn leads to a significant reduction in the number of iterations required for convergence. Extensive experiments conducted on a large set of matrices confirm the validity of the proposed method against a state-of-the-art method.
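The partitioning objective in this abstract can be sketched on a toy matrix. The code below is only an illustration, not the authors' partitioner: the function name `inter_block_inner_products`, the use of the absolute value of the row inner product as the edge weight, and the 4x4 example are assumptions. It evaluates the cutsize of the row inner-product graph for two candidate block-row partitions.

```python
def inter_block_inner_products(A, part):
    """Sum of |a_i . a_j| over all row pairs assigned to different blocks.

    A is a dense list of rows; part[i] is the block id of row i.
    The absolute-value edge weight is an assumption of this sketch.
    """
    total = 0.0
    m = len(A)
    for i in range(m):
        for j in range(i + 1, m):
            if part[i] != part[j]:
                total += abs(sum(p * q for p, q in zip(A[i], A[j])))
    return total


# Hypothetical example: rows 0,1 share columns 0-1 and rows 2,3 share columns 2-3.
A = [
    [1, 1, 0, 0],
    [1, -2, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 2, 1],
]

good = inter_block_inner_products(A, [0, 0, 1, 1])  # coupled rows kept together
bad = inter_block_inner_products(A, [0, 1, 0, 1])   # coupled rows split apart
print(good)  # 0.0
print(bad)   # 4.0
```

A cutsize of zero means the two block rows are mutually orthogonal, the situation the abstract identifies as most favourable for block Cimmino convergence.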

arXiv.org e-Print Archive

OpenMETU (Middle East Technical University)

An Arnoldi-frontal approach for the stability analysis of flows in a collapsible channel

Author: Cai Zongxi
Hao Yujue
Luo Xiaoyu
Roper Steven
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/09/2016

In this paper, we present a new approach based on a combination of the Arnoldi and frontal methods for solving large sparse asymmetric and generalized complex eigenvalue problems. The new eigensolver seeks the most unstable eigensolution in the Krylov subspace and makes use of the efficiency of the frontal solver developed for finite element methods. The approach is used for a stability analysis of flows in a collapsible channel and is found to significantly improve the computational efficiency compared to the traditionally used QZ solver or a standard Arnoldi method. With the new approach, we are able to validate the previous results obtained either on a much coarser mesh or estimated from unsteady simulations. New neutral stability solutions of the system have been obtained which are beyond the limits of previously used methods.

Crossref

Enlighten