Search CORE

11,185 research outputs found

Lecture 10: Preconditioned Iterative Methods for Linear Systems

Author: Chow Edmond
Publication venue: ScholarWorks@UARK
Publication date: 05/04/2021
Field of study

Iterative methods for the solution of linear systems of equations – such as stationary, semi-iterative, and Krylov subspace methods – are classical methods taught in numerical analysis courses, but adapting these methods to run efficiently at large-scale on high-performance computers is challenging and a constantly evolving topic. Preconditioners – necessary to aid the convergence of iterative methods – come in many forms, from algebraic to physics-based, are regularly being developed for linear systems from different classes of problems, and similarly are evolving with high-performance computers. This lecture will cover the background and some recent developments on iterative methods and preconditioning in the context of high-performance parallel computers. Topics include asynchronous iterative methods that avoid the potentially high synchronization cost where there are very large numbers of computational threads, parallel sparse approximate inverse preconditioners, parallel incomplete factorization preconditioners and sparse triangular solvers, and preconditioning with hierarchical rank-structured matrices for kernel matrix equations

ScholarWorks@UARK

Geometry-Oblivious FMM for Compressing Dense SPD Matrices

Author: Biros George
Levitt James
Reiz Severin
Yu Chenhan D.
Publication venue
Publication date: 01/07/2017
Field of study

We present GOFMM (geometry-oblivious FMM), a novel method that creates a hierarchical low-rank approximation, "compression," of an arbitrary dense symmetric positive definite (SPD) matrix. For many applications, GOFMM enables an approximate matrix-vector multiplication in

N \log N

or even

N

time, where

N

is the matrix size. Compression requires

N \log N

storage and work. In general, our scheme belongs to the family of hierarchical matrix approximation methods. In particular, it generalizes the fast multipole method (FMM) to a purely algebraic setting by only requiring the ability to sample matrix entries. Neither geometric information (i.e., point coordinates) nor knowledge of how the matrix entries have been generated is required, thus the term "geometry-oblivious." Also, we introduce a shared-memory parallel scheme for hierarchical matrix computations that reduces synchronization barriers. We present results on the Intel Knights Landing and Haswell architectures, and on the NVIDIA Pascal architecture for a variety of matrices.Comment: 13 pages, accepted by SC'1

arXiv.org e-Print Archive

Crossref

Adapting the interior point method for the solution of linear programs on high performance computers

Author: Ashcroft
Bixby
Chen
Duff
Forrest
Gay
George
Golub
Karmarkar
Lai
Liu
Megiddo
Monteiro
Publication venue: Brunel University
Publication date: 01/01/1991
Field of study

In this paper we describe a unified algorithmic framework for the interior point method (IPM) of solving Linear Programs (LPs) which allows us to adapt it over a range of high performance computer architectures. We set out the reasons as to why IPM makes better use of high performance computer architecture than the sparse simplex method. In the inner iteration of the IPM a search direction is computed using Newton or higher order methods. Computationally this involves solving a sparse symmetric positive definite (SSPD) system of equations. The choice of direct and indirect methods for the solution of this system and the design of data structures to take advantage of coarse grain parallel and massively parallel computer architectures are considered in detail. Finally, we present experimental results of solving NETLIB test problems on examples of these architectures and put forward arguments as to why integration of the system within sparse simplex is beneficial

CiteSeerX

Crossref

Brunel University Research Archive

Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis

Author: Cheshmi Kazem
Dehnavi Maryam Mehri
Kamil Shoaib
Strout Michelle Mills
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/05/2017
Field of study

Sympiler is a domain-specific code generator that optimizes sparse matrix computations by decoupling the symbolic analysis phase from the numerical manipulation stage in sparse codes. The computation patterns in sparse numerical methods are guided by the input sparsity structure and the sparse algorithm itself. In many real-world simulations, the sparsity pattern changes little or not at all. Sympiler takes advantage of these properties to symbolically analyze sparse codes at compile-time and to apply inspector-guided transformations that enable applying low-level transformations to sparse codes. As a result, the Sympiler-generated code outperforms highly-optimized matrix factorization codes from commonly-used specialized libraries, obtaining average speedups over Eigen and CHOLMOD of 3.8X and 1.5X respectively.Comment: 12 page

arXiv.org e-Print Archive

Crossref