62,070 research outputs found
Applications and accuracy of the parallel diagonal dominant algorithm
The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First the PDD algorithm is introduced. Then the algorithm is extended to solve periodic tridiagonal systems. A variant, the reduced PDD algorithm, is also proposed. Accuracy analysis is provided for a class of tridiagonal systems, the symmetric and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error, and that the algorithm is a good candidate for the emerging massively parallel machines.
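For orientation, the sequential baseline that a parallel tridiagonal solver is usually compared against is the Thomas algorithm. The sketch below is not the PDD algorithm; it is a minimal pure-Python illustration, applied to a symmetric, diagonally dominant Toeplitz tridiagonal system of the kind the abstract analyses. The function names and the test values (1, 4, 1) are assumptions chosen for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (no pivoting).

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    Skipping pivoting is safe for diagonally dominant systems.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def toeplitz_tridiagonal_matvec(a, b, c, x):
    """Multiply the Toeplitz tridiagonal matrix tridiag(a, b, c) by x."""
    n = len(x)
    y = []
    for i in range(n):
        value = b * x[i]
        if i > 0:
            value += a * x[i - 1]
        if i < n - 1:
            value += c * x[i + 1]
        y.append(value)
    return y


# Hypothetical test problem: tridiag(1, 4, 1) with a known solution.
n = 8
a, b, c = 1.0, 4.0, 1.0
x_true = [float(i + 1) for i in range(n)]
rhs = toeplitz_tridiagonal_matvec(a, b, c, x_true)
x = solve_tridiagonal([a] * n, [b] * n, [c] * n, rhs)
max_error = max(abs(u - v) for u, v in zip(x, x_true))
```

The forward sweep carries a dependency from each row to the next, which is why a method like this does not parallelize directly and why partitioned schemes are studied instead.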
A communication-less parallel algorithm for tridiagonal Toeplitz systems
Diagonally dominant tridiagonal Toeplitz systems of linear equations arise in many application areas and have been well studied in the past. Modern interest in numerical linear algebra often focuses on solving classic problems in parallel. In McNally [Fast parallel algorithms for tri-diagonal symmetric Toeplitz systems, MCS Thesis, University of New Brunswick, Saint John, 1999], an m-processor Split & Correct algorithm was presented for approximating the solution to a symmetric tridiagonal Toeplitz linear system of equations. Nemani [Perturbation methods for circulant-banded systems and their parallel implementation, Ph.D. Thesis, University of New Brunswick, Saint John, 2001] and McNally (2003) adapted the works of Rojo [A new method for solving symmetric circulant tri-diagonal system of linear equations, Comput. Math. Appl. 20 (1990) 61–67], Yan and Chung [A fast algorithm for solving special tri-diagonal systems, Computing 52 (1994) 203–211] and McNally et al. [A split-correct parallel algorithm for solving tri-diagonal symmetric Toeplitz systems, Internat. J. Comput. Math. 75 (2000) 303–313] to the non-symmetric case. In this paper we present relevant background from these methods and then introduce an m-processor scalable communication-less approximation algorithm for solving a diagonally dominant tridiagonal Toeplitz system of linear equations.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high order accuracy in numerical discretization.
The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
An Efficient Parallel Solver for SDD Linear Systems
We present the first parallel algorithm for solving systems of linear
equations in symmetric, diagonally dominant (SDD) matrices that runs in
polylogarithmic time and nearly-linear work. The heart of our algorithm is a
construction of a sparse approximate inverse chain for the input matrix: a
sequence of sparse matrices whose product approximates its inverse. Whereas
other fast algorithms for solving systems of equations in SDD matrices exploit
low-stretch spanning trees, our algorithm only requires spectral graph
sparsifiers.
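As a concrete reading of the matrix class named in this abstract, the sketch below tests whether a dense matrix is symmetric and diagonally dominant under the common definition: the matrix is symmetric and each diagonal entry is at least the sum of the absolute values of the other entries in its row. The helper name `is_sdd`, the list-of-lists representation, and the `tol` parameter are assumptions for illustration; the abstract does not spell out a definition and this is not part of the solver it describes.

```python
def is_sdd(matrix, tol=0.0):
    """Return True if `matrix` (a list of rows) is symmetric and
    weakly diagonally dominant: A[i][i] >= sum_{j != i} |A[i][j]|."""
    n = len(matrix)
    # The matrix must be square before symmetry can be checked.
    if any(len(row) != n for row in matrix):
        return False
    for i in range(n):
        off_diagonal_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            # Symmetry check, up to the given tolerance.
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
            off_diagonal_sum += abs(matrix[i][j])
        # Diagonal dominance check for row i.
        if matrix[i][i] < off_diagonal_sum - tol:
            return False
    return True


# The Laplacian of a 3-vertex path graph is a standard SDD example.
path_laplacian = [
    [1.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 1.0],
]
laplacian_is_sdd = is_sdd(path_laplacian)
```

Graph Laplacians satisfy the dominance condition with equality in every row, which is why they are the canonical members of this class.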
An Efficient Parallel Algorithm for Spectral Sparsification of Laplacian and SDDM Matrix Polynomials
For a "large" class of continuous probability density functions
(p.d.f.), we demonstrate that for every there is a mixture of
discrete Binomial distributions (MDBD) with
distinct Binomial distributions that -approximates a
discretized p.d.f. for all , where
. Also, we give two efficient parallel
algorithms to find such MDBD.
Moreover, we propose a sequential algorithm that on input MDBD with
for that induces a discretized p.d.f. ,
that is either Laplacian or SDDM matrix and parameter ,
outputs in time a spectral
sparsifier of a matrix-polynomial, where
notation hides factors.
This improves on the algorithm of Cheng et al. [CCLPT15], whose run time is
.
Furthermore, our algorithm is parallelizable and runs in work
and depth . Our main algorithmic contribution is to
propose the first efficient parallel algorithm that on input continuous p.d.f.
, matrix as above, outputs a spectral sparsifier of
matrix-polynomial whose coefficients approximate component-wise the discretized
p.d.f. .
Our results yield the first efficient and parallel algorithm that runs in
nearly linear work and poly-logarithmic depth and analyzes the long term
behaviour of Markov chains in non-trivial settings. In addition, we strengthen
Spielman and Peng's [PS14] parallel SDD solver.
Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards
We discuss an approach for solving sparse or dense banded linear systems
on a Graphics Processing Unit (GPU) card. The
matrix is possibly nonsymmetric and
moderately large; i.e., . The ${\it split\ and\ parallelize}$ […] ${\tt SaP}$ […] ${\bf A}$ […] ${\bf A}_i$ […] $i=1,\ldots,P$ […] ${\bf A}_i$ […] ${\tt SaP::GPU}$ […] ${\tt PARDISO}$ […] ${\tt SuperLU}$ […] ${\tt MUMPS}$ […] ${\tt SaP::GPU}$ […] ${\tt MKL}$ […] ${\tt SaP::GPU}$ […] ${\tt SaP::GPU}$ is publicly available and distributed as
open source under a permissive BSD3 license. Comment: 38 pages