
970 research outputs found

Parallel Factorizations in Numerical Analysis

Author: Amodio Pierluigi
Brugnano Luigi
Publication date: 01/01/2009

In this paper we review the parallel solution of sparse linear systems, usually arising from the discretization of ODE-IVPs or ODE-BVPs. The approach is based on the concept of parallel factorization of a (block) tridiagonal matrix. This makes it possible to obtain efficient parallel extensions of many known matrix factorizations and to derive, as a by-product, a unifying approach to the parallel solution of ODEs. Comment: 15 pages, 5 figures

arXiv.org e-Print Archive

CiteSeerX

Archivio istituzionale della ricerca - Università di Bari

A Direct Elliptic Solver Based on Hierarchically Low-rank Schur Complements

Author: A. Aminfar
B.L. Buzbee
I. Ibragimov
J. Xia
L. Grasedyck
P. Amestoy
P. Swarztrauber
P.G. Schmitz
R.W. Hockney
S. Ambikasaran
S. Chandrasekaran
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/04/2016

A parallel fast direct solver for rank-compressible block tridiagonal linear systems is presented. Algorithmic synergies between Cyclic Reduction and Hierarchical matrix arithmetic operations result in a solver with $O(N \log^2 N)$ arithmetic complexity and $O(N \log N)$ memory footprint. We provide a baseline for performance and applicability by comparing with well known implementations of the $\mathcal{H}$-LU factorization and algebraic multigrid with a parallel implementation that leverages the concurrency features of the method. Numerical experiments reveal that this method is comparable with other fast direct solvers based on Hierarchical Matrices such as $\mathcal{H}$-LU and that it can tackle problems where algebraic multigrid fails to converge.
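For readers unfamiliar with Cyclic Reduction, the following is a minimal sketch of the idea for a scalar tridiagonal system. It is not the solver described in the abstract, which works on blocks with hierarchical matrix arithmetic. The function names `cyclic_reduction` and `tridiag_matvec` are made up for this illustration, and the sketch assumes a diagonally dominant system (no pivoting) with `a[0] == 0` and `c[-1] == 0`.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by cyclic reduction.

    a, b, c are the sub-, main and super-diagonal, d the right-hand side.
    Assumes a[0] == 0, c[-1] == 0 and that no pivoting is needed.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    # Pad with a trivial equation so that index i + 1 is always valid.
    a, b = list(a) + [0.0], list(b) + [1.0]
    c, d = list(c) + [0.0], list(d) + [0.0]
    ra, rb, rc, rd = [], [], [], []
    # Eliminate the even-indexed unknowns from every odd-indexed equation.
    # The iterations are independent, which is the source of parallelism.
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1]
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - gamma * a[i + 1])
        rc.append(-gamma * c[i + 1])
        rd.append(d[i] - alpha * d[i - 1] - gamma * d[i + 1])
    # x[i + 1] stores unknown i; the zeros at both ends act as padding.
    x = [0.0] * (n + 2)
    x[2:n + 1:2] = cyclic_reduction(ra, rb, rc, rd)  # half-size system
    # Back-substitute the even-indexed unknowns (again independent).
    for i in range(0, n, 2):
        x[i + 1] = (d[i] - a[i] * x[i] - c[i] * x[i + 2]) / b[i]
    return x[1:n + 1]


def tridiag_matvec(a, b, c, x):
    """Multiply the tridiagonal matrix (a, b, c) by the vector x."""
    n = len(b)
    return [
        (a[i] * x[i - 1] if i > 0 else 0.0)
        + b[i] * x[i]
        + (c[i] * x[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]
```

Each level halves the number of unknowns, so the recursion depth is about log2(n). In the block setting of the abstract, the scalar divisions would become block solves.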

arXiv.org e-Print Archive

Crossref

Differential qd algorithm with shifts for rank-structured matrices

Author: Zhlobich Pavel
Publication date: 01/01/2012

Although QR iterations dominate in eigenvalue computations, there are several important cases when alternative LR-type algorithms may be preferable. In particular, in the symmetric tridiagonal case the differential qd algorithm with shifts (dqds) proposed by Fernando and Parlett often enjoys faster convergence while preserving high relative accuracy (which is not guaranteed in the QR algorithm). In eigenvalue computations for rank-structured matrices the QR algorithm is also a popular choice since, in the symmetric case, the rank structure is preserved. In the unsymmetric case, however, the QR algorithm destroys the rank structure and, hence, LR-type algorithms come into play once again. In the current paper we discover several variants of qd algorithms for quasiseparable matrices. Remarkably, one of them, when applied to Hessenberg matrices, becomes a direct generalization of the dqds algorithm for tridiagonal matrices. Therefore, it can be applied to such important matrices as companion and confederate matrices, and provides an alternative algorithm for finding roots of a polynomial represented in a basis of orthogonal polynomials. Results of preliminary numerical experiments are presented.
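As background for the abstract above, here is a minimal sketch of one step of the classical tridiagonal dqds transform of Fernando and Parlett. It does not implement the quasiseparable variants the paper introduces. The function name `dqds_step` is chosen for this illustration, and the sketch assumes positive `q` and `e` arrays describing the symmetric tridiagonal matrix with diagonal entries `q[i] + e[i-1]` and squared off-diagonal entries `q[i] * e[i]`.

```python
def dqds_step(q, e, shift=0.0):
    """Apply one dqds transform and return the new (q_hat, e_hat).

    q has length n and e has length n - 1. With shift s, the spectrum of
    the represented tridiagonal matrix is moved by -s.
    """
    n = len(q)
    q_hat = [0.0] * n
    e_hat = [0.0] * (n - 1)
    d = q[0] - shift
    for i in range(n - 1):
        q_hat[i] = d + e[i]
        t = q[i + 1] / q_hat[i]
        e_hat[i] = e[i] * t      # formed by multiplication, no subtraction
        d = d * t - shift
    q_hat[n - 1] = d
    return q_hat, e_hat


# Zero-shift iteration on a 2x2 example: q = [4, 1], e = [1] represents
# the matrix [[4, 2], [2, 2]], whose eigenvalues are 3 + sqrt(5) and 3 - sqrt(5).
q, e = [4.0, 1.0], [1.0]
for _ in range(60):
    q, e = dqds_step(q, e)
```

With zero shift the entries of `e` decay towards zero and the entries of `q` approach the eigenvalues. In practice a shift strategy is used to speed this up, which is omitted here.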

arXiv.org e-Print Archive

CiteSeerX

A new approximate matrix factorization for implicit time integration in air pollution modeling

Author: Botchev M.A.
Verwer J.G.
Publication venue: Elsevier
Publication date: 01/01/2003

Implicit time stepping typically requires solution of one or several linear systems with a matrix I−τJ per time step, where J is the Jacobian matrix. If solution of these systems is expensive, replacing I−τJ with its approximate matrix factorization (AMF) (I−τR)(I−τV), R+V=J, often leads to a good compromise between stability and accuracy of the time integration on the one hand and its efficiency on the other hand. For example, in air pollution modeling, AMF has been successfully used in the framework of Rosenbrock schemes. The standard AMF gives an approximation to I−τJ with the error τ²RV, which can be significant in norm. In this paper we propose a new AMF. Under the assumption that −V is an M-matrix, the error of the new AMF can be shown to have an upper bound τ||R||, while still being asymptotically $O(\tau^2)$. This new AMF, called AMF+, is equal in cost to standard AMF and, as both analysis and numerical experiments reveal, provides better accuracy. We also report on our experience with another, cheaper AMF and with AMF-preconditioned GMRES.
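The error term of the standard AMF quoted in the abstract follows from expanding the product, using only the splitting R+V=J stated there:

$$(I-\tau R)(I-\tau V) = I - \tau(R+V) + \tau^2 RV = (I-\tau J) + \tau^2 RV.$$

The factorized matrix therefore differs from I−τJ by exactly τ²RV, which is formally second order in τ but can still be large when the norm of RV is large.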

CiteSeerX

Elsevier - Publisher Connector

University of Twente Research Information

International Migration, Integration and Social Cohesion online publications

Improving approximate matrix factorizations for implicit time integration in air pollution modelling

Author: Botchev M.A.
Verwer J.G.
Publication venue: CWI (Centrum voor Wiskunde en Informatica)
Publication date: 01/01/2000

For a long time operator splitting was the only computationally feasible way of implicit time integration in large scale Air Pollution Models. A recently proposed attractive alternative is Rosenbrock schemes combined with Approximate Matrix Factorization (AMF). With AMF, linear systems arising in implicit time stepping are solved approximately in such a way that the overall computational costs per time step are not higher than those of splitting methods. We propose and discuss two new variants of AMF. The first one is aimed at a further reduction of costs compared with conventional AMF. The second variant of AMF provides, in certain circumstances, a better approximation to the inverse of the linear system matrix than standard AMF and requires the same computational work.

CiteSeerX

CWI's Institutional Repository

University of Twente Research Information

Minimizing Communication for Eigenproblems and the Singular Value Decomposition

Author: Ballard Grey
Demmel James
Dumitriu Ioana
Publication date: 01/01/2010

Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In [BDHS10] lower bounds were presented on the amount of communication required for essentially all $O(n^3)$-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs. Comment: 43 pages, 11 figures

arXiv.org e-Print Archive

CiteSeerX

Parallelization of implicit finite difference schemes in computational fluid dynamics

Author: Decker Naomi H.
Naik Vijay K.
Nicoules Michel

Implicit finite difference schemes are often the preferred numerical schemes in computational fluid dynamics, requiring less stringent stability bounds than the explicit schemes. Each iteration in an implicit scheme involves global data dependencies in the form of second and higher order recurrences. Efficient parallel implementations of such iterative methods are considerably more difficult and non-intuitive. The parallelization of the implicit schemes that are used for solving the Euler and the thin layer Navier-Stokes equations and that require inversions of large linear systems in the form of block tri-diagonal and/or block penta-diagonal matrices is discussed. Three-dimensional cases are emphasized and schemes that minimize the total execution time are presented. Partitioning and scheduling schemes for alleviating the effects of the global data dependencies are described. An analysis of the communication and the computation aspects of these methods is presented. The effect of the boundary conditions on the parallel schemes is also discussed

NASA Technical Reports Server

LU factorization with panel rank revealing pivoting and its communication avoiding version

Author: Demmel James W.
Grigori Laura
Gu Ming
Khabou Amal
Publication date: 01/01/2012

We present the LU decomposition with panel rank revealing pivoting (LU_PRRP), an LU factorization algorithm based on strong rank revealing QR panel factorization. LU_PRRP is more stable than Gaussian elimination with partial pivoting (GEPP). Our extensive numerical experiments show that the new factorization scheme is as numerically stable as GEPP in practice, but it is more resistant to pathological cases and easily solves the Wilkinson matrix and the Foster matrix. We also present CALU_PRRP, a communication avoiding version of LU_PRRP that minimizes communication. CALU_PRRP is based on tournament pivoting, with the selection of the pivots at each step of the tournament being performed via strong rank revealing QR factorization. CALU_PRRP is more stable than CALU, the communication avoiding version of GEPP. CALU_PRRP is also more stable in practice and is resistant to pathological cases on which GEPP and CALU fail. Comment: No. RR-7867 (2012)
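To make the "pathological cases" mentioned above concrete, the sketch below runs plain GEPP on the Wilkinson matrix and measures element growth. It only illustrates why this matrix is hard for GEPP and does not implement LU_PRRP or CALU_PRRP. The helper names `wilkinson` and `gepp_growth` are made up for this illustration, and growth is measured here as the largest entry of the final U divided by the largest entry of A.

```python
def wilkinson(n):
    """Wilkinson's matrix: 1 on the diagonal and in the last column,
    -1 below the diagonal, 0 elsewhere."""
    return [
        [1.0 if (i == j or j == n - 1) else (-1.0 if i > j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def gepp_growth(A):
    """Run Gaussian elimination with partial pivoting on a copy of A and
    return max|U| / max|A|."""
    U = [row[:] for row in A]
    n = len(U)
    max_a = max(abs(v) for row in U for v in row)
    for k in range(n - 1):
        # Partial pivoting: pick the largest entry in column k (first on ties).
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        U[k], U[p] = U[p], U[k]
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return max(abs(v) for row in U for v in row) / max_a


growth = gepp_growth(wilkinson(10))
```

On this matrix the last column doubles at every elimination step, so the growth equals 2^(n-1), which is 512 for n = 10.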

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

HAL - Lille 3

INRIA a CCSD electronic archive server

MIMS EPrints

Hal-Diderot

HAL-Rennes 1