Differential qd algorithm with shifts for rank-structured matrices

Author: Zhlobich Pavel
Publication date: 01/01/2012

Although QR iterations dominate eigenvalue computations, there are several important cases in which alternative LR-type algorithms may be preferable. One is the symmetric tridiagonal case, where the differential qd algorithm with shifts (dqds) proposed by Fernando and Parlett often converges faster while preserving high relative accuracy, which the QR algorithm does not guarantee. In eigenvalue computations for rank-structured matrices, the QR algorithm is also a popular choice because, in the symmetric case, it preserves the rank structure. In the unsymmetric case, however, the QR algorithm destroys the rank structure, and hence LR-type algorithms come into play once again. In the current paper we discover several variants of qd algorithms for quasiseparable matrices. Remarkably, one of them, when applied to Hessenberg matrices, becomes a direct generalization of the dqds algorithm for tridiagonal matrices. It can therefore be applied to such important matrices as companion and confederate matrices, and it provides an alternative algorithm for finding the roots of a polynomial represented in a basis of orthogonal polynomials. Results of preliminary numerical experiments are presented.
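For orientation, the classical tridiagonal dqds transform that the abstract refers to can be sketched as follows. This is a minimal textbook-style sketch, not the quasiseparable or Hessenberg variant developed in the paper. The function names `dqds_step` and `dqds_eigenvalues` are hypothetical, and the sketch assumes the matrix is given in factored form J = LU, with L unit lower bidiagonal (subdiagonal `e`) and U upper bidiagonal (diagonal `q`, unit superdiagonal).

```python
def dqds_step(q, e, shift=0.0):
    """One dqds transform (q, e) -> (q_hat, e_hat).

    Assumes J = L*U with L unit lower bidiagonal (subdiagonal e) and
    U upper bidiagonal (diagonal q, superdiagonal of ones). The new
    factors represent a matrix similar to J - shift*I.
    """
    n = len(q)
    q_hat = [0.0] * n
    e_hat = [0.0] * (n - 1)
    d = q[0] - shift
    for k in range(n - 1):
        q_hat[k] = d + e[k]
        t = q[k + 1] / q_hat[k]   # no pivoting: assumes q_hat[k] != 0
        e_hat[k] = e[k] * t
        d = d * t - shift
    q_hat[n - 1] = d
    return q_hat, e_hat


def dqds_eigenvalues(q, e, steps=200):
    """Repeat zero-shift steps; for positive q and e the entries of e
    decay and q tends to the eigenvalues in decreasing order."""
    for _ in range(steps):
        q, e = dqds_step(q, e)
    return q


if __name__ == "__main__":
    # J = [[4, 1], [4, 4]] has trace 8 and determinant 12.
    print(dqds_eigenvalues([4.0, 3.0], [1.0]))
```

In this small example the returned values approach 6 and 2, the two eigenvalues of J. Zero shifts are used only to keep the sketch short; practical implementations choose shifts to accelerate convergence and add deflation.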

arXiv.org e-Print Archive

CiteSeerX

Analysis of a Classical Matrix Preconditioning Algorithm

Author: Chen T.-Y.
Kressner D.
Trefethen L. N.
Publication date: 01/06/2015

We study a classical iterative algorithm for balancing matrices in the $L_\infty$ norm via a scaling transformation. This algorithm, which goes back to Osborne and Parlett & Reinsch in the 1960s, is implemented as a standard preconditioner in many numerical linear algebra packages. Surprisingly, despite its widespread use over several decades, no bounds were known on its rate of convergence. In this paper we prove that, for any irreducible $n\times n$ (real or complex) input matrix $A$, a natural variant of the algorithm converges in $O(n^3\log(n\rho/\varepsilon))$ elementary balancing operations, where $\rho$ measures the initial imbalance of $A$ and $\varepsilon$ is the target imbalance of the output matrix. (The imbalance of $A$ is $\max_i |\log(a_i^{\text{out}}/a_i^{\text{in}})|$, where $a_i^{\text{out}}, a_i^{\text{in}}$ are the maximum entries in magnitude in the $i$th row and column respectively.) This bound is tight up to the $\log n$ factor. A balancing operation scales the $i$th row and column so that their maximum entries are equal, and requires $O(m/n)$ arithmetic operations on average, where $m$ is the number of non-zero elements in $A$. Thus the running time of the iterative algorithm is $\tilde{O}(n^2m)$. This is the first time bound of any kind on any variant of the Osborne-Parlett-Reinsch algorithm. We also prove a conjecture of Chen that characterizes those matrices for which the limit of the balancing process is independent of the order in which balancing operations are performed.

Comment: The previous version (1) (see also STOC'15) handled UB ("unique balance") input matrices. In this version (2) we extend the work to handle all input matrices.
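The balancing operation and the imbalance measure described in this abstract can be illustrated with a short sketch. It rests on several assumptions that are not stated in the abstract: diagonal entries are excluded from the row and column maxima, indices are visited in simple round-robin sweeps (which is not necessarily the "natural variant" the paper analyses), and the names `imbalance`, `balance_index` and `balance` are hypothetical.

```python
import math


def row_col_max(a, i):
    """Largest off-diagonal magnitude in row i (out) and column i (in)."""
    n = len(a)
    out = max((abs(a[i][j]) for j in range(n) if j != i), default=0.0)
    inn = max((abs(a[j][i]) for j in range(n) if j != i), default=0.0)
    return out, inn


def imbalance(a):
    """max_i |log(a_i_out / a_i_in)|, skipping indices with a zero maximum."""
    worst = 0.0
    for i in range(len(a)):
        out, inn = row_col_max(a, i)
        if out > 0 and inn > 0:
            worst = max(worst, abs(math.log(out / inn)))
    return worst


def balance_index(a, i):
    """Scale row i by f and column i by 1/f so both maxima become equal.

    This is a diagonal similarity transform, so eigenvalues and the
    diagonal entries are unchanged.
    """
    out, inn = row_col_max(a, i)
    if out == 0 or inn == 0:
        return
    f = math.sqrt(inn / out)
    for j in range(len(a)):
        if j != i:
            a[i][j] *= f
            a[j][i] /= f


def balance(a, eps=1e-9, max_sweeps=1000):
    """Round-robin sweeps (an assumption) until the imbalance is at most eps."""
    for _ in range(max_sweeps):
        if imbalance(a) <= eps:
            break
        for i in range(len(a)):
            balance_index(a, i)
    return a


if __name__ == "__main__":
    m = [[1.0, 100.0], [0.01, 1.0]]
    print(imbalance(m))
    print(balance(m), imbalance(m))
```

For the 2-by-2 example a single balancing operation on index 0 already equalises the off-diagonal entries, while the diagonal and the product of the two off-diagonal entries are preserved.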

arXiv.org e-Print Archive

Crossref

Caltech Authors

Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach

Author: Bientinesi Paolo
Petschow Matthias
Quintana-Orti Enrique
Publication date: 01/01/2013

The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR) is among the fastest methods. Although fast, the solvers based on MRRR do not deliver the same accuracy as competing methods like Divide & Conquer or the QR algorithm. In this paper, we demonstrate that the use of mixed precisions leads to improved accuracy of MRRR-based eigensolvers with limited or no performance penalty. As a result, we obtain eigensolvers that are not only as accurate as or more accurate than the best available methods, but also, in most circumstances, faster and more scalable than the competition.

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Publikationsserver der RWTH Aachen University

Achieving Efficient Strong Scaling with PETSc using Hybrid MPI/OpenMP Optimisation

Author: G. Goumas
G. Schubert
G. Wellein
M. Butler
M.D. Piggott
N. Bell
P. Balaji
S. Williams
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The increasing number of processing elements and the decreasing memory-to-core ratio in modern high-performance platforms make efficient strong scaling a key requirement for numerical algorithms. In order to achieve efficient scalability on massively parallel systems, scientific software must evolve across the entire stack to exploit the multiple levels of parallelism exposed in modern architectures. In this paper we demonstrate the use of hybrid MPI/OpenMP parallelisation to optimise parallel sparse matrix-vector multiplication in PETSc, a widely used scientific library for the scalable solution of partial differential equations. Using large matrices generated by Fluidity, an open source CFD application code which uses PETSc as its linear solver engine, we evaluate the effect of explicit communication overlap using task-based parallelism and show how to further improve performance by explicitly load balancing threads within MPI processes. We demonstrate a significant speedup over the pure-MPI mode and efficient strong scaling of sparse matrix-vector multiplication on Fujitsu PRIMEHPC FX10 and Cray XE6 systems.
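As a rough illustration of the two ideas named in this abstract, sparse matrix-vector multiplication and load balancing of threads, the sketch below multiplies a matrix in compressed sparse row (CSR) form and splits its rows into chunks with roughly equal numbers of non-zeros. It is serial, pure Python, and does not use PETSc, MPI or OpenMP. The function names and the choice to balance by non-zero count are assumptions made for illustration, not the scheme described in the paper.

```python
from bisect import bisect_left


def csr_matvec(indptr, indices, data, x, row_start=0, row_end=None):
    """Compute rows [row_start, row_end) of y = A @ x for A in CSR form."""
    if row_end is None:
        row_end = len(indptr) - 1
    y = []
    for r in range(row_start, row_end):
        acc = 0.0
        for k in range(indptr[r], indptr[r + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y


def split_rows_by_nnz(indptr, parts):
    """Row boundaries giving each part a similar number of non-zeros.

    Hypothetical helper: each chunk could be handed to one thread.
    """
    n_rows = len(indptr) - 1
    nnz = indptr[-1]
    bounds = [0]
    for p in range(1, parts):
        target = nnz * p / parts
        # First row boundary whose cumulative non-zero count reaches target.
        b = min(bisect_left(indptr, target), n_rows)
        bounds.append(max(b, bounds[-1]))
    bounds.append(n_rows)
    return bounds


if __name__ == "__main__":
    # A = [[1, 0, 2], [0, 3, 0], [4, 5, 6]]
    indptr = [0, 2, 3, 6]
    indices = [0, 2, 1, 0, 1, 2]
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    x = [1.0, 1.0, 1.0]
    bounds = split_rows_by_nnz(indptr, 2)
    y = []
    for lo, hi in zip(bounds, bounds[1:]):
        y.extend(csr_matvec(indptr, indices, data, x, lo, hi))
    print(bounds, y)
```

Splitting by non-zero count rather than by row count gives each chunk a similar amount of arithmetic when row lengths vary, which is one simple way to think about balancing work across threads.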

arXiv.org e-Print Archive

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository