82 research outputs found

Restructuring the Tridiagonal and Bidiagonal QR Algorithms for Performance

Author: Quintana-Ortí Gregorio
Van de Geijn Robert A.
Van Zee Field G.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

We show how both the tridiagonal and bidiagonal QR algorithms can be restructured so that they become rich in operations that can achieve near-peak performance on a modern processor. The key is a novel, cache-friendly algorithm for applying multiple sets of Givens rotations to the eigenvector/singular vector matrix. This algorithm is then implemented with optimizations that (1) leverage vector instruction units to increase floating-point throughput, and (2) fuse multiple rotations to decrease the total number of memory operations. We demonstrate the merits of these new QR algorithms for computing the Hermitian eigenvalue decomposition (EVD) and singular value decomposition (SVD) of dense matrices when all eigenvectors/singular vectors are computed. The approach yields vastly improved performance relative to the traditional QR algorithms for these problems and is competitive with two commonly used alternatives, Cuppen’s Divide and Conquer algorithm and the Method of Multiple Relatively Robust Representations, while inheriting the more modest O(n) workspace requirements of the original QR algorithms. Since the computations performed by the restructured algorithms remain essentially identical to those performed by the original methods, robust numerical properties are preserved.
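
The basic operation this abstract refers to, applying a sequence of Givens rotations to the columns of an eigenvector matrix, can be sketched as follows. This is a minimal pure-Python illustration and not the restructured, fused, or vectorized algorithm from the paper; the function names `givens` and `apply_rotations` are hypothetical, and the assumption that rotation k acts on adjacent columns k and k+1 is made only for the example.

```python
import math

def givens(a, b):
    """Return (c, s) with c*a + s*b = r and -s*a + c*b = 0."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0  # nothing to rotate
    return a / r, b / r

def apply_rotations(V, rotations):
    """Apply rotation k to columns (k, k+1) of V, in place.

    V is a list of rows. Each row is visited once and receives every
    rotation before the next row is touched (an illustrative loop
    order, not the paper's scheme).
    """
    for row in V:
        for k, (c, s) in enumerate(rotations):
            x, y = row[k], row[k + 1]
            row[k] = c * x + s * y
            row[k + 1] = -s * x + c * y
    return V

# Usage: rotate the 3x3 identity by two rotations.
V = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
apply_rotations(V, [givens(3.0, 4.0), givens(1.0, 1.0)])
```

Because each rotation is orthogonal, the rows of `V` remain orthonormal after the update, which is the property that lets accumulated rotations form an eigenvector matrix.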

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

A Distributed and Incremental SVD Algorithm for Agglomerative Data Analysis on Large Networks

Author: Iwen M. A.
Ong B. W.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2016

In this paper, we show that the SVD of a matrix can be constructed efficiently in a hierarchical approach. Our algorithm is proven to recover the singular values and left singular vectors if the rank of the input matrix $A$ is known. Further, the hierarchical algorithm can be used to recover the $d$ largest singular values and left singular vectors with bounded error. We also show that the proposed method is stable with respect to roundoff errors or corruption of the original matrix entries. Numerical experiments validate the proposed algorithms and parallel cost analysis.
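
A one-level version of the hierarchical idea can be sketched with NumPy, assuming the matrix is split into column blocks: each block is reduced to its scaled left factor, the factors are concatenated, and a final SVD of the concatenation yields the singular values and left singular vectors. The helper names `scaled_left_factor` and `hierarchical_svd`, the column-wise partition, and the single merge level are assumptions of this sketch rather than details taken from the paper.

```python
import numpy as np

def scaled_left_factor(block, rank):
    """Return U_r * s_r for one block (its first `rank` left vectors, scaled)."""
    U, s, _ = np.linalg.svd(block, full_matrices=False)
    return U[:, :rank] * s[:rank]

def hierarchical_svd(A, n_blocks, rank):
    """Singular values and left singular vectors of A via one merge level.

    If `rank` is at least rank(A), every block factor is exact, so the
    merged matrix M satisfies M @ M.T == A @ A.T up to roundoff.
    """
    blocks = np.array_split(A, n_blocks, axis=1)
    merged = np.hstack([scaled_left_factor(B, rank) for B in blocks])
    U, s, _ = np.linalg.svd(merged, full_matrices=False)
    return U[:, :rank], s[:rank]

# Usage: a rank-3 matrix split into three column blocks.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 12))
U, s = hierarchical_svd(A, n_blocks=3, rank=3)
```

Only left singular vectors are recovered here because the right factors of the blocks are discarded, which matches the quantities the abstract says the method recovers.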

arXiv.org e-Print Archive

Michigan Technological University

Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach

Author: Bientinesi Paolo
Petschow Matthias
Quintana-Ortí Enrique
Publication date: 01/01/2013

The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR) is among the fastest methods. Although fast, the solvers based on MRRR do not deliver the same accuracy as competing methods like Divide & Conquer or the QR algorithm. In this paper, we demonstrate that the use of mixed precisions leads to improved accuracy of MRRR-based eigensolvers with limited or no performance penalty. As a result, we obtain eigensolvers that are not only equally or more accurate than the best available methods, but also, in most circumstances, faster and more scalable than the competition.

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Publikationsserver der RWTH Aachen University

Perturbation splitting for more accurate eigenvalues

Author: Ralha Rui
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2009

Let $T$ be a symmetric tridiagonal matrix with entries and eigenvalues of different magnitudes. For some $T$, small entrywise relative perturbations induce small errors in the eigenvalues, independently of the size of the entries of the matrix; this is certainly true when the perturbed matrix can be written as $\widetilde{T}=X^{T}TX$ with small $\|X^{T}X-I\|$. Even if it is not possible to express the perturbations in every entry of $T$ in this way, much can be gained by doing so for as many of the larger-magnitude entries as possible. We propose a technique which consists of splitting multiplicative and additive perturbations to produce new error bounds which, for some matrices, are much sharper than the usual ones. Such bounds may be useful in the development of improved software for the tridiagonal eigenvalue problem, and we describe their role in the context of a mixed precision bisection-like procedure. Using the very same idea of splitting perturbations (multiplicative and additive), we show that when $T$ defines its eigenvalues well, the numerical values of the pivots in the usual decomposition $T-\lambda I=LDL^{T}$ may be used to compute approximations with high relative precision.

Fundação para a Ciência e Tecnologia (FCT) - POCI 201
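
The role of the pivots of $T-\lambda I=LDL^{T}$ in a bisection-like procedure can be illustrated with the standard Sturm count: by Sylvester's law of inertia, the number of negative pivots equals the number of eigenvalues of $T$ below $\lambda$. The sketch below is plain double-precision bisection, not the mixed precision procedure or the error bounds of the paper; the function names, the zero-pivot guard value, and the tolerance are assumptions chosen for the example.

```python
def count_eigs_below(diag, off, lam, tiny=1e-30):
    """Count eigenvalues of the symmetric tridiagonal T that are < lam.

    diag: n diagonal entries; off: n-1 off-diagonal entries.
    Counts negative pivots d_i of T - lam*I = L D L^T.
    """
    count = 0
    d = 1.0
    for i in range(len(diag)):
        e2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = (diag[i] - lam) - e2 / d
        if d == 0.0:
            d = tiny  # assumed guard against an exactly zero pivot
        if d < 0.0:
            count += 1
    return count

def kth_eigenvalue(diag, off, k, tol=1e-12):
    """Bisection for the k-th smallest eigenvalue (k is zero-based)."""
    n = len(diag)
    radius = [(abs(off[i - 1]) if i > 0 else 0.0) +
              (abs(off[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(d - r for d, r in zip(diag, radius))  # Gershgorin lower bound
    hi = max(d + r for d, r in zip(diag, radius))  # Gershgorin upper bound
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_eigs_below(diag, off, mid) > k:
            hi = mid  # the k-th eigenvalue lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Usage: T = tridiag(-1, 2, -1) of order 3 has eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2).
smallest = kth_eigenvalue([2.0, 2.0, 2.0], [-1.0, -1.0], 0)
```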

CiteSeerX

Universidade do Minho: RepositoriUM

Parallel accelerated cyclic reduction preconditioner for three-dimensional elliptic PDEs with variable coefficients

Author: Chávez Gustavo
Keyes David
Turkiyyah George
Zampini Stefano
Publication venue: 'Elsevier BV'
Publication date: 23/12/2017

We present a robust and scalable preconditioner for the solution of large-scale linear systems that arise from the discretization of elliptic PDEs amenable to rank compression. The preconditioner is based on hierarchical low-rank approximations and the cyclic reduction method. The setup and application phases of the preconditioner achieve log-linear complexity in memory footprint and number of operations, and numerical experiments exhibit good weak and strong scalability at large processor counts in a distributed memory environment. Numerical experiments with linear systems that feature symmetry and nonsymmetry, definiteness and indefiniteness, constant and variable coefficients demonstrate the preconditioner's applicability and robustness. Furthermore, it is possible to control the number of iterations via the accuracy threshold of the hierarchical matrix approximations and their arithmetic operations, and the tuning of the admissibility condition parameter. Together, these parameters allow for optimization of the memory requirements and performance of the preconditioner.

Comment: 24 pages, Elsevier Journal of Computational and Applied Mathematics, Dec 201

arXiv.org e-Print Archive

eScholarship - University of California

An improved parallel singular value algorithm and its implementation for multicore hardware

Author: Agullo E.
Anderson E.
Andrews H. C.
Bartels R. H.
Berry B. P. M.
Bečka M.
Bregman N.
Buttari A.
Dongarra J.
Farra V.
Gansterer W.
Golub G. H.
Haidar A.
Jiang E. P.
Kurzak J.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref