On aggressive early deflation in parallel variants of the QR algorithm

Author: Kagstrom Bo
Kressner Daniel
Shao Meiyue
Publication date: 05/05/2011

Infoscience - École polytechnique fédérale de Lausanne

Optimally packed chains of bulges in multishift QR algorithms

Author: Bruno Lang
Daniel Kressner
Granat R.
Lars Karlsson
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

On pole-swapping algorithms for the eigenvalue problem

Author: Camps Daan
Mach Thomas
Vandebril Raf
Watkins David S.
Publication date: 01/01/2020

Pole-swapping algorithms, which are generalizations of the QZ algorithm for the generalized eigenvalue problem, are studied. A new modular (and therefore more flexible) convergence theory that applies to all pole-swapping algorithms is developed. A key component of all such algorithms is a procedure that swaps two adjacent eigenvalues in a triangular pencil. An improved swapping routine is developed, and its superiority over existing methods is demonstrated by a backward error analysis and numerical tests. The modularity of the new convergence theory and the generality of the pole-swapping approach shed new light on bi-directional chasing algorithms, optimally packed shifts, and bulge pencils, and allow the design of novel algorithms.

arXiv.org e-Print Archive

eScholarship - University of California

Elektronisches Publikationsportal der Österreichischen Akademie der Wissenschaften

A Householder-based algorithm for Hessenberg-triangular reduction

Author: Bujanović Zvonimir
Karlsson Lars
Kressner Daniel
Publication date: 29/05/2018

The QZ algorithm for computing eigenvalues and eigenvectors of a matrix pencil $A - \lambda B$ requires that the matrices first be reduced to Hessenberg-triangular (HT) form. The current method of choice for HT reduction relies entirely on Givens rotations regrouped and accumulated into small dense matrices which are subsequently applied using matrix multiplication routines. A non-vanishing fraction of the total flop count must nevertheless still be performed as sequences of overlapping Givens rotations alternately applied from the left and from the right. The many data dependencies associated with this computational pattern lead to inefficient use of the processor and poor scalability. In this paper, we therefore introduce a fundamentally different approach that relies entirely on (large) Householder reflectors partially accumulated into block reflectors, by using (compact) WY representations. Even though the new algorithm requires more floating point operations than the state-of-the-art algorithm, extensive experiments on both real and synthetic data indicate that it is still competitive, even in a sequential setting. The new algorithm is conjectured to have better parallel scalability, an idea which is partially supported by early small-scale experiments using multi-threaded BLAS. The design and evaluation of a parallel formulation is future work.
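As a point of reference for the abstract above, the following is a minimal sketch of a single Householder reflector, the basic building block that such algorithms accumulate into block reflectors. It is not the paper's HT reduction and does not implement the compact WY representation; the function names `householder` and `apply_reflector` are hypothetical and chosen only for illustration.

```python
import math


def householder(x):
    """Return (v, beta, alpha) such that (I - beta*v*v^T) x = alpha*e1.

    Illustrative helper only; not taken from the paper.
    """
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    if norm_x == 0.0:
        return list(x), 0.0, 0.0  # zero vector: the identity already works
    # Choose the sign of alpha opposite to x[0] to avoid cancellation in v[0].
    alpha = -math.copysign(norm_x, x[0])
    v = list(x)
    v[0] -= alpha
    beta = 2.0 / sum(vi * vi for vi in v)
    return v, beta, alpha


def apply_reflector(v, beta, y):
    """Compute (I - beta*v*v^T) y without forming the matrix explicitly."""
    s = beta * sum(vi * yi for vi, yi in zip(v, y))
    return [yi - s * vi for vi, yi in zip(v, y)]


x = [3.0, 4.0, 0.0]
v, beta, alpha = householder(x)
y = apply_reflector(v, beta, x)  # all entries below the first are annihilated
```

A single reflector annihilates a whole column segment at once, which is what makes it possible to group many of them into matrix-matrix operations.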

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

A Novel Parallel QR Algorithm For Hybrid Distributed Memory HPC Systems

Author: Granat Robert
Kagstrom Bo
Kressner Daniel
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 05/05/2011

A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing systems is presented. For this purpose, we introduce the concept of multiwindow bulge chain chasing and parallelize aggressive early deflation. The multiwindow approach ensures that most computations when chasing chains of bulges are performed in level 3 BLAS operations, while the aim of aggressive early deflation is to speed up the convergence of the QR algorithm. Mixed MPI-OpenMP coding techniques are utilized for porting the codes to distributed memory platforms with multithreaded nodes, such as multicore processors. Numerous numerical experiments confirm the superior performance of our parallel QR algorithm in comparison with the existing ScaLAPACK code, leading to an implementation that is one to two orders of magnitude faster for sufficiently large problems, including a number of examples from applications.

Infoscience - École polytechnique fédérale de Lausanne

A multishift, multipole rational QZ method with aggressive early deflation

Author: Camps Daan
Meerbergen Karl
Steel Thijs
Vandebril Raf
Publication date: 19/05/2020

The rational QZ method generalizes the QZ method by implicitly supporting rational subspace iteration. In this paper we extend the rational QZ method by introducing shifts and poles of higher multiplicity in the Hessenberg pencil, which is a pencil consisting of two Hessenberg matrices. The result is a multishift, multipole iteration on block Hessenberg pencils which allows one to stick to real arithmetic for a real input pencil. In combination with optimally packed shifts and aggressive early deflation as an advanced deflation technique we obtain an efficient method for the dense generalized eigenvalue problem. In the numerical experiments we compare the results with state-of-the-art routines for the generalized eigenvalue problem and show that we are competitive in terms of speed and accuracy.

arXiv.org e-Print Archive

eScholarship - University of California

A new deflation criterion for the QZ algorithm

Author: Langou Julien
Steel Thijs
Vandebril Raf
Publication date: 03/08/2022

The QZ algorithm computes the Schur form of a matrix pencil. It is an iterative algorithm and, at some point, it must decide that an eigenvalue has converged and move on to another one. Choosing a criterion that makes this decision is nontrivial. If it is too strict, the algorithm might waste iterations on already converged eigenvalues. If it is not strict enough, the computed eigenvalues might be inaccurate. Additionally, the criterion should not be computationally expensive to evaluate. This paper introduces a new criterion based on the size of and the gap between the eigenvalues. This is similar to the work of Ahues and Tisseur for the QR algorithm. Theoretical arguments and numerical experiments suggest that it outperforms the most popular criteria in terms of accuracy. Additionally, this paper evaluates some commonly used criteria for infinite eigenvalues. Comment: 10 pages, 6 figures
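To make the trade-off described in this abstract concrete, here is a minimal sketch of two deflation tests for the ordinary QR algorithm on a Hessenberg matrix `H`. The first is the widely used neighbour-based test; the second is a simplified gap-based test in the spirit of the Ahues and Tisseur criterion mentioned above. Neither is the new QZ criterion of the paper, the scaling used in production codes may differ, and the function names are hypothetical.

```python
import sys

EPS = sys.float_info.epsilon  # double-precision unit roundoff scale


def classic_deflation(H, i, eps=EPS):
    """Neighbour-based test: |h[i+1][i]| <= eps * (|h[i][i]| + |h[i+1][i+1]|)."""
    return abs(H[i + 1][i]) <= eps * (abs(H[i][i]) + abs(H[i + 1][i + 1]))


def gap_based_deflation(H, i, eps=EPS):
    """Simplified gap-based test (assumed form, no safeguarding or scaling):
    |h[i+1][i]| * |h[i][i+1]| <= eps * |h[i+1][i+1]| * |h[i+1][i+1] - h[i][i]|.
    """
    lhs = abs(H[i + 1][i]) * abs(H[i][i + 1])
    rhs = eps * abs(H[i + 1][i + 1]) * abs(H[i + 1][i + 1] - H[i][i])
    return lhs <= rhs


# Tiny subdiagonal, well-separated diagonal entries: both tests accept.
easy = [[1.0, 2.0], [1e-20, 3.0]]
# Tiny subdiagonal, but a large off-diagonal entry and nearly equal diagonal
# entries: the neighbour-based test accepts, the gap-based test does not.
hard = [[1.0, 1e6], [1e-16, 1.0 + 1e-8]]
```

The `hard` matrix illustrates why the size of the eigenvalues alone can be misleading: when the diagonal entries are close together, a subdiagonal entry that looks negligible can still perturb the eigenvalues noticeably.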

arXiv.org e-Print Archive

Multishift variants of the QZ algorithm with aggressive early deflation

Author: Kagstrom Bo
Kressner Daniel
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 05/05/2011

New variants of the QZ algorithm for solving the generalized eigenvalue problem are proposed. An extension of the small-bulge multishift QR algorithm is developed, which chases chains of many small bulges instead of only one bulge in each QZ iteration. This allows the effective use of level 3 BLAS operations, which in turn can provide efficient utilization of high performance computing systems with deep memory hierarchies. Moreover, an extension of the aggressive early deflation strategy is proposed, which can identify and deflate converged eigenvalues long before classic deflation strategies would. Consequently, the number of overall QZ iterations needed until convergence is considerably reduced. As a third ingredient, we reconsider the deflation of infinite eigenvalues and present a new deflation algorithm, which is particularly effective in the presence of a large number of infinite eigenvalues. Combining all these developments, our implementation significantly improves existing implementations of the QZ algorithm. This is demonstrated by numerical experiments with random matrix pairs as well as with matrix pairs arising from various applications.

Infoscience - École polytechnique fédérale de Lausanne

Dense and Structured Matrix Computations: the Parallel QR Algorithm and Matrix Exponentials

Author: Shao Meiyue
Publication venue: Lausanne, EPFL
Publication date: 14/01/2014

Infoscience - École polytechnique fédérale de Lausanne

A parallel Schur method for solving continuous-time algebraic Riccati equations

Author: Granat Robert
Kagstrom Bo
Kressner Daniel
Publication venue: IEEE Service Center, 445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331 USA
Publication date: 05/05/2011

Numerical algorithms for solving the continuous-time algebraic Riccati matrix equation on a distributed memory parallel computer are considered. In particular, it is shown that the Schur method, based on computing the stable invariant subspace of a Hamiltonian matrix, can be parallelized in an efficient and scalable way. Our implementation employs the state-of-the-art library ScaLAPACK as well as recently developed parallel methods for reordering the eigenvalues in a real Schur form. Some experimental results are presented, confirming the scalability of our implementation and comparing it with an existing implementation of the matrix sign iteration from the PLiCOC library.
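For orientation, the textbook form of the Schur method referred to in this abstract can be written as below. The symbols $A$, $G$, $Q$, $X$, $U_1$, $U_2$ and $T$ are assumed notation for illustration and may differ from the paper's.

```latex
\begin{aligned}
&A^{T} X + X A - X G X + Q = 0, \qquad G = G^{T},\quad Q = Q^{T}, \\
&H = \begin{bmatrix} A & -G \\ -Q & -A^{T} \end{bmatrix}, \qquad
H \begin{bmatrix} U_1 \\ U_2 \end{bmatrix}
  = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix} T, \qquad
\operatorname{Re}\,\lambda(T) < 0, \\
&X = U_2\, U_1^{-1}.
\end{aligned}
```

Substituting $U_1 = I$, $U_2 = X$ into the invariant-subspace relation gives $T = A - GX$ from the first block row, and the second block row, $-Q - A^{T}X = X(A - GX)$, is exactly the Riccati equation. In practice the columns of $[U_1; U_2]$ are obtained by reordering a real Schur form of $H$ so that the stable eigenvalues appear first, which is where the eigenvalue reordering mentioned in the abstract enters.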

Infoscience - École polytechnique fédérale de Lausanne