4,187 research outputs found

A Householder-based algorithm for Hessenberg-triangular reduction

Authors: Bujanović, Zvonimir; Karlsson, Lars; Kressner, Daniel
Publication date: 29/05/2018

The QZ algorithm for computing eigenvalues and eigenvectors of a matrix pencil $A - \lambda B$ requires that the matrices first be reduced to Hessenberg-triangular (HT) form. The current method of choice for HT reduction relies entirely on Givens rotations regrouped and accumulated into small dense matrices which are subsequently applied using matrix multiplication routines. A non-vanishing fraction of the total flop count must nevertheless still be performed as sequences of overlapping Givens rotations alternately applied from the left and from the right. The many data dependencies associated with this computational pattern lead to inefficient use of the processor and poor scalability. In this paper, we therefore introduce a fundamentally different approach that relies entirely on (large) Householder reflectors partially accumulated into block reflectors, by using (compact) WY representations. Even though the new algorithm requires more floating point operations than the state-of-the-art algorithm, extensive experiments on both real and synthetic data indicate that it is still competitive, even in a sequential setting. The new algorithm is conjectured to have better parallel scalability, an idea which is partially supported by early small-scale experiments using multi-threaded BLAS. The design and evaluation of a parallel formulation is future work.
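The abstract above rests on two standard building blocks: a Householder reflector, and the compact WY form that accumulates several reflectors into one block reflector. The sketch below illustrates only those building blocks in plain Python; it is not the paper's HT reduction. The function names (`dot`, `householder_vector`, `apply_reflector`, `apply_compact_wy_pair`) are hypothetical, and the two-reflector case is chosen purely for brevity.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def householder_vector(x):
    """Return (v, beta) such that H = I - beta * v v^T maps x to alpha * e_1.

    alpha has magnitude ||x|| and the opposite sign of x[0], which avoids
    cancellation when forming v[0].
    """
    norm_x = math.sqrt(dot(x, x))
    if norm_x == 0.0:
        return [0.0] * len(x), 0.0  # nothing to reflect
    alpha = -math.copysign(norm_x, x[0])
    v = list(x)
    v[0] -= alpha
    beta = 2.0 / dot(v, v)
    return v, beta


def apply_reflector(v, beta, x):
    """Compute (I - beta * v v^T) x without forming the matrix."""
    d = dot(v, x)
    return [xi - beta * d * vi for xi, vi in zip(x, v)]


def apply_compact_wy_pair(v1, b1, v2, b2, x):
    """Apply H1 * H2 = I - V T V^T to x, with V = [v1 v2].

    T is 2x2 upper triangular: T = [[b1, t12], [0, b2]],
    where t12 = -b1 * b2 * (v1 . v2).
    """
    t12 = -b1 * b2 * dot(v1, v2)
    w1, w2 = dot(v1, x), dot(v2, x)   # w = V^T x
    z1 = b1 * w1 + t12 * w2           # z = T w
    z2 = b2 * w2
    return [xi - z1 * a - z2 * b for xi, a, b in zip(x, v1, v2)]
```

For instance, `householder_vector([3.0, 4.0, 0.0])` yields a reflector that maps the input onto a multiple of the first unit vector. Applying `apply_compact_wy_pair` gives the same result as applying the second reflector and then the first, but the accumulated form is what lets block algorithms cast most of the work as matrix-matrix products.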

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Authors: Ghysels, Pieter; Li, Xiaoye S.; Napov, Artem; Rouet, Francois-Henry; Williams, Samuel
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination, and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK -- STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

DI-fusion

An Incremental Algorithm for Computing Cylindrical Algebraic Decompositions

Authors: Chen, Changbo; Moreno Maza, Marc
Publication date: 19/10/2012

In this paper, we propose an incremental algorithm for computing cylindrical algebraic decompositions. The algorithm consists of two parts: computing a complex cylindrical tree and refining this complex tree into a cylindrical tree in real space. The incrementality comes from the first part of the algorithm, where a complex cylindrical tree is constructed by refining a previous complex cylindrical tree with a polynomial constraint. We have implemented our algorithm in Maple. Experiments show that the proposed algorithm outperforms existing ones on many examples taken from the literature.

arXiv.org e-Print Archive

CiteSeerX

Using the distribution of cells by dimension in a cylindrical algebraic decomposition

Authors: Bradford, Russell; Davenport, James H.; England, Matthew; Wilson, David
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

We investigate the distribution of cells by dimension in cylindrical algebraic decompositions (CADs). We find that they follow a standard distribution which seems largely independent of the underlying problem or CAD algorithm used. Rather, the distribution is inherent to the cylindrical structure and determined mostly by the number of variables. This insight is then combined with an algorithm that produces only full-dimensional cells to give an accurate method of predicting the number of cells in a complete CAD. Since constructing only full-dimensional cells is relatively inexpensive (involving no costly algebraic number calculations), this leads to heuristics for helping with various questions of problem formulation for CAD, such as choosing an optimal variable ordering. Our experiments demonstrate that this approach can be highly effective. Comment: 8 pages

arXiv.org e-Print Archive

OPUS

Crossref

On the Parallelization of Vector Fitting Algorithms

Authors: Chinea, Alessandro; Grivet-Talocia, S.
Publication venue: IEEE
Publication date: 01/01/2011

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Fast Parallel Randomized QR with Column Pivoting Algorithms for Reliable Low-rank Matrix Approximations

Authors: Gu, Ming; Langou, Julien; Xiao, Jianwei
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 13/04/2018

Factorizing large matrices by QR with column pivoting (QRCP) is substantially more expensive than QR without pivoting, owing to communication costs required for pivoting decisions. In contrast, randomized QRCP (RQRCP) algorithms have proven themselves empirically to be highly competitive with high-performance implementations of QR in processing time, on uniprocessor and shared memory machines, and as reliable as QRCP in pivot quality. We show that RQRCP algorithms can be as reliable as QRCP with failure probabilities exponentially decaying in oversampling size. We also analyze efficiency differences among different RQRCP algorithms. More importantly, we develop distributed memory implementations of RQRCP that are significantly better than QRCP implementations in ScaLAPACK. As a further development, we introduce the concept of and develop algorithms for computing spectrum-revealing QR factorizations for low-rank matrix approximations, and demonstrate their effectiveness against leading low-rank approximation methods in both theoretical and numerical reliability and efficiency. Comment: 11 pages, 14 figures, accepted by 2017 IEEE 24th International Conference on High Performance Computing (HiPC), awarded the best paper prize
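The core idea in the abstract above, making pivoting decisions on a small random sketch instead of the full matrix, can be illustrated with a toy example. The following is a minimal sketch in plain Python under simplifying assumptions: it is unblocked, it uses Gram-Schmidt rather than Householder updates, and it is not the authors' RQRCP implementation. The names `pivoted_columns` and `randomized_pivots` and the parameters `oversample` and `seed` are hypothetical.

```python
import math
import random


def pivoted_columns(A, k):
    """Greedy column pivoting: return up to k pivot column indices of A.

    At each step the remaining column with the largest residual norm is
    chosen, then the other columns are orthogonalized against it.
    A is a list of rows.
    """
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]  # column-wise copy
    remaining = list(range(n))
    pivots = []
    for _ in range(min(k, m, n)):
        j = max(remaining, key=lambda c: sum(x * x for x in cols[c]))
        norm = math.sqrt(sum(x * x for x in cols[j]))
        if norm == 0.0:
            break  # residual is exactly zero: numerical rank reached
        q = [x / norm for x in cols[j]]
        pivots.append(j)
        remaining.remove(j)
        for c in remaining:
            d = sum(qi * ci for qi, ci in zip(q, cols[c]))
            cols[c] = [ci - d * qi for qi, ci in zip(q, cols[c])]
    return pivots


def randomized_pivots(A, k, oversample=2, seed=0):
    """Choose k pivots from the sketch B = Omega * A, Omega Gaussian.

    B has only k + oversample rows, so the pivot search touches far less
    data than pivoting on A directly.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    rows = k + oversample
    omega = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(rows)]
    B = [[sum(omega[r][i] * A[i][j] for i in range(m)) for j in range(n)]
         for r in range(rows)]
    return pivoted_columns(B, k)
```

On a matrix with one dominant column, both routines are expected to select that column first. The oversampling parameter is the knob that, per the abstract, drives the failure probability down exponentially.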

arXiv.org e-Print Archive

Crossref