A Note on the Column Elimination Tree
This short communication considers the LU factorization with partial pivoting and shows that an all-at-once result is possible for the structure prediction of the column dependencies in L and U. Specifically, we prove that for every square strong Hall matrix A there exists a permutation P such that every edge of its column elimination tree corresponds to a symbolic nonzero in the upper triangular factor U. In the symbolic sense, this resolves a conjecture of Gilbert and Ng [6].
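To make the object in this abstract concrete, the following is a minimal sketch of how a column elimination tree can be computed from a sparsity pattern. It assumes the usual definition, which the abstract does not restate: the column elimination tree of A is the elimination tree of the pattern of A^T A. The function name `column_etree` and the input layout are hypothetical, chosen only for illustration; the loop follows the well-known ancestor-compression scheme and never forms A^T A explicitly.

```python
def column_etree(n_rows, col_rows):
    """Return parent[] for the column elimination tree of a sparse pattern.

    col_rows[j] lists the row indices of the structural nonzeros in column j.
    parent[j] == -1 marks a root. Assumes the tree is defined as the
    elimination tree of the pattern of A^T A (an assumption of this sketch).
    """
    n_cols = len(col_rows)
    parent = [-1] * n_cols
    ancestor = [-1] * n_cols   # compressed links toward the current root
    prev_col = [-1] * n_rows   # last column seen with a nonzero in each row
    for k in range(n_cols):
        for r in col_rows[k]:
            # Columns sharing row r are adjacent in the pattern of A^T A.
            i = prev_col[r]
            while i != -1 and i < k:
                nxt = ancestor[i]
                ancestor[i] = k          # compress the path to column k
                if nxt == -1:
                    parent[i] = k        # i had no parent yet: k becomes it
                i = nxt
            prev_col[r] = k
    return parent


# 4-by-4 example pattern, given column by column.
pattern = [[0, 2], [1, 3], [2, 3], [3]]
parent = column_etree(4, pattern)
print(parent)  # [2, 2, 3, -1]
```

In this example, columns 0 and 2 share row 2, and columns 1, 2 and 3 share row 3, so columns 0 and 1 both hang off column 2, which in turn hangs off the root, column 3.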
Fast Evaluation of Interlace Polynomials on Graphs of Bounded Treewidth
We consider the multivariate interlace polynomial introduced by Courcelle
(2008), which generalizes several interlace polynomials defined by Arratia,
Bollobas, and Sorkin (2004) and by Aigner and van der Holst (2004). We present
an algorithm to evaluate the multivariate interlace polynomial of a graph with
n vertices given a tree decomposition of the graph of width k. The best
previously known result (Courcelle 2008) employs a general logical framework
and leads to an algorithm with running time f(k)*n, where f(k) is doubly
exponential in k. Analyzing the GF(2)-rank of adjacency matrices in the context
of tree decompositions, we give a faster and more direct algorithm. Our
algorithm uses 2^{3k^2+O(k)}*n arithmetic operations and can be efficiently
implemented in parallel.
Comment: v4: Minor error in Lemma 5.5 fixed, Section 6.6 added, minor
improvements. 44 pages, 14 figures.
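The abstract relies on the GF(2)-rank of adjacency matrices. As a small illustration of that quantity only (not of the tree-decomposition algorithm itself), the sketch below computes the rank over GF(2) by Gaussian elimination, storing each matrix row as a Python integer used as a bit set. The helper names `gf2_rank` and `adjacency_rows` are hypothetical.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bit sets."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit acts as the pivot column
        # XOR is addition over GF(2): clear the pivot column in other rows.
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


def adjacency_rows(n, edges):
    """Adjacency matrix of a simple undirected graph, one bit set per row."""
    rows = [0] * n
    for u, v in edges:
        rows[u] |= 1 << v
        rows[v] |= 1 << u
    return rows


path3 = adjacency_rows(3, [(0, 1), (1, 2)])
triangle = adjacency_rows(3, [(0, 1), (1, 2), (0, 2)])
k4 = adjacency_rows(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
print(gf2_rank(path3), gf2_rank(triangle), gf2_rank(k4))  # 2 2 4
```

The triangle shows why the field matters: its three rows sum to zero modulo 2, so its GF(2)-rank is 2 even though the same matrix has rank 3 over the rationals.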
Tiled QR factorization algorithms
This work revisits existing algorithms for the QR factorization of
rectangular matrices composed of p-by-q tiles, where p >= q. Within this
framework, we study the critical paths and performance of algorithms such as
Sameh and Kuck, Modi and Clarke, Greedy, and those found within PLASMA.
Although neither Modi and Clarke nor Greedy is optimal, both are shown to be
asymptotically optimal for all matrices of size p = q^2 f(q), where f is any
function such that \lim_{+\infty} f= 0. This novel and important complexity
result applies to all matrices where p and q are proportional, p = \lambda q,
with \lambda >= 1, thereby encompassing many important situations in practice
(least squares). We provide an extensive set of experiments that show the
superiority of the new algorithms for tall matrices.
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups of up to 7-fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK -- STRUctured Matrices PACKage, which
also has a distributed memory component for dense rank-structured matrices.
Fast Algorithms for Displacement and Low-Rank Structured Matrices
This tutorial provides an introduction to the development of fast matrix
algorithms based on the notions of displacement and various low-rank
structures.
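As one standard illustration of the notion of displacement (the abstract gives no specific example, so the choice here is an assumption), the sketch below checks that a Toeplitz matrix T has displacement rank at most 2 under the operator T - Z T Z^T, where Z is the down-shift matrix. The helper names `toeplitz`, `displacement` and `rank` are hypothetical, and exact rational arithmetic is used so the rank is not affected by rounding.

```python
from fractions import Fraction


def toeplitz(first_col, first_row):
    """Dense Toeplitz matrix; first_col[0] must equal first_row[0]."""
    n = len(first_col)
    return [[first_col[i - j] if i >= j else first_row[j - i]
             for j in range(n)] for i in range(n)]


def displacement(T):
    """Return T - Z T Z^T, where Z has ones on the first subdiagonal."""
    n = len(T)
    return [[T[i][j] - (T[i - 1][j - 1] if i > 0 and j > 0 else 0)
             for j in range(n)] for i in range(n)]


def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


T = toeplitz([4, 1, 2, 3, 5], [4, 7, 8, 9, 6])
D = displacement(T)
print(rank(D))  # 2
```

Because T is constant along each diagonal, every entry of D outside the first row and first column cancels, which is what keeps the displacement rank small regardless of the matrix size.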