Tiled QR factorization algorithms
This work revisits existing algorithms for the QR factorization of
rectangular matrices composed of p-by-q tiles, where p >= q. Within this
framework, we study the critical paths and performance of algorithms such as
Sameh and Kuck, Modi and Clarke, Greedy, and those found within PLASMA.
Although neither Modi and Clarke nor Greedy is optimal, both are shown to be
asymptotically optimal for all matrices of size p = q^2 f(q), where f is any
function such that \lim_{+\infty} f= 0. This novel and important complexity
result applies to all matrices where p and q are proportional, p = \lambda q,
with \lambda >= 1, thereby encompassing many important situations in practice
(least squares). We provide an extensive set of experiments that show the
superiority of the new algorithms for tall matrices.
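The step from the general condition to the proportional case is left implicit in the abstract. A short worked check, assuming only the definitions given there (the specific choice of f below is ours, made for illustration):

```latex
% Proportional case p = \lambda q rewritten in the form p = q^2 f(q)
\[
  p = \lambda q = q^{2} \cdot \frac{\lambda}{q}
  \quad\Longrightarrow\quad
  f(q) = \frac{\lambda}{q},
  \qquad
  \lim_{q \to +\infty} f(q) = 0 .
\]
```

So every fixed ratio \lambda >= 1 satisfies the hypothesis of the asymptotic optimality result.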
LU factorization with panel rank revealing pivoting and its communication avoiding version
We present the LU decomposition with panel rank revealing pivoting (LU_PRRP),
an LU factorization algorithm based on strong rank revealing QR panel
factorization. LU_PRRP is more stable than Gaussian elimination with partial
pivoting (GEPP). Our extensive numerical experiments show that the new
factorization scheme is as numerically stable as GEPP in practice, but it is
more resistant to pathological cases and easily solves the Wilkinson matrix and
the Foster matrix. We also present CALU_PRRP, a communication avoiding version
of LU_PRRP that minimizes communication. CALU_PRRP is based on tournament
pivoting, with the selection of the pivots at each step of the tournament being
performed via strong rank revealing QR factorization. CALU_PRRP is more stable
than CALU, the communication avoiding version of GEPP. CALU_PRRP is also more
stable in practice and is resistant to pathological cases on which GEPP and
CALU fail. Comment: No. RR-7867 (2012)
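To make the "pathological cases" concrete, the following minimal sketch measures the element growth of plain GEPP on the Wilkinson matrix. It does not implement LU_PRRP or CALU_PRRP; the function names `wilkinson_matrix` and `gepp_growth_factor` are hypothetical helpers written for this illustration, and NumPy is assumed to be available.

```python
import numpy as np

def wilkinson_matrix(n):
    """Wilkinson's example: 1 on the diagonal, -1 below it, 1 in the last column."""
    W = np.eye(n) - np.tril(np.ones((n, n)), k=-1)
    W[:, -1] = 1.0
    return W

def gepp_growth_factor(A):
    """Run Gaussian elimination with partial pivoting and return
    (largest entry seen during elimination) / (largest entry of A)."""
    U = A.astype(float).copy()
    n = U.shape[0]
    max_entry = np.abs(U).max()
    for k in range(n - 1):
        # Partial pivoting: pick the largest entry in column k, rows k..n-1.
        p = k + np.argmax(np.abs(U[k:, k]))
        if p != k:
            U[[k, p]] = U[[p, k]]
        multipliers = U[k + 1:, k] / U[k, k]
        # Eliminate column k below the pivot.
        U[k + 1:, k:] -= np.outer(multipliers, U[k, k:])
        max_entry = max(max_entry, np.abs(U).max())
    return max_entry / np.abs(A).max()

growth = gepp_growth_factor(wilkinson_matrix(10))
print(growth)
```

For this matrix the last column doubles at every elimination step, so the growth factor is 2^(n-1); this exponential growth is the behavior that makes the Wilkinson matrix a standard stress test for pivoting strategies.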
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures
The QR factorization and the SVD are two fundamental matrix decompositions
with applications throughout scientific computing and data analysis. For
matrices with many more rows than columns, so-called "tall-and-skinny
matrices," there is a numerically stable, efficient, communication-avoiding
algorithm for computing the QR factorization. It has been used in traditional
high performance computing and grid computing environments. For MapReduce
environments, existing methods to compute the QR decomposition use a
numerically unstable approach that relies on indirectly computing the Q factor.
In the best case, these methods require only two passes over the data. In this
paper, we describe how to compute a stable tall-and-skinny QR factorization on
a MapReduce architecture in only slightly more than 2 passes over the data. We
can compute the SVD with only a small change and no difference in performance.
We present a performance comparison between our new direct TSQR method, a
standard unstable implementation for MapReduce (Cholesky QR), and the classic
stable algorithm implemented for MapReduce (Householder QR). We find that our
new stable method has a large performance advantage over the Householder QR
method. This holds both in a theoretical performance model as well as in an
actual implementation.
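The idea behind a direct tall-and-skinny QR can be sketched in a few lines of NumPy. This is an in-memory illustration of the general block scheme only, not the paper's MapReduce implementation: the function name `direct_tsqr` and the `num_blocks` parameter are assumptions made for this example, and each list comprehension stands in for what would be a distributed pass over the data.

```python
import numpy as np

def direct_tsqr(A, num_blocks=4):
    """Block QR of a tall-and-skinny matrix A (m x n, m >> n).

    Step 1: factor each row block independently (the "map" side).
    Step 2: factor the stacked small R factors (the "reduce" side).
    Step 3: multiply each local Q by its slice of the second-stage Q.
    """
    m, n = A.shape
    blocks = np.array_split(A, num_blocks, axis=0)
    # Each block needs at least n rows so that every local R is n x n.
    assert all(B.shape[0] >= n for B in blocks), "blocks too short"

    local = [np.linalg.qr(B) for B in blocks]          # reduced QR per block
    R_stack = np.vstack([R_i for _, R_i in local])     # (num_blocks * n) x n
    Q2, R = np.linalg.qr(R_stack)                      # second-stage QR
    Q2_blocks = np.split(Q2, num_blocks, axis=0)       # one n x n slice per block
    Q = np.vstack([Q_i @ Q2_i for (Q_i, _), Q2_i in zip(local, Q2_blocks)])
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 5))
Q, R = direct_tsqr(A)
print(np.linalg.norm(Q @ R - A), np.linalg.norm(Q.T @ Q - np.eye(5)))
```

Because Q is formed directly from orthogonal factors rather than as A times the inverse of R, its orthogonality does not degrade with the conditioning of A, which is the stability property the abstract contrasts with the indirect approach.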
The use of the QR factorization in the partial realization problem
The use of the QR factorization of the Hankel matrix in solving the partial realization problem is analyzed. Straightforward use of the QR factorization results in a realization scheme that possesses all of the computational advantages of Rissanen's realization scheme. These advantages are computational efficiency, recursiveness, use of limited computer memory, and the realization of a system triplet having a condensed structure. Moreover, this scheme is robust when the order of the system corresponds to the rank of the Hankel matrix. When this latter condition is violated, an approximate realization can be determined via the QR factorization. In this second scheme, the given Hankel matrix is approximated by a low-rank non-Hankel matrix. Furthermore, it is demonstrated that column pivoting can be incorporated in this second scheme. The results presented are derived for a single-input/single-output system, but this does not seem to be a restriction.
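The link between the system order and the rank of the Hankel matrix can be illustrated with a small sketch. It is not the realization scheme of the paper: it only builds a Hankel matrix from the Markov parameters of a made-up second-order single-input/single-output system and reads the order off the diagonal of R from an unpivoted QR factorization. The helper names, the example impulse response, and the tolerance are all assumptions made for this illustration.

```python
import numpy as np

def hankel_from_markov(h, size):
    """Square Hankel matrix with H[i, j] = h[i + j] (h holds Markov parameters)."""
    idx = np.arange(size)[:, None] + np.arange(size)[None, :]
    return np.asarray(h, dtype=float)[idx]

def estimate_order_by_qr(H, rel_tol=1e-10):
    """Count the diagonal entries of R that are not negligible.

    Without pivoting this relies on the leading columns of H being
    independent, which holds for the Hankel matrix of a minimal system.
    """
    R = np.linalg.qr(H, mode="r")
    diag = np.abs(np.diag(R))
    return int(np.sum(diag > rel_tol * diag[0]))

# Hypothetical order-2 system: impulse response with poles at 0.5 and -0.3.
k = np.arange(1, 10)
h = 0.5**k + (-0.3)**k
H = hankel_from_markov(h, 5)
order = estimate_order_by_qr(H)
print(order)
```

With noisy Markov parameters the trailing diagonal entries of R are no longer negligible, which is the situation where the approximate, column-pivoted variant described in the abstract becomes relevant.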