
    An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

    We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared-memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK (STRUctured Matrices PACKage), which also has a distributed-memory component for dense rank-structured matrices.

    A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization

    We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if each of its off-diagonal blocks can be approximated by a matrix of low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, e.g., finite element methods and boundary element methods. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of the structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. This work is part of a broader effort, the STRUMPACK (STRUctured Matrices PACKage) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
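The idea of compressing an off-diagonal block by randomized sampling can be illustrated with a small sketch. This is not the STRUMPACK implementation or its API: it is a generic randomized range finder in plain Python, and the helper names (`matmul`, `orthonormalize`, `randomized_low_rank`), the fixed sample count, and the synthetic rank-2 test block are all assumptions made for illustration. The adaptive sampling mechanism and the interpolative decompositions mentioned in the abstracts are not reproduced here.

```python
import random


def matmul(A, B):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Transpose a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def orthonormalize(Y, tol=1e-10):
    """Orthonormal basis for the column space of Y (Gram-Schmidt with one
    reorthogonalization pass); numerically dependent columns are dropped."""
    basis = []
    for col in transpose(Y):
        v = list(col)
        original_norm = sum(x * x for x in v) ** 0.5
        for _ in range(2):  # second pass improves orthogonality
            for q in basis:
                dot = sum(x * y for x, y in zip(q, v))
                v = [x - dot * y for x, y in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > tol * original_norm:
            basis.append([x / norm for x in v])
    return transpose(basis)


def randomized_low_rank(B, num_samples, seed=0):
    """Return (Q, W) with B approximately equal to Q @ W.

    Q is built from the products of B with num_samples Gaussian random
    vectors; W = Q^T B holds the coefficients in that basis.
    """
    rng = random.Random(seed)
    n = len(B[0])
    omega = [[rng.gauss(0.0, 1.0) for _ in range(num_samples)] for _ in range(n)]
    Y = matmul(B, omega)          # random samples of the column space of B
    Q = orthonormalize(Y)
    W = matmul(transpose(Q), B)
    return Q, W


def max_abs_error(A, B):
    """Largest entrywise difference between two matrices of equal shape."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# Hypothetical 8x8 "off-diagonal block" with exact rank 2.
block = [[1.0 * (i + 1) + 0.5 * (j + 1) for j in range(8)] for i in range(8)]
Q, W = randomized_low_rank(block, num_samples=4)
detected_rank = len(Q[0])
error = max_abs_error(block, matmul(Q, W))
print("detected rank:", detected_rank)
print("max reconstruction error:", error)
```

Storing `Q` and `W` instead of the full block is what makes structured formats cheaper: for a block of size m by n and rank r, the two factors need r(m + n) entries rather than mn.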

    Optimal randomized multilevel algorithms for infinite-dimensional integration on function spaces with ANOVA-type decomposition

    In this paper, we consider the infinite-dimensional integration problem on weighted reproducing kernel Hilbert spaces with norms induced by an underlying function space decomposition of ANOVA-type. The weights model the relative importance of different groups of variables. We present new randomized multilevel algorithms to tackle this integration problem and prove upper bounds for their randomized error. Furthermore, we provide in this setting the first non-trivial lower error bounds for general randomized algorithms, which, in particular, may be adaptive or non-linear. These lower bounds show that our multilevel algorithms are optimal. Our analysis refines and extends the analysis provided in [F. J. Hickernell, T. Müller-Gronbach, B. Niu, K. Ritter, J. Complexity 26 (2010), 229-254], and our error bounds improve substantially on the error bounds presented there. As an illustrative example, we discuss the unanchored Sobolev space and employ randomized quasi-Monte Carlo multilevel algorithms based on scrambled polynomial lattice rules.
    Comment: 31 pages, 0 figures