    Analytical Methods for Structured Matrix Computations

    The design of fast algorithms is not only about achieving speed but also about retaining the ability to control error and numerical stability, which is crucial to the reliability of computed numerical solutions. This dissertation studies topics in structured matrix computations with an emphasis on their numerical analysis and algorithms. The methods discussed are all based on rich, mathematically justified analytical results. In Chapter 2, we present a series of comprehensive error analyses for an analytical matrix compression method, which serve as a theoretical explanation of the proxy point method; these results also provide important guidance for optimizing its performance. In Chapter 3, we propose a non-Hermitian eigensolver that combines hierarchically semiseparable (HSS) matrix techniques with a contour-integral-based method; beyond algebraic manipulation of the HSS representation, probabilistic analysis enables further acceleration of the method. An application of HSS matrices is discussed in Chapter 4, where we design a structured preconditioner for linear systems generated by the augmented immersed interface method (AIIM). We improve the numerical stability of the matrix-free HSS construction process and make additional modifications tailored to this particular problem.
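    The proxy point idea analyzed in Chapter 2 can be illustrated in a few lines. Below is a minimal NumPy sketch, not the dissertation's actual method: the 2D log kernel, point counts, and radii are illustrative assumptions. It checks that the interaction between well-separated source and target sets factors accurately through a small ring of proxy points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sources inside a small disk, targets well separated from them
# (points encoded as complex numbers; all sizes are illustrative).
sources = 0.5 * (rng.random(200) + 1j * rng.random(200))
targets = 4.0 + rng.random(100) + 1j * rng.random(100)

# 2D Laplace kernel log|x - y|.
kernel = lambda x, y: np.log(np.abs(x[:, None] - y[None, :]))

# Proxy points on a circle enclosing the sources but not the targets.
proxies = 2.0 * np.exp(2j * np.pi * np.arange(30) / 30)

K = kernel(targets, sources)   # far-field block to be compressed
U = kernel(targets, proxies)   # targets vs. proxy ring

# The columns of K lie (approximately) in the range of U, so a
# least-squares fit K ~= U @ V yields a rank-30 factorization.
V, *_ = np.linalg.lstsq(U, K, rcond=None)
print("relative error:", np.linalg.norm(U @ V - K) / np.linalg.norm(K))
```

    The error analysis in Chapter 2 concerns precisely this type of approximation: how the compression error depends on the separation of the sets and the number of proxy points.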

    Robust algebraic Schur complement preconditioners based on low rank corrections

    In this paper we introduce LORASC, a robust algebraic preconditioner for solving sparse linear systems of equations involving symmetric positive definite matrices. The graph of the input matrix is partitioned using k-way partitioning with vertex separators into N disjoint domains and a separator formed by the vertices connecting the N domains. The resulting permuted matrix has a block-arrow structure. The preconditioner relies on the Cholesky factorization of the first N diagonal blocks and on an approximation of the Schur complement corresponding to the separator block. The approximation of the Schur complement involves the factorization of the last diagonal block and a low-rank correction obtained either by solving a generalized eigenvalue problem or by a randomized algorithm. The preconditioner can be built and applied in parallel. Numerical results on a set of matrices arising from the finite element discretization of linear elasticity models illustrate the robustness and efficiency of our preconditioner.
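    A dense, single-domain sketch can make the structure of such a preconditioner concrete. The following NumPy/SciPy code is illustrative only (the paper's LORASC handles N domains in parallel and selects eigenpairs by a spectral threshold rather than a fixed count k): it forms the Schur complement of the interior block and corrects inv(Agg) with a rank-k term built from the smallest eigenvalues of the generalized eigenproblem S u = w Agg u.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve, eigh

def lorasc_like_sinv(A, n_int, k=10):
    """Approximate inverse of the Schur complement of A's interior block
    (dense demo with one interior domain; A is symmetric positive
    definite and its last rows/columns correspond to the separator)."""
    Aii, Aig = A[:n_int, :n_int], A[:n_int, n_int:]
    Agi, Agg = A[n_int:, :n_int], A[n_int:, n_int:]

    # Exact Schur complement, affordable in this small dense demo.
    S = Agg - Agi @ cho_solve(cho_factor(Aii), Aig)

    # Generalized eigenproblem S u = w Agg u with U.T @ Agg @ U = I.
    # With the full eigenbasis, inv(S) = inv(Agg) + U diag(1/w - 1) U.T
    # exactly; the k smallest eigenvalues carry the dominant correction.
    w, U = eigh(S, Agg)
    Uk, wk = U[:, :k], w[:k]
    cg = cho_factor(Agg)

    def apply(r):
        return cho_solve(cg, r) + Uk @ ((1.0 / wk - 1.0) * (Uk.T @ r))
    return apply
```

    Applying this operator inside a Krylov iteration on the separator system mimics the role the preconditioner plays in the paper.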

    Conditioning Analysis of Incomplete Cholesky Factorizations with Orthogonal Dropping

    Improving multifrontal solvers by means of algebraic Block Low-Rank representations

    We consider the solution of large sparse linear systems by means of direct factorization based on a multifrontal approach. Although numerically robust and easy to use (they need only algebraic information, the input matrix A and a right-hand side b, though they can also exploit preprocessing strategies based on geometric information), direct factorization methods are computationally intensive in both memory and operations, which limits their scope on very large problems (matrices with up to a few hundred million equations). This work focuses on exploiting low-rank approximations in multifrontal direct methods to reduce both the memory footprint and the operation count, in sequential and distributed-memory environments, on a wide class of problems. We first survey the low-rank formats previously developed to represent dense matrices efficiently, which have been widely used to design fast solvers for partial differential equations, integral equations, and eigenvalue problems. These formats are hierarchical (H and Hierarchically SemiSeparable (HSS) matrices are the most common) and have been shown, both theoretically and practically, to substantially decrease the memory and operation requirements of linear algebra computations. However, they impose many structural constraints that can limit their scope and efficiency, especially in the context of general-purpose multifrontal solvers. We propose a flat format called Block Low-Rank (BLR), based on a natural blocking of the matrices, and explain why it provides all the flexibility needed by a general-purpose multifrontal solver in terms of numerical pivoting for stability and of parallelism. We compare the BLR format with the others and show that it sacrifices little of the memory and operation improvements achieved through low-rank approximations. A stability study shows that the approximations are well controlled by an explicit numerical parameter called the low-rank threshold, which is critical for solving the sparse linear system accurately. We then give details on how Block Low-Rank factorizations can be implemented efficiently within multifrontal solvers, and propose several Block Low-Rank factorization algorithms that allow for different types of gains. The proposed algorithms have been implemented within the MUMPS (MUltifrontal Massively Parallel Solver) solver. We first report experiments on standard problems based on partial differential equations to analyse the main features of our BLR algorithms and to show the potential and flexibility of the approach; a comparison with a Hierarchically SemiSeparable code is also given. Then, the Block Low-Rank format is tested on large (up to a hundred million unknowns) and varied problems coming from several industrial applications. We finally illustrate the use of our approach as a preconditioner for the Conjugate Gradient method.
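    To make the BLR format concrete, here is a minimal NumPy sketch, a standalone illustration rather than code from the MUMPS implementation: it tiles a dense matrix, compresses each tile with a truncated SVD governed by a low-rank threshold, and keeps a tile dense whenever compression would not pay off.

```python
import numpy as np

def blr_compress(A, tile=64, threshold=1e-8):
    """Tile A and store each tile either dense or as truncated-SVD
    factors (U, V), whichever is cheaper at the given threshold."""
    n = A.shape[0]
    tiles, stored = {}, 0
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            B = A[i:i+tile, j:j+tile]
            U, s, Vt = np.linalg.svd(B, full_matrices=False)
            r = int(np.sum(s > threshold * s[0]))        # numerical rank
            if r * (B.shape[0] + B.shape[1]) < B.size:   # low rank pays off
                tiles[i, j] = (U[:, :r] * s[:r], Vt[:r])
                stored += r * (B.shape[0] + B.shape[1])
            else:
                tiles[i, j] = B                          # keep dense
                stored += B.size
    print(f"storage: {stored / A.size:.2%} of dense")
    return tiles
```

    Because each tile is compressed independently, tile-level decisions such as pivoting and task scheduling remain local, which is the flexibility the flat format offers over hierarchical ones.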

    Fast algorithms for Brownian dynamics simulation with hydrodynamic interactions

    In Brownian dynamics simulation with hydrodynamic interactions, one needs to generate the total displacement vectors of the Brownian particles, consisting of two parts: a deterministic part proportional to the product of the Rotne-Prager-Yamakawa (RPY) tensor D [1, 2] and the given external forces F, and a hydrodynamically correlated random part whose covariance is proportional to the RPY tensor. More precisely, one needs to compute Du for a given vector u and √D v for a normally distributed random vector v. For an arbitrary N-particle configuration, D is a 3N × 3N matrix and u, v are vectors of length 3N. Classical algorithms therefore require O(N²) operations to compute Du and O(N³) operations to compute √D v, which is prohibitively expensive and renders large-scale simulations impossible, since these calculations must be carried out many times in a Brownian dynamics simulation. In this dissertation, we first present two fast multipole methods (FMMs) for computing Du. The first is a direct application of the kernel-independent FMM (KIFMM) developed by Ying, Biros, and Zorin [3], which requires 9 scalar FMM calls. The second, similar to the FMM for the Stokeslet developed by Tornberg and Greengard [4], decomposes the RPY tensor into harmonic potentials and their derivatives, and thus requires only four harmonic FMM calls. Both FMMs reduce the cost of computing Du from O(N²) to O(N) for an arbitrary N-particle configuration. We then discuss several methods for computing √D v, all based on Krylov subspace approximations, that is, replacing √D v by p(D)v with p(D) a low-degree polynomial in D. We first show rigorously that the popular Chebyshev spectral approximation method (see, for example, [5, 6]) requires O(√κ log(1/ε)) terms for a desired precision ε, where κ is the condition number of the RPY tensor D. In the Chebyshev spectral approximation method one also needs to estimate the extreme eigenvalues of D. We consider several methods: the classical Lanczos method, the Chebyshev-Davidson method, and the safeguarded Lanczos method proposed by Zhou and Li [7]. Our numerical experiments indicate that κ is usually very small when the particles are distributed uniformly with low density, and that the safeguarded Lanczos method is the most effective for our cases, with very little additional computational cost. Thus, when combined with the FMMs described earlier, the Chebyshev approximation method with the safeguarded Lanczos method as eigenvalue estimator essentially reduces the cost of computing √D v from O(N³) to O(N) for most practical particle configurations. Finally, we propose to combine the so-called spectral Lanczos decomposition method (SLDM) (see, for example, [8]) with the FMMs to compute √D v. Our numerical experiments show that the SLDM is generally more efficient than the popular Chebyshev spectral approximation method. The fast algorithms developed in this dissertation will be useful for the study of diffusion-limited reactions, polymer dynamics, protein folding, and particle coagulation, as they enable large-scale Brownian dynamics simulations. Moreover, the algorithms can be extended to speed up computations involving the matrix square root of many other matrices, with potential applications in areas such as statistical analysis with spatial correlations and model reduction in dynamic control theory.
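    The Chebyshev spectral approximation of √D v uses only matrix-vector products with D, which is what allows the FMM to be plugged in. The following sketch is a generic illustration of that technique, not the dissertation's code: it assumes eigenvalue bounds lmin and lmax are available (e.g., from the Lanczos-type estimators discussed above), and the term count m plays the role of the O(√κ log(1/ε)) bound.

```python
import numpy as np

def chebyshev_sqrt_mv(matvec, v, lmin, lmax, m=50):
    """Approximate sqrt(D) @ v with an m-term Chebyshev expansion of the
    square root on [lmin, lmax], using only matrix-vector products."""
    # Chebyshev coefficients of sqrt, sampled at the Chebyshev nodes of
    # [-1, 1] mapped to [lmin, lmax].
    k = np.arange(m)
    nodes = np.cos(np.pi * (k + 0.5) / m)
    f = np.sqrt(0.5 * (lmax - lmin) * nodes + 0.5 * (lmax + lmin))
    c = (2.0 / m) * np.cos(np.pi * np.outer(k, k + 0.5) / m) @ f
    c[0] *= 0.5

    # Map D to [-1, 1]: t(D) = (2 D - (lmax + lmin) I) / (lmax - lmin).
    t = lambda x: (2.0 * matvec(x) - (lmax + lmin) * x) / (lmax - lmin)

    # Three-term recurrence T_{j+1} = 2 t T_j - T_{j-1}.
    t_prev, t_curr = v, t(v)
    out = c[0] * t_prev + c[1] * t_curr
    for j in range(2, m):
        t_prev, t_curr = t_curr, 2.0 * t(t_curr) - t_prev
        out += c[j] * t_curr
    return out
```

    For a dense D one would call chebyshev_sqrt_mv(lambda x: D @ x, v, lmin, lmax); replacing the dense product with an O(N) FMM evaluation brings the total cost to O(mN).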