
    A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices

    We present the submatrix method, a highly parallelizable method for the approximate calculation of inverse p-th roots of large sparse symmetric matrices, which are required in different scientific applications. We follow the idea of Approximate Computing, accepting imprecision in the final result in order to exploit the sparsity of the input matrix and to allow massively parallel execution. For an n × n matrix, the proposed algorithm allows the calculations to be distributed over n nodes with only little communication overhead. The approximate result matrix exhibits the same sparsity pattern as the input matrix, allowing for efficient reuse of allocated data structures. We evaluate the algorithm with respect to the error that it introduces into calculated results, as well as its performance and scalability. We demonstrate that the error is relatively limited for well-conditioned matrices and that results are still valuable for error-resilient applications like preconditioning even for ill-conditioned matrices. We discuss the execution time and scaling of the algorithm on a theoretical level and present a distributed implementation of the algorithm using MPI and OpenMP. We demonstrate the scalability of this implementation by running it on a high-performance compute cluster comprising 1024 CPU cores, showing a speedup of 665x compared to single-threaded execution.
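The abstract does not spell out how the submatrices are formed, so the following is only a minimal single-process sketch under stated assumptions: for each column i, the principal submatrix induced by the nonzero rows of that column is extracted, its inverse p-th root is computed densely, and the column belonging to i is copied back into the result. The function names `dense_inv_pth_root` and `submatrix_inv_pth_root` are hypothetical, the dense root uses an eigendecomposition (which assumes symmetric positive definite submatrices), and the MPI/OpenMP distribution mentioned in the abstract is not reproduced.

```python
import numpy as np
import scipy.sparse as sp


def dense_inv_pth_root(S, p):
    """Inverse p-th root of a small dense symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    # Scale each eigenvector column by eigenvalue^(-1/p), then transform back.
    return (V * w ** (-1.0 / p)) @ V.T


def submatrix_inv_pth_root(A, p):
    """Approximate A^(-1/p) column by column (illustrative sketch only).

    Assumes A is sparse, symmetric, and stores every diagonal entry.
    """
    A = sp.csc_matrix(A)
    n = A.shape[0]
    rows, cols, vals = [], [], []
    for i in range(n):  # every iteration is independent, hence parallelizable
        # Row indices of the stored entries in column i.
        idx = np.sort(A.indices[A.indptr[i]:A.indptr[i + 1]])
        # Dense principal submatrix induced by those indices.
        S = A[idx, :][:, idx].toarray()
        R = dense_inv_pth_root(S, p)
        # Position of the original column i inside the submatrix.
        k = np.searchsorted(idx, i)
        rows.extend(idx)
        cols.extend([i] * len(idx))
        vals.extend(R[:, k])
    # The result reuses the sparsity pattern of A.
    return sp.csc_matrix((vals, (rows, cols)), shape=(n, n))
```

For a block-diagonal input each submatrix is exactly one block, so the sketch reproduces the exact inverse root; for general sparse matrices the result is only an approximation whose quality depends on the conditioning, as the abstract notes.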

    Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library

    We present CheSS, the “Chebyshev Sparse Solvers” library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers with arbitrary exponents, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes. We gratefully acknowledge the support of the MaX (SM) and POP (MW) projects, which have received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No. 676598 and No. 676553, respectively. This work was also supported by the Energy oriented Centre of Excellence (EoCoE), grant agreement number 676629, funded within the Horizon 2020 framework of the European Union, as well as by the Next-Generation Supercomputer project (the K computer project) and the FLAGSHIP2020 within the priority study 5 (Development of new fundamental technologies for high-efficiency energy creation, conversion/storage and use) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan. We (LG, DC, WD, TN) gratefully acknowledge the joint CEA-RIKEN collaboration action. Peer reviewed. Postprint (author's final draft).
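As a rough illustration of the Chebyshev-expansion idea, and not of the CheSS API itself, the sketch below approximates a matrix function f(A) for a symmetric matrix whose eigenvalues lie in a known interval [lo, hi]. The function name `cheb_matrix_function`, its parameters, and the use of dense NumPy arrays are assumptions made for brevity; a library such as CheSS would operate on sparse matrices instead.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def cheb_matrix_function(A, f, lo, hi, degree=40):
    """Approximate f(A) by a Chebyshev expansion (illustrative sketch only).

    Assumes A is symmetric with all eigenvalues inside [lo, hi]
    and that degree >= 1.
    """
    n = A.shape[0]
    I = np.eye(n)
    # Affine map that sends the spectrum from [lo, hi] to [-1, 1].
    At = (2.0 * A - (hi + lo) * I) / (hi - lo)
    # Chebyshev coefficients of f composed with the inverse map.
    coeffs = C.chebinterpolate(
        lambda x: f(0.5 * (hi - lo) * x + 0.5 * (hi + lo)), degree
    )
    # Three-term recurrence: T_0 = I, T_1 = At, T_{k+1} = 2 At T_k - T_{k-1}.
    T_prev, T_curr = I, At
    result = coeffs[0] * T_prev + coeffs[1] * T_curr
    for c in coeffs[2:]:
        T_prev, T_curr = T_curr, 2.0 * At @ T_curr - T_prev
        result = result + c * T_curr
    return result
```

A call such as `cheb_matrix_function(A, lambda x: x ** -0.5, lo, hi)` would approximate the inverse square root. The polynomial degree needed for a given accuracy grows with the width of [lo, hi] relative to the distance from any singularity of f, which is consistent with the abstract's remark that the approach suits small spectral widths.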

    Three real-space discretization techniques in electronic structure calculations

    A characteristic feature of the state of the art of real-space methods in electronic structure calculations is the diversity of the techniques used in the discretization of the relevant partial differential equations. In this context, the main approaches include finite-difference methods, various types of finite elements, and wavelets. This paper reports on the results of several code development projects that approach problems related to the electronic structure using these three different discretization methods. We review the ideas behind these methods, give examples of their applications, and discuss their similarities and differences. Comment: 39 pages, 10 figures, accepted to a special issue of "physica status solidi (b) - basic solid state physics" devoted to the CECAM workshop "State of the art developments and perspectives of real-space electronic structure techniques in condensed matter and molecular physics". v2: Minor stylistic and typographical changes, partly inspired by referee comments.

    Fast iterative solution of the Bethe–Salpeter eigenvalue problem using low-rank and QTT tensor approximation

    In this paper, we propose and study two approaches to approximate the solution of the Bethe–Salpeter equation (BSE) by using structured iterative eigenvalue solvers. Both approaches are based on the reduced basis method and low-rank factorizations of the generating matrices. We also propose to represent the static screen interaction part in the BSE matrix by a small active sub-block, with a size balancing the storage for rank-structured representations of other matrix blocks. We demonstrate by various numerical tests that the combination of the diagonal plus low-rank plus reduced-block approximation exhibits higher precision with low numerical cost, providing as well a distinct two-sided error estimate for the smallest eigenvalues of the Bethe–Salpeter operator. The complexity is reduced to $O(N_b^2)$ in the size of the atomic orbitals basis set, $N_b$, instead of the practically intractable $O(N_b^6)$ scaling for the direct diagonalization. In the second approach, we apply the quantized-TT (QTT) tensor representation to both the long eigenvectors and the column vectors in the rank-structured BSE matrix blocks, and combine this with the ALS-type iteration in block QTT format. The QTT-rank of the matrix entities possesses almost the same magnitude as the number of occupied orbitals in the molecular systems, $N_o < N_b$, hence the overall asymptotic complexity for solving the BSE problem by the QTT approximation is estimated by $O(\log(N_o) N_o^2)$. We confirm numerically a considerable decrease in computational time for the presented iterative approaches applied to various compact and chain-type molecules, while supporting sufficient accuracy.

    Sweeping Preconditioner for the Helmholtz Equation: Moving Perfectly Matched Layers

    This paper introduces a new sweeping preconditioner for the iterative solution of the variable coefficient Helmholtz equation in two and three dimensions. The algorithms follow the general structure of constructing an approximate $LDL^t$ factorization by eliminating the unknowns layer by layer starting from an absorbing layer or boundary condition. The central idea of this paper is to approximate the Schur complement matrices of the factorization using moving perfectly matched layers (PMLs) introduced in the interior of the domain. Applying each Schur complement matrix is equivalent to solving a quasi-1D problem with a banded LU factorization in the 2D case and to solving a quasi-2D problem with a multifrontal method in the 3D case. The resulting preconditioner has linear application cost and the preconditioned iterative solver converges in a number of iterations that is essentially independent of the number of unknowns or the frequency. Numerical results are presented in both two and three dimensions to demonstrate the efficiency of this new preconditioner. Comment: 25 pages.
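For readers unfamiliar with layer-by-layer elimination, the exact block $LDL^t$ factorization that the preconditioner approximates can be written as follows. The notation is an assumption rather than taken from the abstract: the unknowns are grouped into $m$ layers so that the matrix $A$ is symmetric block tridiagonal with blocks $A_{k,k}$ and $A_{k,k-1} = A_{k-1,k}^t$.

```latex
% Schur complements produced by eliminating the layers one at a time
S_1 = A_{1,1}, \qquad
S_k = A_{k,k} - A_{k,k-1}\, S_{k-1}^{-1}\, A_{k-1,k}, \quad k = 2, \dots, m.

% Resulting factorization: D is block diagonal, L is unit lower block bidiagonal
A = L D L^t, \qquad
D = \operatorname{diag}(S_1, \dots, S_m), \qquad
L_{k,k} = I, \quad L_{k,k-1} = A_{k,k-1}\, S_{k-1}^{-1}.
```

Each $S_k^{-1}$ is dense in general, which is what makes the exact factorization expensive; the abstract's proposal is to approximate the action of these Schur complements using moving PMLs.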