A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices
We present the submatrix method, a highly parallelizable method for the
approximate calculation of inverse p-th roots of large sparse symmetric
matrices which are required in different scientific applications. We follow the
idea of Approximate Computing, allowing imprecision in the final result in
order to be able to utilize the sparsity of the input matrix and to allow
massively parallel execution. For an n x n matrix, the proposed algorithm
allows the calculations to be distributed over n nodes with little
communication overhead. The approximate result matrix exhibits the same
sparsity pattern as the input matrix, allowing for efficient reuse of allocated
data structures.
We evaluate the algorithm with respect to the error that it introduces into
calculated results, as well as its performance and scalability. We demonstrate
that the error is relatively limited for well-conditioned matrices and that
results are still valuable for error-resilient applications like
preconditioning even for ill-conditioned matrices. We discuss the execution
time and scaling of the algorithm on a theoretical level and present a
distributed implementation of the algorithm using MPI and OpenMP. We
demonstrate the scalability of this implementation by running it on a
high-performance compute cluster comprising 1024 CPU cores, showing a speedup
of 665x compared to single-threaded execution.
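The abstract above does not spell out the individual steps, so the following is only a minimal sketch of one common reading of a submatrix approach, stated as an assumption: for each column i, the rows and columns where column i is nonzero are gathered into a small dense submatrix, the inverse p-th root of that submatrix is computed, and only the column belonging to i is written back. The function names `inv_pth_root_dense` and `submatrix_inv_pth_root` are hypothetical, the matrix is stored densely for brevity, and it is assumed to be symmetric positive definite with a nonzero diagonal.

```python
import numpy as np


def inv_pth_root_dense(a, p):
    """Inverse p-th root of a symmetric positive definite matrix (eigendecomposition)."""
    w, v = np.linalg.eigh(a)
    # V * diag(w^(-1/p)) * V^T
    return (v * w ** (-1.0 / p)) @ v.T


def submatrix_inv_pth_root(a, p):
    """Approximate A^(-1/p) column by column (illustrative sketch, not the authors' code).

    Assumes `a` is symmetric positive definite with a nonzero diagonal and is
    stored as a dense array whose zeros encode the sparsity pattern.
    """
    n = a.shape[0]
    result = np.zeros_like(a, dtype=float)
    for i in range(n):  # iterations are independent, so they could run on separate nodes
        idx = np.flatnonzero(a[:, i])       # rows where column i is nonzero
        sub = a[np.ix_(idx, idx)]           # small dense principal submatrix
        sub_root = inv_pth_root_dense(sub, p)
        local = int(np.where(idx == i)[0][0])  # position of i inside the submatrix
        result[idx, i] = sub_root[:, local]    # keep only the column belonging to i
    return result


if __name__ == "__main__":
    a = np.array([[4.0, 1.0, 0.0],
                  [1.0, 4.0, 1.0],
                  [0.0, 1.0, 4.0]])
    approx = submatrix_inv_pth_root(a, 2)
    exact = inv_pth_root_dense(a, 2)
    print("max abs error:", np.abs(approx - exact).max())
```

Because each column touches only its own submatrix, the result has the same sparsity pattern as the input in this sketch, which matches the property the abstract describes; the error comes from ignoring entries outside each submatrix.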
Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library
We present CheSS, the “Chebyshev Sparse Solvers” library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers for arbitrary powers, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes.
We gratefully acknowledge the support of the MaX (SM) and POP (MW) projects, which have received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 676598 and 676553, respectively. This work was also supported by the Energy oriented Centre of Excellence (EoCoE), grant agreement number 676629, funded within the Horizon 2020 framework of the European Union, as well as by the Next-Generation Supercomputer project (the K computer project) and the FLAGSHIP2020 within the priority study5 (Development of new fundamental technologies for high-efficiency energy creation, conversion/storage and use) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan. We (LG, DC, WD, TN) gratefully acknowledge the joint CEA-RIKEN collaboration action.
Peer Reviewed. Postprint (author's final draft)
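As a rough illustration of the idea behind a Chebyshev expansion of a matrix function, the sketch below approximates f(H) for a symmetric matrix H whose eigenvalues lie in a known interval [a, b]. It is a generic textbook construction written with NumPy and dense arrays, not the CheSS API; the function names, the default `order`, and the choice of interval are assumptions made for this example.

```python
import numpy as np


def chebyshev_coefficients(f, a, b, order):
    """Chebyshev interpolation coefficients of f on [a, b] (generic sketch)."""
    m = order + 1
    theta = np.pi * (np.arange(m) + 0.5) / m
    x = np.cos(theta)                            # Chebyshev nodes in [-1, 1]
    fx = f(0.5 * (b - a) * x + 0.5 * (b + a))    # f sampled on [a, b]
    c = (2.0 / m) * np.cos(np.outer(np.arange(m), theta)) @ fx
    c[0] *= 0.5                                  # the T_0 term carries half weight
    return c


def chebyshev_matrix_function(h, f, a, b, order=40):
    """Approximate f(h) for symmetric h with spectrum inside [a, b]; needs order >= 1."""
    n = h.shape[0]
    c = chebyshev_coefficients(f, a, b, order)
    x = (2.0 * h - (a + b) * np.eye(n)) / (b - a)  # map spectrum to [-1, 1]
    t_prev, t_curr = np.eye(n), x
    result = c[0] * t_prev + c[1] * t_curr
    for j in range(2, order + 1):
        # three-term recurrence: T_{j} = 2 x T_{j-1} - T_{j-2}
        t_next = 2.0 * x @ t_curr - t_prev
        result += c[j] * t_next
        t_prev, t_curr = t_curr, t_next
    return result


if __name__ == "__main__":
    h = np.array([[4.0, 1.0, 0.0],
                  [1.0, 4.0, 1.0],
                  [0.0, 1.0, 4.0]])
    inv_sqrt = chebyshev_matrix_function(h, lambda t: t ** -0.5, 1.0, 7.0)
    print(np.abs(inv_sqrt @ inv_sqrt @ h - np.eye(3)).max())
```

The expansion needs only matrix-matrix products, which is why sparsity can be exploited in a real implementation. A narrower interval [a, b] lets a lower `order` reach the same accuracy, consistent with the abstract's remark about small spectral widths.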
Three real-space discretization techniques in electronic structure calculations
A characteristic feature of the state-of-the-art of real-space methods in
electronic structure calculations is the diversity of the techniques used in
the discretization of the relevant partial differential equations. In this
context, the main approaches include finite-difference methods, various types
of finite-elements and wavelets. This paper reports on the results of several
code development projects that approach problems related to the electronic
structure using these three different discretization methods. We review the
ideas behind these methods, give examples of their applications, and discuss
their similarities and differences.
Comment: 39 pages, 10 figures, accepted to a special issue of "physica status
solidi (b) - basic solid state physics" devoted to the CECAM workshop "State
of the art developments and perspectives of real-space electronic structure
techniques in condensed matter and molecular physics". v2: Minor stylistic
and typographical changes, partly inspired by referee comments.
Fast iterative solution of the Bethe–Salpeter eigenvalue problem using low-rank and QTT tensor approximation
In this paper, we propose and study two approaches to approximate the solution of the Bethe–Salpeter equation (BSE) by using structured iterative eigenvalue solvers. Both approaches are based on the reduced basis method and low-rank factorizations of the generating matrices. We also propose to represent the static screen interaction part in the BSE matrix by a small active sub-block, with a size balancing the storage for rank-structured representations of other matrix blocks. We demonstrate by various numerical tests that the combination of the diagonal plus low-rank plus reduced-block approximation exhibits higher precision with low numerical cost, providing as well a distinct two-sided error estimate for the smallest eigenvalues of the Bethe–Salpeter operator. The complexity is reduced to $O(N_b^2)$ in the size of the atomic orbitals basis set, $N_b$, instead of the practically intractable $O(N_b^6)$ scaling for the direct diagonalization. In the second approach, we apply the quantized-TT (QTT) tensor representation to both the long eigenvectors and the column vectors in the rank-structured BSE matrix blocks, and combine this with the ALS-type iteration in block QTT format. The QTT-rank of the matrix entities possesses almost the same magnitude as the number of occupied orbitals in the molecular systems, $N_o$, hence the overall asymptotic complexity for solving the BSE problem by the QTT approximation is estimated by $O(\log(N_o) N_o^2)$. We confirm numerically a considerable decrease in computational time for the presented iterative approaches applied to various compact and chain-type molecules, while supporting sufficient accuracy.
Sweeping Preconditioner for the Helmholtz Equation: Moving Perfectly Matched Layers
This paper introduces a new sweeping preconditioner for the iterative
solution of the variable coefficient Helmholtz equation in two and three
dimensions. The algorithms follow the general structure of constructing an
approximate factorization by eliminating the unknowns layer by layer
starting from an absorbing layer or boundary condition. The central idea of
this paper is to approximate the Schur complement matrices of the factorization
using moving perfectly matched layers (PMLs) introduced in the interior of the
domain. Applying each Schur complement matrix is equivalent to solving a
quasi-1D problem with a banded LU factorization in the 2D case and to solving a
quasi-2D problem with a multifrontal method in the 3D case. The resulting
preconditioner has linear application cost and the preconditioned iterative
solver converges in a number of iterations that is essentially independent of
the number of unknowns or the frequency. Numerical results are presented in
both two and three dimensions to demonstrate the efficiency of this new
preconditioner.
Comment: 25 pages