
    Locality-aware parallel block-sparse matrix-matrix multiplication using the Chunks and Tasks programming model

    We present a method for parallel block-sparse matrix-matrix multiplication on distributed memory clusters. By using a quadtree matrix representation, data locality is exploited without prior information about the matrix sparsity pattern. A distributed quadtree matrix representation is straightforward to implement due to our recent development of the Chunks and Tasks programming model [Parallel Comput. 40, 328 (2014)]. The quadtree representation combined with the Chunks and Tasks model leads to favorable weak and strong scaling of the communication cost with the number of processes, as shown both theoretically and in numerical experiments. Matrices are represented by sparse quadtrees of chunk objects. The leaves in the hierarchy are block-sparse submatrices. Sparsity is dynamically detected by the matrix library and may occur at any level in the hierarchy and/or within the submatrix leaves. When graphics processing units (GPUs) are available, both CPUs and GPUs are used for leaf-level multiplication work, thus making use of the full computing capacity of each node. The performance is evaluated for matrices with different sparsity structures, including examples from electronic structure calculations. Compared to methods that do not exploit data locality, our locality-aware approach reduces communication significantly, achieving essentially constant communication per node in weak scaling tests. Comment: 35 pages, 14 figures
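
The quadtree idea in the abstract above can be illustrated with a minimal serial sketch. This is not the Chunks and Tasks API and not the authors' implementation: the function names (`build`, `add`, `multiply`, `to_dense`), the leaf size `LEAF`, the use of `None` for an all-zero subtree, and dense leaf blocks are all assumptions made for illustration. There is no distribution, no GPU work, and no block-sparse leaf format here; the sketch only shows how zero subtrees at any level let the recursion skip work.

```python
# Hypothetical serial sketch of quadtree matrix multiplication.
# None stands for an all-zero subtree; leaves are small dense blocks.
LEAF = 2  # leaf block size (assumption; real codes use far larger blocks)

def build(dense, r0, c0, n):
    """Build a quadtree for the n x n submatrix of `dense` at (r0, c0)."""
    if n == LEAF:
        block = [row[c0:c0 + n] for row in dense[r0:r0 + n]]
        return block if any(v != 0 for row in block for v in row) else None
    h = n // 2
    kids = [build(dense, r0 + i * h, c0 + j * h, h)
            for i in (0, 1) for j in (0, 1)]  # order: 00, 01, 10, 11
    return kids if any(k is not None for k in kids) else None

def add(a, b, n):
    """Add two quadtrees of size n, treating None as zero."""
    if a is None:
        return b
    if b is None:
        return a
    if n == LEAF:
        return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return [add(x, y, n // 2) for x, y in zip(a, b)]

def multiply(a, b, n):
    """Multiply two quadtrees of size n, skipping zero subtrees."""
    if a is None or b is None:
        return None  # a zero factor gives a zero product: no work needed
    if n == LEAF:
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    h = n // 2
    kids = []
    for i in (0, 1):
        for j in (0, 1):
            c = None
            for k in (0, 1):  # C_ij = A_i0 * B_0j + A_i1 * B_1j
                c = add(c, multiply(a[2 * i + k], b[2 * k + j], h), h)
            kids.append(c)
    return kids if any(k is not None for k in kids) else None

def to_dense(t, n):
    """Expand a quadtree of size n back into a list-of-lists matrix."""
    if t is None:
        return [[0] * n for _ in range(n)]
    if n == LEAF:
        return [row[:] for row in t]
    h = n // 2
    q = [to_dense(k, h) for k in t]
    top = [q[0][r] + q[1][r] for r in range(h)]
    bottom = [q[2][r] + q[3][r] for r in range(h)]
    return top + bottom

# Small example: each input has at most two nonzero quadrants.
A = [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
B = [[0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 2, 0]]
C = to_dense(multiply(build(A, 0, 0, 4), build(B, 0, 0, 4), 4), 4)
```

In this sketch a product is pruned only when one of its factors is a zero subtree; a leaf product that happens to cancel to zero is kept as an explicit block, which is a simplification of the dynamic sparsity detection the abstract describes.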

    O(N) methods in electronic structure calculations

    Linear scaling methods, or O(N) methods, have computational and memory requirements which scale linearly with the number of atoms in the system, N, in contrast to standard approaches which scale with the cube of the number of atoms. These methods, which rely on the short-ranged nature of electronic structure, will allow accurate, ab initio simulations of systems of unprecedented size. The theory behind the locality of electronic structure is described and related to physical properties of systems to be modelled, along with a survey of recent developments in real-space methods which are important for efficient use of high performance computers. The linear scaling methods proposed to date can be divided into seven different areas, and the applicability, efficiency and advantages of the methods proposed in these areas are then discussed. The applications of linear scaling methods, as well as the implementations available as computer programs, are considered. Finally, the prospects for and the challenges facing linear scaling methods are discussed. Comment: 85 pages, 15 figures, 488 references. Resubmitted to Rep. Prog. Phys (small changes