Locality-aware parallel block-sparse matrix-matrix multiplication using the Chunks and Tasks programming model
We present a method for parallel block-sparse matrix-matrix multiplication on
distributed memory clusters. By using a quadtree matrix representation, data
locality is exploited without prior information about the matrix sparsity
pattern. A distributed quadtree matrix representation is straightforward to
implement due to our recent development of the Chunks and Tasks programming
model [Parallel Comput. 40, 328 (2014)]. The quadtree representation combined
with the Chunks and Tasks model leads to favorable weak and strong scaling of
the communication cost with the number of processes, as shown both
theoretically and in numerical experiments.
Matrices are represented by sparse quadtrees of chunk objects. The leaves in
the hierarchy are block-sparse submatrices. Sparsity is dynamically detected by
the matrix library and may occur at any level in the hierarchy and/or within
the submatrix leaves. When graphics processing units (GPUs) are available,
both CPUs and GPUs are used for leaf-level multiplication work, thus making use
of the full computing capacity of each node.
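
The quadtree representation described above can be illustrated with a minimal single-process sketch. It is not the authors' library and does not use the Chunks and Tasks model, distribution, or GPUs. The names `Node`, `build`, `add`, `multiply`, `to_dense` and the leaf size `LEAF` are hypothetical, and the leaves here are small dense NumPy blocks rather than block-sparse submatrices. A zero submatrix at any level is stored as `None`, so products involving it are skipped without inspecting its contents.

```python
import numpy as np

LEAF = 2  # assumed leaf block dimension; the abstract does not give a value


class Node:
    """Quadtree node: a dense leaf block, or four children (None = zero)."""

    def __init__(self, n, block=None, children=None):
        self.n = n                # dimension of the submatrix this node covers
        self.block = block        # set for leaves only
        self.children = children  # [c00, c01, c10, c11] for inner nodes


def build(a):
    """Build a quadtree from a square array of size LEAF * 2**k."""
    if not a.any():
        return None               # zero submatrix: nothing is stored
    n = a.shape[0]
    if n <= LEAF:
        return Node(n, block=a.copy())
    h = n // 2
    kids = [build(a[i:i + h, j:j + h]) for i in (0, h) for j in (0, h)]
    return Node(n, children=kids)


def add(x, y):
    """Add two quadtrees of equal dimension; None acts as zero."""
    if x is None:
        return y
    if y is None:
        return x
    if x.block is not None:
        return Node(x.n, block=x.block + y.block)
    return Node(x.n, children=[add(p, q) for p, q in zip(x.children, y.children)])


def multiply(x, y):
    """Recursive product; any zero operand prunes the whole subtree."""
    if x is None or y is None:
        return None
    if x.block is not None:
        return Node(x.n, block=x.block @ y.block)
    kids = []
    for i in range(2):
        for j in range(2):
            acc = None
            for k in range(2):    # C_ij = A_i0 B_0j + A_i1 B_1j
                acc = add(acc, multiply(x.children[2 * i + k], y.children[2 * k + j]))
            kids.append(acc)
    if all(c is None for c in kids):
        return None
    return Node(x.n, children=kids)


def to_dense(x, n):
    """Expand a quadtree back into a dense n-by-n array, for checking only."""
    if x is None:
        return np.zeros((n, n))
    if x.block is not None:
        return x.block
    h = n // 2
    out = np.zeros((n, n))
    for idx, c in enumerate(x.children):
        i, j = divmod(idx, 2)
        out[i * h:(i + 1) * h, j * h:(j + 1) * h] = to_dense(c, h)
    return out
```

For two banded 8-by-8 matrices `A` and `B`, `to_dense(multiply(build(A), build(B)), 8)` should agree with `A @ B` up to rounding, while the off-band quadrants are never visited.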
The performance is evaluated for matrices with different sparsity structures,
including examples from electronic structure calculations. Compared to methods
that do not exploit data locality, our locality-aware approach reduces
communication significantly, achieving essentially constant communication per
node in weak scaling tests.
Comment: 35 pages, 14 figures
O(N) methods in electronic structure calculations
Linear scaling methods, or O(N) methods, have computational and memory
requirements which scale linearly with the number of atoms in the system, N, in
contrast to standard approaches which scale with the cube of the number of
atoms. These methods, which rely on the short-ranged nature of electronic
structure, will allow accurate, ab initio simulations of systems of
unprecedented size. The theory behind the locality of electronic structure is
described and related to physical properties of systems to be modelled, along
with a survey of recent developments in real-space methods which are important
for efficient use of high performance computers. The linear scaling methods
proposed to date can be divided into seven different areas, and the
applicability, efficiency and advantages of the methods proposed in these areas
are then discussed. The applications of linear scaling methods, as well as the
implementations available as computer programs, are considered. Finally, the
prospects for and the challenges facing linear scaling methods are discussed.
Comment: 85 pages, 15 figures, 488 references. Resubmitted to Rep. Prog. Phys
(small changes
- …