Performance analysis of electronic structure codes on HPC systems: A case study of SIESTA
We report on scaling and timing tests of the SIESTA electronic structure code
for ab initio molecular dynamics simulations using density-functional theory.
The tests are performed on six large-scale supercomputers belonging to the
PRACE Tier-0 network with four different architectures: Cray XE6, IBM
BlueGene/Q, BullX, and IBM iDataPlex. We employ a systematic strategy for
simultaneously testing weak and strong scaling, and propose a measure which is
independent of the range of number of cores on which the tests are performed to
quantify strong scaling efficiency as a function of simulation size. We find an
increase in efficiency with simulation size for all machines, with a
qualitatively different curve depending on the supercomputer topology, and
discuss the connection of this functional form with weak scaling behaviour. We
also analyze the absolute timings obtained in our tests, showing the range of
system sizes and cores favourable for different machines. Our results can be
employed as a guide both for running SIESTA on parallel architectures, and for
executing similar scaling tests of other electronic structure codes.
Comment: 9 pages, 9 figures
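As a rough illustration of the strong-scaling efficiency discussed in the abstract above, the sketch below computes the textbook definition, E(N) = (T_ref · N_ref) / (T_N · N), for a fixed problem size. This is not the range-independent measure the authors propose, which the abstract does not specify; the function name and the timing values are hypothetical and serve only to show the arithmetic.

```python
def strong_scaling_efficiency(timings):
    """Textbook strong-scaling efficiency for a fixed problem size.

    timings: dict mapping core count -> wall time in seconds.
    The smallest core count is used as the reference run.
    """
    ref_cores = min(timings)
    ref_time = timings[ref_cores]
    return {
        cores: (ref_time * ref_cores) / (time * cores)
        for cores, time in sorted(timings.items())
    }


# Hypothetical timings (not taken from the paper): cores -> seconds.
example_timings = {16: 100.0, 32: 55.0, 64: 32.0}
efficiency = strong_scaling_efficiency(example_timings)

for cores, value in efficiency.items():
    print(f"{cores:4d} cores: efficiency = {value:.3f}")
```

An efficiency of 1.0 means doubling the cores halves the run time; values below 1.0 indicate the parallel overhead that such scaling tests aim to quantify.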
Increasing the Efficiency of Sparse Matrix-Matrix Multiplication with a 2.5D Algorithm and One-Sided MPI
Matrix-matrix multiplication is a basic operation in linear algebra and an
essential building block for a wide range of algorithms in various scientific
fields. Theory and implementation for the dense, square matrix case are
well-developed. If matrices are sparse, with application-specific sparsity
patterns, the optimal implementation remains an open question. Here, we explore
the performance of communication-reducing 2.5D algorithms and one-sided MPI
communication in the context of linear-scaling electronic structure theory. In
particular, we extend the DBCSR sparse matrix library, which is the basic
building block for linear-scaling electronic structure theory and low-scaling
correlated methods in CP2K. The library is specifically designed to efficiently
perform block-sparse matrix-matrix multiplication of matrices with a relatively
large occupation. Here, we compare the performance of the original
implementation based on Cannon's algorithm and MPI point-to-point
communication, with an implementation based on MPI one-sided communications
(RMA), in both a 2D and a 2.5D approach. The 2.5D approach trades memory and
auxiliary operations for reduced communication, which can lead to a speedup if
communication is dominant. The 2.5D algorithm is somewhat easier to implement
with one-sided communications. A detailed description of the implementation is
provided, including for non-ideal processor topologies, since this is important for
actual applications. Given the importance of the precise sparsity pattern, and
even the actual matrix data, which decides the effective fill-in upon
multiplication, the tests are performed within the CP2K package with
application benchmarks. Results show a substantial boost in performance for the
RMA-based 2.5D algorithm, up to 1.80x, which is observed to increase with the
number of involved processes in the parallelization.
Comment: In Proceedings of PASC '17, Lugano, Switzerland, June 26-28, 2017, 10
pages, 4 figures
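For readers unfamiliar with Cannon's algorithm, which the abstract above names as the basis of the original implementation, the following is a minimal serial simulation on a q x q process grid. It is a sketch under simplifying assumptions: each simulated process holds a single scalar rather than a sparse block, the shifts are plain list rotations rather than MPI messages, and the function names are hypothetical. It does not reproduce DBCSR's implementation or the 2.5D variant.

```python
def cannon_matmul(A, B):
    """Simulate Cannon's algorithm for q x q matrices on a q x q grid.

    Process (i, j) holds one element of A and one of B at a time.
    """
    q = len(A)
    # Initial skew: row i of A shifts left by i, column j of B shifts up by j.
    a = [[A[i][(j + i) % q] for j in range(q)] for i in range(q)]
    b = [[B[(i + j) % q][j] for j in range(q)] for i in range(q)]
    C = [[0] * q for _ in range(q)]
    for _ in range(q):
        # Local multiply-accumulate on every simulated process.
        for i in range(q):
            for j in range(q):
                C[i][j] += a[i][j] * b[i][j]
        # Shift A one step left along rows and B one step up along columns.
        a = [[a[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        b = [[b[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    return C


def naive_matmul(A, B):
    """Reference triple-loop product used to check the simulation."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
print(cannon_matmul(A, B) == naive_matmul(A, B))
```

Each of the q steps needs only nearest-neighbour shifts, which is why the 2D scheme maps naturally onto point-to-point communication; the 2.5D approach described in the abstract replicates data to reduce the volume of those shifts.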
ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers
Solving the electronic structure from a generalized or standard eigenproblem
is often the bottleneck in large-scale calculations based on Kohn-Sham
density-functional theory. This problem must be addressed by essentially all
current electronic structure codes, based on similar matrix expressions, and by
high-performance computation. Here we present a unified software interface,
ELSI, to access different strategies that address the Kohn-Sham eigenvalue
problem. Currently supported algorithms include the dense generalized
eigensolver library ELPA, the orbital minimization method implemented in
libOMM, and the pole expansion and selected inversion (PEXSI) approach with
lower computational complexity for semilocal density functionals. The ELSI
interface aims to simplify the implementation and optimal use of the different
strategies, by offering (a) a unified software framework designed for the
electronic structure solvers in Kohn-Sham density-functional theory; (b)
reasonable default parameters for a chosen solver; (c) automatic conversion
between input and internal working matrix formats, and in the future (d)
recommendation of the optimal solver depending on the specific problem.
Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800
basis functions) on distributed memory supercomputing architectures.
Comment: 55 pages, 14 figures, 2 tables
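The generalized eigenproblem mentioned in the abstract above has the form H c = ε S c, with S symmetric positive definite. A common way to reduce it to a standard problem is a Cholesky factorization S = L Lᵀ, giving (L⁻¹ H L⁻ᵀ) y = ε y. The sketch below works this through for a 2 x 2 case in plain Python. It does not use or imitate the ELSI, ELPA, libOMM, or PEXSI interfaces; all function names and the matrix values are hypothetical.

```python
import math


def cholesky_2x2(S):
    """Lower-triangular L with S = L L^T for a 2x2 symmetric positive definite S."""
    l11 = math.sqrt(S[0][0])
    l21 = S[1][0] / l11
    l22 = math.sqrt(S[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def reduce_to_standard(H, S):
    """Return A = L^-1 H L^-T, whose eigenvalues solve H c = eps S c."""
    L = cholesky_2x2(S)
    l11, l21, l22 = L[0][0], L[1][0], L[1][1]
    # Explicit inverse of the 2x2 lower-triangular factor.
    Li = [[1.0 / l11, 0.0], [-l21 / (l11 * l22), 1.0 / l22]]
    T = [[sum(Li[i][k] * H[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return [[sum(T[i][k] * Li[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def eigvals_sym_2x2(A):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, in ascending order."""
    mean = 0.5 * (A[0][0] + A[1][1])
    radius = math.hypot(0.5 * (A[0][0] - A[1][1]), A[0][1])
    return (mean - radius, mean + radius)


# Hypothetical Hamiltonian-like and overlap-like matrices.
H = [[-1.0, -0.5], [-0.5, -0.5]]
S = [[1.0, 0.2], [0.2, 1.0]]
eigenvalues = eigvals_sym_2x2(reduce_to_standard(H, S))
print(eigenvalues)
```

Production solvers apply the same idea to distributed matrices with many thousands of rows, which is where the choice among dense, orbital-minimization, and selected-inversion strategies becomes significant.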
An electronic model for self-assembled hybrid organic/perovskite semiconductors: reverse band edge electronic states ordering and spin-orbit coupling
Based on density functional theory, the electronic and optical properties of
hybrid organic/perovskite crystals are thoroughly investigated. We consider
mono-crystalline 4FPEPI as a model material and demonstrate that the optical process
is governed by three active Bloch states at the Γ point of the reduced
Brillouin zone with a reverse ordering compared to tetrahedrally bonded
semiconductors. Giant spin-orbit coupling effects and optical activities are
subsequently inferred from symmetry analysis.
Comment: 17 pages, 6 figures