
    Performance analysis of electronic structure codes on HPC systems: A case study of SIESTA

    We report on scaling and timing tests of the SIESTA electronic structure code for ab initio molecular dynamics simulations using density-functional theory. The tests are performed on six large-scale supercomputers belonging to the PRACE Tier-0 network with four different architectures: Cray XE6, IBM BlueGene/Q, BullX, and IBM iDataPlex. We employ a systematic strategy for simultaneously testing weak and strong scaling, and propose a measure which is independent of the range of number of cores on which the tests are performed to quantify strong scaling efficiency as a function of simulation size. We find an increase in efficiency with simulation size for all machines, with a qualitatively different curve depending on the supercomputer topology, and discuss the connection of this functional form with weak scaling behaviour. We also analyze the absolute timings obtained in our tests, showing the range of system sizes and cores favourable for different machines. Our results can be employed as a guide both for running SIESTA on parallel architectures, and for executing similar scaling tests of other electronic structure codes.
    Comment: 9 pages, 9 figures
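For readers unfamiliar with the terms, the sketch below computes the textbook strong and weak scaling efficiencies from wall-clock timings. It is not the range-independent measure proposed in the paper, which the abstract does not define; the function names and the timing values are hypothetical and chosen only for illustration.

```python
def strong_scaling_efficiency(timings):
    """Textbook strong-scaling efficiency for a FIXED problem size.

    timings maps core count -> wall time. The smallest core count is
    used as the baseline: E(p) = (T_base * p_base) / (T(p) * p).
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    return {
        p: (base_time * base_cores) / (t * p)
        for p, t in sorted(timings.items())
    }


def weak_scaling_efficiency(timings):
    """Textbook weak-scaling efficiency, assuming the problem size
    grows in proportion to the core count: E(p) = T_base / T(p).
    """
    base_time = timings[min(timings)]
    return {p: base_time / t for p, t in sorted(timings.items())}


# Hypothetical timings in seconds (not taken from the paper).
example_timings = {64: 100.0, 128: 55.0, 256: 32.0}
efficiency = strong_scaling_efficiency(example_timings)
for cores, value in efficiency.items():
    print(f"{cores:4d} cores: efficiency {value:.3f}")
```

An efficiency of 1.0 means ideal scaling relative to the baseline run; values below 1.0 indicate parallel overhead.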

    Increasing the Efficiency of Sparse Matrix-Matrix Multiplication with a 2.5D Algorithm and One-Sided MPI

    Matrix-matrix multiplication is a basic operation in linear algebra and an essential building block for a wide range of algorithms in various scientific fields. Theory and implementation for the dense, square matrix case are well-developed. If matrices are sparse, with application-specific sparsity patterns, the optimal implementation remains an open question. Here, we explore the performance of communication-reducing 2.5D algorithms and one-sided MPI communication in the context of linear scaling electronic structure theory. In particular, we extend the DBCSR sparse matrix library, which is the basic building block for linear scaling electronic structure theory and low scaling correlated methods in CP2K. The library is specifically designed to efficiently perform block-sparse matrix-matrix multiplication of matrices with a relatively large occupation. Here, we compare the performance of the original implementation based on Cannon's algorithm and MPI point-to-point communication, with an implementation based on MPI one-sided communications (RMA), in both a 2D and a 2.5D approach. The 2.5D approach trades memory and auxiliary operations for reduced communication, which can lead to a speedup if communication is dominant. The 2.5D algorithm is somewhat easier to implement with one-sided communications. A detailed description of the implementation is provided, also for non-ideal processor topologies, since this is important for actual applications. Given the importance of the precise sparsity pattern, and even the actual matrix data, which decides the effective fill-in upon multiplication, the tests are performed within the CP2K package with application benchmarks. Results show a substantial boost in performance for the RMA-based 2.5D algorithm, up to 1.80x, which is observed to increase with the number of involved processes in the parallelization.
    Comment: In Proceedings of PASC '17, Lugano, Switzerland, June 26-28, 2017, 10 pages, 4 figures
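As background on the 2D baseline named above, here is a minimal serial emulation of Cannon's algorithm on a q x q process grid. It is a sketch only, not DBCSR's implementation: there is no MPI, each "process" holds a single scalar where a real code would hold a (sparse) matrix block, and the function names are hypothetical.

```python
def cannon_matmul(A, B):
    """Serially emulate Cannon's algorithm on a q x q grid.

    Process (i, j) owns one entry of A, B and C. A real distributed
    implementation would own matrix blocks and shift them with MPI.
    """
    q = len(A)
    # Initial skew: shift row i of A left by i, column j of B up by j,
    # so that process (i, j) starts with A[i][k] and B[k][j], k = i + j.
    a = [[A[i][(j + i) % q] for j in range(q)] for i in range(q)]
    b = [[B[(i + j) % q][j] for j in range(q)] for i in range(q)]
    c = [[0] * q for _ in range(q)]
    for _ in range(q):
        # Local multiply-accumulate on every process.
        for i in range(q):
            for j in range(q):
                c[i][j] += a[i][j] * b[i][j]
        # Circular shift: A one step left along rows, B one step up
        # along columns (nearest-neighbour communication in practice).
        a = [[a[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        b = [[b[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    return c


def naive_matmul(A, B):
    """Reference triple-loop product used to check the emulation."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
assert cannon_matmul(A, B) == naive_matmul(A, B)
```

Each of the q steps moves every block once, which is the communication cost that the 2.5D variant described in the abstract reduces by spending extra memory.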

    ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large-scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. Here we present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed-memory supercomputing architectures.
    Comment: 55 pages, 14 figures, 2 tables
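For orientation, the generalized eigenproblem mentioned above can be written as follows, together with the standard textbook reduction to an ordinary eigenproblem through a Cholesky factorization of the overlap matrix. The symbols (H for the Hamiltonian matrix, S for the overlap matrix, c_i and epsilon_i for eigenvectors and eigenvalues) are notation assumed here, and nothing below is a statement about how ELSI or its back-end solvers work internally.

```latex
% Generalized Kohn-Sham eigenproblem in a non-orthogonal basis
H c_i = \varepsilon_i \, S c_i ,
\qquad S = S^{\dagger} \ \text{positive definite}.

% Cholesky factorization of the overlap matrix
S = L L^{\dagger}.

% Equivalent standard eigenproblem with the same eigenvalues
\tilde{H} \tilde{c}_i = \varepsilon_i \, \tilde{c}_i ,
\qquad \tilde{H} = L^{-1} H L^{-\dagger},
\qquad \tilde{c}_i = L^{\dagger} c_i .
```

For dense matrices the factorization, the transformation and the diagonalization each scale cubically with the number of basis functions, which is why this step tends to dominate large calculations.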

    An electronic model for self-assembled hybrid organic/perovskite semiconductors: reverse band edge electronic states ordering and spin-orbit coupling

    Based on density functional theory, the electronic and optical properties of hybrid organic/perovskite crystals are thoroughly investigated. We consider the mono-crystalline 4FPEPI as a material model and demonstrate that the optical process is governed by three active Bloch states at the Γ point of the reduced Brillouin zone with a reverse ordering compared to tetrahedrally bonded semiconductors. Giant spin-orbit coupling effects and optical activities are subsequently inferred from symmetry analysis.
    Comment: 17 pages, 6 figures