Optimizing GPAW

Abstract

GPAW is a versatile software package for first-principles simulations of nanostructures using density-functional theory and time-dependent density-functional theory. Even though GPAW is already used for massively parallel calculations on several supercomputer systems, some performance bottlenecks still exist. First, the implementation based on the Python programming language introduces an I/O bottleneck during initialization, which becomes serious when using thousands of CPU cores. Second, the current linear-response time-dependent density-functional theory implementation contains a large matrix that is replicated on all CPUs. As the simulated systems grow larger, this replication causes memory to run out. In this report, we discuss the work done on resolving these bottlenecks. In addition, we have worked on optimization aspects aimed more at future usage. As the number of cores in multicore CPUs is still increasing, a hybrid parallelization combining shared-memory and distributed-memory parallelization is becoming appealing. We have experimented with hybrid OpenMP/MPI and report here the initial results. GPAW also performs large dense matrix diagonalizations with the ScaLAPACK library. Due to limitations in ScaLAPACK, these diagonalizations are expected to become a bottleneck in the future, which has led us to investigate alternatives for the ScaLAPACK

    Last time updated on 04/01/2018