169 research outputs found
Multicore Architecture-aware Scientific Applications
Modern high performance systems are becoming increasingly complex and powerful due to advancements in processor and memory architecture. In order to keep up with this increasing complexity, applications have to be augmented with certain capabilities to fully exploit such systems. These may be at the application level, such as static or dynamic adaptations or at the system level, like having strategies in place to override some of the default operating system polices, the main objective being to improve computational performance of the application. The current work proposes two such capabilites with respect to multi-threaded scientific applications, in particular a large scale physics application computing ab-initio nuclear structure. The first involves using a middleware tool to invoke dynamic adaptations in the application, so as to be able to adjust to the changing computational resource availability at run-time. The second involves a strategy for effective placement of data in main memory, to optimize memory access latencies and bandwidth. These capabilties when included were found to have a significant impact on the application performance, resulting in average speedups of as much as two to four times
Computational Physics on Graphics Processing Units
The use of graphics processing units for scientific computations is an
emerging strategy that can significantly speed up various different algorithms.
In this review, we discuss advances made in the field of computational physics,
focusing on classical molecular dynamics, and on quantum simulations for
electronic structure calculations using the density functional theory, wave
function techniques, and quantum field theory.Comment: Proceedings of the 11th International Conference, PARA 2012,
Helsinki, Finland, June 10-13, 201
Quantum ESPRESSO: a modular and open-source software project for quantum simulations of materials
Quantum ESPRESSO is an integrated suite of computer codes for
electronic-structure calculations and materials modeling, based on
density-functional theory, plane waves, and pseudopotentials (norm-conserving,
ultrasoft, and projector-augmented wave). Quantum ESPRESSO stands for "opEn
Source Package for Research in Electronic Structure, Simulation, and
Optimization". It is freely available to researchers around the world under the
terms of the GNU General Public License. Quantum ESPRESSO builds upon
newly-restructured electronic-structure codes that have been developed and
tested by some of the original authors of novel electronic-structure algorithms
and applied in the last twenty years by some of the leading materials modeling
groups worldwide. Innovation and efficiency are still its main focus, with
special attention paid to massively-parallel architectures, and a great effort
being devoted to user friendliness. Quantum ESPRESSO is evolving towards a
distribution of independent and inter-operable codes in the spirit of an
open-source project, where researchers active in the field of
electronic-structure calculations are encouraged to participate in the project
by contributing their own codes or by implementing their own ideas into
existing codes.Comment: 36 pages, 5 figures, resubmitted to J.Phys.: Condens. Matte
On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters
The predominance of Kohn-Sham density functional theory (KS-DFT) for the
theoretical treatment of large experimentally relevant systems in molecular
chemistry and materials science relies primarily on the existence of efficient
software implementations which are capable of leveraging the latest advances in
modern high performance computing (HPC). With recent trends in HPC leading
towards in increasing reliance on heterogeneous accelerator based architectures
such as graphics processing units (GPU), existing code bases must embrace these
architectural advances to maintain the high-levels of performance which have
come to be expected for these methods. In this work, we purpose a three-level
parallelism scheme for the distributed numerical integration of the
exchange-correlation (XC) potential in the Gaussian basis set discretization of
the Kohn-Sham equations on large computing clusters consisting of multiple GPUs
per compute node. In addition, we purpose and demonstrate the efficacy of the
use of batched kernels, including batched level-3 BLAS operations, in achieving
high-levels of performance on the GPU. We demonstrate the performance and
scalability of the implementation of the purposed method in the NWChemEx
software package by comparing to the existing scalable CPU XC integration in
NWChem.Comment: 26 pages, 9 figure
- …