On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters
The predominance of Kohn-Sham density functional theory (KS-DFT) for the
theoretical treatment of large, experimentally relevant systems in molecular
chemistry and materials science relies primarily on the existence of efficient
software implementations which are capable of leveraging the latest advances in
modern high performance computing (HPC). With recent trends in HPC leading
towards an increasing reliance on heterogeneous, accelerator-based architectures
such as graphics processing units (GPUs), existing code bases must embrace these
architectural advances to maintain the high levels of performance which have
come to be expected for these methods. In this work, we propose a three-level
parallelism scheme for the distributed numerical integration of the
exchange-correlation (XC) potential in the Gaussian basis set discretization of
the Kohn-Sham equations on large computing clusters consisting of multiple GPUs
per compute node. In addition, we propose and demonstrate the efficacy of the
use of batched kernels, including batched level-3 BLAS operations, in achieving
high levels of performance on the GPU. We demonstrate the performance and
scalability of the implementation of the proposed method in the NWChemEx
software package by comparing to the existing scalable CPU XC integration in
NWChem.
Comment: 26 pages, 9 figures
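The batched level-3 BLAS idea above can be sketched in a few lines: instead of issuing many small matrix products one at a time, the per-batch operands are stacked and handled by a single batched GEMM call. The sketch below uses NumPy's batched `matmul` as a CPU stand-in for a GPU batched GEMM (e.g. a batched cuBLAS call); the sizes and data are hypothetical, not taken from NWChemEx.

```python
import numpy as np

# Hypothetical sizes: a batch of small per-task matrices, as arises when the
# XC integration grid is partitioned into batches of points.
batch, n_basis, n_pts = 64, 32, 100

rng = np.random.default_rng(0)
# Collocation of basis functions on each batch of grid points (made-up data).
phi = rng.standard_normal((batch, n_basis, n_pts))
# A density-matrix block applied to every batch task.
P = rng.standard_normal((n_basis, n_basis))

# One batched level-3 BLAS operation (batched GEMM) over all tasks at once;
# P broadcasts across the leading batch dimension.
X_batched = np.matmul(P, phi)            # shape (batch, n_basis, n_pts)

# Reference: the same result computed one small GEMM at a time.
X_loop = np.stack([P @ phi[b] for b in range(batch)])

assert np.allclose(X_batched, X_loop)
```

The batched form exposes all of the small products to the BLAS at once, which is what lets a GPU keep its execution units busy despite each individual matrix being small.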
Quantum ESPRESSO: a modular and open-source software project for quantum simulations of materials
Quantum ESPRESSO is an integrated suite of computer codes for
electronic-structure calculations and materials modeling, based on
density-functional theory, plane waves, and pseudopotentials (norm-conserving,
ultrasoft, and projector-augmented wave). Quantum ESPRESSO stands for "opEn
Source Package for Research in Electronic Structure, Simulation, and
Optimization". It is freely available to researchers around the world under the
terms of the GNU General Public License. Quantum ESPRESSO builds upon
newly-restructured electronic-structure codes that have been developed and
tested by some of the original authors of novel electronic-structure algorithms
and applied in the last twenty years by some of the leading materials modeling
groups worldwide. Innovation and efficiency are still its main focus, with
special attention paid to massively-parallel architectures, and a great effort
being devoted to user friendliness. Quantum ESPRESSO is evolving towards a
distribution of independent and inter-operable codes in the spirit of an
open-source project, where researchers active in the field of
electronic-structure calculations are encouraged to participate in the project
by contributing their own codes or by implementing their own ideas into
existing codes.
Comment: 36 pages, 5 figures, resubmitted to J. Phys.: Condens. Matter
Development of high performance scientific components for interoperability of computing packages
Three major high performance quantum chemistry computational packages, NWChem, GAMESS, and MPQC, have been developed by different research efforts following different design patterns. The goal is to achieve interoperability among these packages by overcoming the challenges caused by the different communication patterns and software designs of each package. A chemistry algorithm is hard and time consuming to develop; integration of large quantum chemistry packages allows resource sharing and thus avoids reinventing the wheel. Creating connections between these incompatible packages is the major motivation of the proposed work. This interoperability is achieved by bringing the benefits of Component Based Software Engineering through a plug-and-play component framework called the Common Component Architecture (CCA). In this thesis, I present a strategy and process for interfacing two widely used and important computational chemistry methodologies: Quantum Mechanics and Molecular Mechanics. To show the feasibility of the proposed approach, the Tuning and Analysis Utility (TAU) has been coupled with the NWChem code and its CCA components. Results show that the overhead is negligible when compared to the ease and potential of organizing and coping with large-scale software applications.
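The plug-and-play component idea behind frameworks such as CCA can be sketched abstractly: a component declares the interfaces ("ports") it provides, and a framework wires together components from different packages through those interfaces, so a driver never depends on a particular implementation. All class and port names below are hypothetical illustrations, not CCA's actual API.

```python
class EnergyPort:
    """Interface (port) that any energy-evaluating component must provide."""
    def energy(self, geometry):
        raise NotImplementedError

class QMComponent(EnergyPort):
    def energy(self, geometry):
        return 10.0 * len(geometry)      # stand-in for a quantum-mechanics engine

class MMComponent(EnergyPort):
    def energy(self, geometry):
        return 1.0 * len(geometry)       # stand-in for a molecular-mechanics engine

class Framework:
    """Wires provided ports to the components that use them."""
    def __init__(self):
        self.ports = {}
    def register(self, name, component):
        self.ports[name] = component
    def get_port(self, name):
        return self.ports[name]

fw = Framework()
fw.register("qm", QMComponent())
fw.register("mm", MMComponent())

# A QM/MM driver uses both ports without knowing which package implements them:
# the QM region here is the first atom, the MM region the rest.
geometry = ["O", "H", "H"]
total = fw.get_port("qm").energy(geometry[:1]) + fw.get_port("mm").energy(geometry[1:])
```

Because the driver only sees the port interface, either component could be swapped for one from another package (or wrapped with a profiling layer such as TAU) without changing the driver.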
Barrier elision for production parallel programs
Large scientific code bases are often composed of several layers of runtime libraries, implemented in multiple programming languages. In such situations, programmers often choose conservative synchronization patterns, leading to suboptimal performance. In this paper, we present context-sensitive dynamic optimizations that elide barriers which are redundant during the program execution. In our technique, we perform data race detection alongside the program to identify redundant barriers in their calling contexts; after an initial learning phase, we elide all future instances of barriers occurring in the same calling context. We present an automatic on-the-fly optimization and a multi-pass guided optimization. We apply our techniques to NWChem, a 6-million-line computational chemistry code written in C/C++/Fortran that uses several runtime libraries such as Global Arrays, ComEx, DMAPP, and MPI. Our technique elides a surprisingly high fraction of barriers (as many as 63%) in production runs. This redundancy elimination translates to application speedups as high as 14% on 2048 cores. Our techniques also provided valuable insight into the application behavior, later used by NWChem developers. Overall, we demonstrate the value of holistic context-sensitive analyses that consider the domain science in conjunction with the associated runtime software stack.
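The learn-then-elide scheme described above can be sketched as follows: each barrier is keyed by its calling context, a race detector reports whether a race was observed, and once a context has survived a learning phase race-free, later barrier instances in that context are skipped. This is a minimal illustrative sketch with made-up names and a toy learning threshold, not the paper's implementation.

```python
from collections import defaultdict

class BarrierEliser:
    def __init__(self, learn_instances=3):
        self.learn_instances = learn_instances
        self.seen = defaultdict(int)   # calling context -> instances observed
        self.racy = set()              # contexts where a data race was detected
        self.elided = 0

    def barrier(self, context, race_detected):
        """Return True if the (expensive) barrier must actually execute."""
        if race_detected:
            self.racy.add(context)     # never elide a context that has raced
        self.seen[context] += 1
        if context not in self.racy and self.seen[context] > self.learn_instances:
            self.elided += 1           # redundant in this context: skip it
            return False
        return True

eliser = BarrierEliser(learn_instances=2)
# Context A: never races, so its barriers are elided after the learning phase.
executed_a = [eliser.barrier(("solver", "update"), race_detected=False)
              for _ in range(5)]
# Context B: a race is observed once, so its barriers always execute.
executed_b = [eliser.barrier(("io", "flush"), race_detected=(i == 0))
              for i in range(5)]
```

The key design point mirrored here is context sensitivity: the same barrier call site may be redundant from one calling context yet required from another, so the decision is keyed on the context rather than the call site alone.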
Computational Studies in Molecular Geochemistry and Biogeochemistry
The ability to predict the transport and transformations of contaminants within the subsurface is critical for decisions on virtually every waste disposal option facing the Department of Energy (DOE), from remediation technologies such as in situ bioremediation to evaluations of the safety of nuclear waste repositories. With this fact in mind, the DOE has recently sponsored a series of workshops on the development of a Strategic Simulation Plan (SSP) on applications of high performance computing to national problems of significance to the DOE. One of the areas selected for application was subsurface transport and environmental chemistry. Within the SSP on subsurface transport and environmental chemistry, several areas were identified where applications of high performance computing could potentially significantly advance our knowledge of contaminant fate and transport. Within each of these areas, molecular level simulations were specifically identified as a key capability necessary for the development of a fundamental mechanistic understanding of complex biogeochemical processes. This effort consists of a series of specific molecular level simulations and program development in four key areas of geochemistry/biogeochemistry (i.e., aqueous hydrolysis, redox chemistry, mineral surface interactions, and microbial surface properties). By addressing these four different, but computationally related, areas it becomes possible to assemble a team of investigators with the necessary expertise in high performance computing, molecular simulation, and geochemistry/biogeochemistry to make significant progress in each area. The specific targeted geochemical/biogeochemical issues include: microbial surface mediated processes, i.e. the effects of lipopolysaccharides present on gram-negative bacteria; environmental redox chemistry, i.e. dechlorination pathways of carbon tetrachloride and other polychlorinated compounds in the subsurface; mineral surface interactions, i.e. describing surfaces at multiple scales with realistic surface functional groups; and aqueous hydrolysis reactions and solvation of highly charged species, i.e. understanding the formation of polymerized species and ore formation under extreme (Hanford Vadose Zone and geothermal) conditions. By understanding these key issues on a fundamental basis, it is anticipated that the impacts of this research will be extendable to a wide range of biogeochemical issues. Taken in total, such an effort truly represents a “Grand Challenge” in molecular geochemistry and biogeochemistry.
Exploiting variability for energy optimization of parallel programs
In this paper we present optimizations that use DVFS mechanisms to reduce the total energy usage in scientific applications. Our main insight is that noise is intrinsic to large scale parallel executions, and it appears whenever shared resources are contended. The presence of noise allows us to identify and manipulate any program regions amenable to DVFS. Compared to previous energy optimizations that make per-core decisions using predictions of the running time, our scheme uses a qualitative approach to recognize the signature of executions amenable to DVFS. By recognizing the "shape of variability" we can optimize codes with highly dynamic behavior, which pose challenges to all existing DVFS techniques. We validate our approach using offline and online analyses for one-sided and two-sided communication paradigms. We have applied our methods to NWChem, and we show best-case improvements in energy use of 12% at no loss in performance when using online optimizations running on 720 Haswell cores with one-sided communication. With NWChem on MPI two-sided and offline analysis, capturing the initialization, we find energy savings of up to 20%, with less than 1% performance cost.
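The qualitative "shape of variability" idea can be sketched with a simple heuristic: a region whose per-iteration timings spread far above their minimum is likely spending time waiting on contended shared resources, so it contains slack and can tolerate a lower core frequency; a region with tight timings is on the critical path and should be left alone. The threshold, region names, and timing data below are illustrative assumptions, not values from the paper.

```python
import statistics

def dvfs_candidate(timings, spread_threshold=0.25):
    """Flag a region as amenable to DVFS if its relative spread is large,
    i.e. the mean iteration time sits well above the fastest observed one."""
    fastest = min(timings)
    spread = (statistics.mean(timings) - fastest) / fastest
    return spread > spread_threshold

# Per-iteration timings (seconds) for two program regions (made-up data).
compute_region = [1.00, 1.01, 1.02, 1.00, 1.01]   # tight: on the critical path
wait_region    = [0.40, 0.75, 0.55, 0.90, 0.42]   # noisy: waiting on others

candidates = {"compute": dvfs_candidate(compute_region),
              "wait": dvfs_candidate(wait_region)}
```

Classifying regions by the shape of their timing distribution, rather than predicting absolute running times, is what lets this style of scheme handle dynamic behavior where per-core time predictions break down.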