Automatic Energy Saving Schemes for Parallel Applications
Although high-performance computing traditionally focuses on the efficient execution of large-scale applications, both energy and power have become critical concerns when approaching exascale.
Drastic increases in the power consumption of supercomputers significantly affect their operating costs and failure rates. In modern microprocessor architectures, equipped with dynamic voltage and
frequency scaling (DVFS) and CPU clock modulation (throttling),
the power consumption may be controlled in software. Additionally, network interconnects, such as InfiniBand, may be exploited to
maximize energy savings, while the application performance loss and frequency switching overheads must be carefully balanced.
This work first studies two important collective communication operations, all-to-all and allgather, and proposes energy-saving strategies on a per-call basis. Next, it targets point-to-point communications, grouping them into phases and applying frequency scaling to them to save energy by exploiting architectural and communication stalls. Finally, it proposes an automatic runtime system that combines both collective and point-to-point communications into phases and applies throttling to them, in addition to DVFS, to maximize energy savings. Experimental results are presented for NAS parallel benchmark problems as well as for realistic parallel electronic structure calculations performed by the widely used quantum chemistry package GAMESS. Close to the maximum energy savings were obtained with only a small performance loss on the given platform.
Runtime Energy Savings Based on Machine Learning Models for Multicore Applications
To reduce the power consumption of parallel applications at runtime, modern processors provide frequency scaling and power limiting capabilities. In this work, a runtime strategy is proposed to maximize energy savings under a given bound on performance degradation. Machine learning techniques were used to develop performance models that accurately predict performance as the operating core and uncore frequencies change. Experiments performed on a node (28 cores) of a modern computing platform showed significant energy savings of as much as 26% with performance degradation of as low as 5% under the proposed strategy, compared with execution in the unlimited power case.
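The selection step described in this abstract can be illustrated with a minimal sketch. It assumes a performance model has already produced a predicted runtime and average power for each candidate (core, uncore) frequency pair; the function name `pick_config`, the dictionary layout, and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# Hypothetical representation: (core GHz, uncore GHz)
Config = Tuple[float, float]


def pick_config(
    predictions: Dict[Config, Tuple[float, float]],
    baseline: Config,
    max_slowdown: float = 0.05,
) -> Config:
    """Return the lowest-energy configuration within the slowdown bound.

    predictions maps each configuration to (predicted_time_s, predicted_power_w).
    These values stand in for the output of a learned performance model.
    """
    base_time, base_power = predictions[baseline]
    best, best_energy = baseline, base_time * base_power
    for cfg, (time_s, power_w) in predictions.items():
        # Reject configurations that violate the performance bound.
        if time_s > base_time * (1.0 + max_slowdown):
            continue
        energy = time_s * power_w  # energy = time x average power
        if energy < best_energy:
            best, best_energy = cfg, energy
    return best
```

With illustrative predictions `{(2.4, 2.4): (10.0, 100.0), (2.0, 2.0): (10.4, 80.0), (1.6, 1.6): (12.0, 60.0)}` and a 5% bound, the slowest pair is rejected even though its energy is lowest, and the middle pair is chosen.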
The 1999 Center for Simulation of Dynamic Response in Materials Annual Technical Report
Introduction:
This annual report describes research accomplishments for FY 99 of the Center
for Simulation of Dynamic Response of Materials. The Center is constructing a
virtual shock physics facility in which the full three dimensional response of a
variety of target materials can be computed for a wide range of compressive,
tensional, and shear loadings, including those produced by detonation of energetic
materials. The goals are to facilitate computation of a variety of experiments
in which strong shock and detonation waves are made to impinge on targets
consisting of various combinations of materials, compute the subsequent
dynamic response of the target materials, and validate these computations against
experimental data.
An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor
Modern OpenMP threading techniques are used to convert the MPI-only
Hartree-Fock code in the GAMESS program to a hybrid MPI/OpenMP algorithm. Two
separate implementations are considered, which differ in whether key data
structures (the density and Fock matrices) are shared or replicated among threads. All
implementations are benchmarked on a supercomputer of 3,000 Intel Xeon Phi
processors. With 64 cores per processor, scaling numbers are reported on up to
192,000 cores. The hybrid MPI/OpenMP implementation reduces the memory
footprint by approximately 200 times compared to the legacy code. The
MPI/OpenMP code was shown to run up to six times faster than the original for a
range of molecular system sizes. Comment: SC17 conference paper, 12 pages, 7 figures
SO(5) Theory of Antiferromagnetism and Superconductivity
Antiferromagnetism and superconductivity are both fundamental and common
states of matter. In many strongly correlated systems, including the high Tc
cuprates, the heavy fermion compounds and the organic superconductors, they
occur next to each other in the phase diagram and influence each other's
physical properties. The SO(5) theory unifies these two basic states of matter
by a symmetry principle and describes their rich phenomenology through a single
low energy effective model. In this paper, we review the framework of the SO(5)
theory, and its detailed comparison with numerical and experimental results. Comment: Review article. 81 pages
Dynamic Energy Management for Chip Multi-processors under Performance Constraints
We introduce a novel algorithm for dynamic energy management (DEM) under performance constraints in chip multi-processors (CMPs). Using the novel concept of delayed instructions count, performance loss estimates are calculated at the end of each control period for each core. In addition, a Kalman-filtering-based approach is employed to predict the workload in the next control period, for which voltage-frequency pairs must be selected. This selection is done with a novel dynamic voltage and frequency scaling (DVFS) algorithm whose objective is to reduce energy consumption without degrading performance beyond the user-set threshold. Using our customized Sniper-based CMP system simulation framework, we demonstrate the effectiveness of the proposed algorithm on a variety of benchmarks for 16-core and 64-core network-on-chip-based CMP architectures. Simulation results show consistent energy savings across the board. We present our work as an investigation of the tradeoff between the energy reduction achievable via DVFS, with predictions made by the Kalman filter, and the performance penalty threshold.
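The predict-then-select loop described in this abstract can be sketched in a simplified form. The code below is not the authors' algorithm: it assumes a scalar random-walk workload model for the Kalman filter, replaces the delayed instructions count with a plain cycles-per-period estimate, and uses hypothetical names (`ScalarKalman`, `choose_frequency`) and noise parameters.

```python
from typing import Sequence


class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model (assumed)."""

    def __init__(self, x0: float, p0: float = 1.0, q: float = 0.01, r: float = 0.1):
        self.x = x0  # workload estimate (e.g. cycles per control period)
        self.p = p0  # estimate variance
        self.q = q   # process noise variance (assumed value)
        self.r = r   # measurement noise variance (assumed value)

    def predict(self) -> float:
        # Random walk: the predicted workload equals the current estimate,
        # while the uncertainty grows by the process noise.
        self.p += self.q
        return self.x

    def update(self, measured: float) -> float:
        # Blend the prediction with the measurement using the Kalman gain.
        gain = self.p / (self.p + self.r)
        self.x += gain * (measured - self.x)
        self.p *= 1.0 - gain
        return self.x


def choose_frequency(
    predicted_cycles: float,
    freqs_hz: Sequence[float],
    period_s: float,
    max_loss: float,
) -> float:
    """Pick the lowest frequency that finishes the predicted work within
    the control period stretched by the allowed performance loss."""
    for freq in sorted(freqs_hz):
        if predicted_cycles / freq <= period_s * (1.0 + max_loss):
            return freq
    return max(freqs_hz)  # nothing meets the bound: run as fast as possible
```

In each control period, a controller built on this sketch would call `predict()`, pass the result to `choose_frequency`, and then call `update()` with the workload actually measured in that period.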