46,419 research outputs found
Computing server power modeling in a data center: survey,taxonomy and performance evaluation
Data centers are large scale, energy-hungry infrastructure serving the
increasing computational demands as the world is becoming more connected in
smart cities. The emergence of advanced technologies such as cloud-based
services, internet of things (IoT) and big data analytics has augmented the
growth of global data centers, leading to high energy consumption. This upsurge
in energy consumption of the data centers not only incurs the issue of surging
high cost (operational and maintenance) but also has an adverse effect on the
environment. Dynamic power management in a data center environment requires the
cognizance of the correlation between the system and hardware level performance
counters and the power consumption. Power consumption modeling exhibits this
correlation and is crucial in designing energy-efficient optimization
strategies based on resource utilization. Several works in power modeling are
proposed and used in the literature. However, these power models have been
evaluated using different benchmarking applications, power measurement
techniques and error calculation formula on different machines. In this work,
we present a taxonomy and evaluation of 24 software-based power models using a
unified environment, benchmarking applications, power measurement technique and
error formula, with the aim of achieving an objective comparison. We use
different servers architectures to assess the impact of heterogeneity on the
models' comparison. The performance analysis of these models is elaborated in
the paper
A Holistic Approach to Log Data Analysis in High-Performance Computing Systems: The Case of IBM Blue Gene/Q
The complexity and cost of managing high-performance computing infrastructures are on the rise. Automating management and repair through predictive models to minimize human interventions is an attempt to increase system availability and contain these costs. Building predictive models that are accurate enough to be useful in automatic management cannot be based on restricted log data from subsystems but requires a holistic approach to data analysis from disparate sources. Here we provide a detailed multi-scale characterization study based on four datasets reporting power consumption, temperature, workload, and hardware/software events for an IBM Blue Gene/Q installation.We show that the system runs a rich parallel workload, with low correlation among its components in terms of temperature and power, but higher correlation in terms of events. As expected, power and temperature correlate strongly, while events display negative correlations with load and power. Power and workload show moderate correlations, and only at the scale of components. The aim of the study is a systematic, integrated characterization of the computing infrastructure and discovery of correlation sources and levels to serve as basis for future predictive modeling efforts
Iso-energy-efficiency: An approach to power-constrained parallel computation
Future large scale high performance supercomputer systems require high energy efficiency to achieve exaflops computational power and beyond. Despite the need to understand energy efficiency in high-performance systems, there are few techniques to evaluate energy efficiency at scale. In this paper, we propose a system-level iso-energy-efficiency model to analyze, evaluate and predict energy-performance of data intensive parallel applications with various execution patterns running on large scale power-aware clusters. Our analytical model can help users explore the effects of machine and application dependent characteristics on system energy efficiency and isolate efficient ways to scale system parameters (e.g. processor count, CPU power/frequency, workload size and network bandwidth) to balance energy use and performance. We derive our iso-energy-efficiency model and apply it to the NAS Parallel Benchmarks on two power-aware clusters. Our results indicate that the model accurately predicts total system energy consumption within 5% error on average for parallel applications with various execution and communication patterns. We demonstrate effective use of the model for various application contexts and in scalability decision-making
- …