Power efficient job scheduling by predicting the impact of processor manufacturing variability
Modern CPUs exhibit performance and power-consumption variability introduced by the manufacturing process. Systems that do not account for this variability suffer performance degradation and waste power. To avoid this negative impact, users and system administrators must actively counteract manufacturing variability.
In this work we show that parallel systems benefit from taking the consequences of manufacturing variability into account when making decisions at the job-scheduler level. We also show that the impact of this variability on specific applications can be predicted using variability-aware power prediction models. Based on these power models, we propose two job scheduling policies that consider the effects of manufacturing variability on each application and that ensure that power consumption stays under a system-wide power budget. We evaluate our policies under different power budgets and traffic scenarios, consisting of both single- and multi-node parallel applications and utilizing up to 4096 cores in total. Compared to contemporary scheduling policies used on production clusters, our policies decrease job turnaround time by up to 31% while saving up to 5.5% energy.
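The scheduling idea in the abstract can be illustrated with a toy sketch. The greedy rule, the per-node power predictions, and every name below are illustrative assumptions, not the paper's actual models or policies: jobs are placed on the nodes predicted to be most power-efficient, and a placement is deferred when it would push the system over the power budget.

```python
# Toy sketch of a variability-aware, power-capped scheduling policy.
# The per-node power predictions and the greedy rule are illustrative
# assumptions, not the policies proposed in the paper.

def schedule_under_budget(jobs, node_power, power_budget):
    """Place jobs on the nodes predicted to draw the least power,
    deferring any job whose placement would exceed the power budget.

    jobs:        list of (job_id, power_scale), where power_scale is the
                 job's predicted power draw relative to the node baseline
    node_power:  node_id -> predicted baseline power (W); values differ
                 across nominally identical nodes due to manufacturing
                 variability
    """
    free = sorted(node_power, key=node_power.get)  # most efficient first
    placement, used = {}, 0.0
    for job_id, scale in jobs:
        if not free:
            break
        cost = node_power[free[0]] * scale
        if used + cost <= power_budget:
            placement[job_id] = free.pop(0)
            used += cost
        # otherwise the job stays queued for a later scheduling pass
    return placement, used
```

With three nominally identical nodes whose predicted baseline power differs (90 W, 100 W, 120 W) and a 250 W budget, the first two jobs land on the two most efficient nodes and the third is deferred.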
Limits on Fundamental Limits to Computation
An indispensable part of our lives, computing has also become essential to
industries and governments. Steady improvements in computer hardware have been
supported by periodic doubling of transistor densities in integrated circuits
over the last fifty years. Such Moore scaling now requires increasingly heroic
efforts, stimulating research in alternative hardware and stirring controversy.
To help evaluate emerging technologies and enrich our understanding of
integrated-circuit scaling, we review fundamental limits to computation: in
manufacturing, energy, physical space, design and verification effort, and
algorithms. To outline what is achievable in principle and in practice, we
recall how some limits were circumvented and compare loose and tight limits. We
also point out that engineering difficulties encountered by emerging
technologies may indicate yet-unknown limits.
Comment: 15 pages, 4 figures, 1 table
On the accuracy and usefulness of analytic energy models for contemporary multicore processors
This paper presents refinements to the execution-cache-memory performance
model and a previously published power model for multicore processors. The
combination of both enables a very accurate prediction of performance and
energy consumption of contemporary multicore processors as a function of
relevant parameters such as number of active cores as well as core and Uncore
frequencies. Model validation is performed on the Sandy Bridge-EP and
Broadwell-EP microarchitectures. Production-related variations in chip quality
are demonstrated through a statistical analysis of the fit parameters obtained
on one hundred Broadwell-EP CPUs of the same model. Insights from the models
are used to explain the performance- and energy-related behavior of the
processors for scalable as well as saturating (i.e., memory-bound) codes. In
the process we demonstrate the models' capability to identify optimal operating
points with respect to highest performance, lowest energy-to-solution, and
lowest energy-delay product, and identify a set of best practices for
energy-efficient execution.
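The notion of frequency-dependent optimal operating points can be made concrete with a toy model. The runtime and power laws below (runtime t = work/f for a scalable code, package power p = p_static + a·f³) and all constants are simplified assumptions standing in for the paper's fitted execution-cache-memory and power models:

```python
# Toy illustration of optimal operating points; the runtime and power
# models and their constants are simplified assumptions, not the fitted
# models from the paper.

def operating_points(freqs_ghz, work=1e9, p_static=30.0, a=3.0):
    """For a scalable (core-bound) code assume runtime t = work/f and
    package power p = p_static + a*f**3 with f in GHz. Return the
    frequencies minimizing energy-to-solution E = p*t and the
    energy-delay product EDP = E*t."""
    def energy(f):
        t = work / (f * 1e9)
        return (p_static + a * f**3) * t

    def edp(f):
        return energy(f) * work / (f * 1e9)

    return min(freqs_ghz, key=energy), min(freqs_ghz, key=edp)
```

Because the energy-delay product also penalizes runtime, it typically selects a higher clock frequency than plain energy-to-solution does.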
Empirical characterization and modeling of power consumption and energy aware scheduling in data centers
Energy-efficient management is key in modern data centers in order to reduce
operational cost and environmental impact. Energy management and renewable
energy utilization are strategies for optimizing energy consumption in
high-performance computing. In either case, understanding the power-consumption
behavior of the physical servers in a data center is fundamental to implementing
energy-aware policies effectively. These policies must also deal with possible
performance degradation of applications in order to ensure quality of service.
This thesis presents an empirical evaluation of power consumption for scientific
computing applications in multicore systems. Three types of applications
are studied, in single and combined executions on Intel and AMD servers, to
evaluate the overall power consumption of each application. The main results
indicate that power-consumption behavior depends strongly on the type of
application. Additional performance analysis shows that the most energy-efficient
server load depends on the type of applications involved, with efficiency
decreasing under heavy load. These results
allow formulating models to characterize applications according to power consumption,
efficiency, and resource sharing, which provide useful information
for resource management and scheduling policies. Several scheduling strategies
are evaluated using the proposed energy model over realistic scientific computing
workloads. Results confirm that strategies that maximize host utilization
provide the best energy efficiency.
Agencia Nacional de Investigación e Innovación, FSE_1_2017_1_14478
Approximate Inference for Constructing Astronomical Catalogs from Images
We present a new, fully generative model for constructing astronomical
catalogs from optical telescope image sets. Each pixel intensity is treated as
a random variable with parameters that depend on the latent properties of stars
and galaxies. These latent properties are themselves modeled as random. We
compare two procedures for posterior inference. One procedure is based on
Markov chain Monte Carlo (MCMC) while the other is based on variational
inference (VI). The MCMC procedure excels at quantifying uncertainty, while the
VI procedure is 1000 times faster. On a supercomputer, the VI procedure
efficiently uses 665,000 CPU cores to construct an astronomical catalog from 50
terabytes of images in 14.6 minutes, demonstrating the scaling characteristics
necessary to construct catalogs for upcoming astronomical surveys.
Comment: accepted to the Annals of Applied Statistics
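The generative idea — each pixel's expected intensity is sky background plus point-spread-function flux from latent sources — can be sketched as follows. The circular-Gaussian PSF, the parameter names, and all constants are assumptions for illustration; the paper's model is considerably richer (it also models galaxies, for instance):

```python
import math

# Minimal sketch of a generative image model: expected pixel intensity is
# a constant sky background plus Gaussian-PSF flux from latent point
# sources. The parameterization is an illustrative assumption.

def render_expected_image(sources, width, height, background=10.0):
    """sources: list of (x, y, flux, psf_sigma). Returns the expected
    intensity of each pixel; observed counts would then be modeled as,
    e.g., Poisson draws around these means."""
    img = [[background] * width for _ in range(height)]
    for sx, sy, flux, sigma in sources:
        norm = flux / (2.0 * math.pi * sigma**2)
        for y in range(height):
            for x in range(width):
                d2 = (x - sx)**2 + (y - sy)**2
                img[y][x] += norm * math.exp(-d2 / (2.0 * sigma**2))
    return img
```

Posterior inference then runs in the opposite direction: given observed pixel counts, MCMC or variational inference recovers the latent source properties (x, y, flux, ...).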
A Comparison of Parallel Graph Processing Implementations
The rapidly growing number of large network analysis problems has led to the
emergence of many parallel and distributed graph processing systems---one
survey in 2014 identified over 80. Since then, the landscape has evolved; some
packages have become inactive while more are being developed. Determining the
best approach for a given problem is infeasible for most developers. To enable
easy, rigorous, and repeatable comparison of the capabilities of such systems,
we present an approach and associated software for analyzing the performance
and scalability of parallel, open-source graph libraries. We demonstrate our
approach on five graph processing packages: GraphMat, the Graph500, the Graph
Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph using synthetic
and real-world datasets. We examine previously overlooked aspects of parallel
graph processing performance such as phases of execution and energy usage for
three algorithms: breadth first search, single source shortest paths, and
PageRank, and compare our results to Graphalytics.
Comment: 10 pages, 10 figures. Submitted to EuroPar 2017 and rejected; revised
and submitted to IEEE Cluster 2017
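As a reference point for one of the benchmarked algorithms, breadth-first search is typically implemented level-synchronously, and each level expansion is one of the execution phases such analyses measure. A minimal serial sketch (names are illustrative, not from any of the benchmarked packages):

```python
# Minimal serial sketch of level-synchronous BFS, the formulation most
# parallel graph frameworks implement; each while-iteration is one
# "phase" that expands the current frontier by one level.

def bfs_levels(adj, source):
    """adj: node -> iterable of neighbors. Returns node -> BFS depth."""
    depth = {source: 0}
    frontier = [source]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in depth:       # first visit fixes the depth
                    depth[v] = depth[u] + 1
                    next_frontier.append(v)
        frontier = next_frontier
    return depth
```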