A Survey of Prediction and Classification Techniques in Multicore Processor Systems
In multicore processor systems, being able to accurately predict the future provides new optimization opportunities that could not otherwise be exploited. For example, an oracle able to predict the behavior of a certain application running on a smartphone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling modes that would guarantee minimum levels of desired performance while saving energy and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction has evolved from simple forecasting to sophisticated machine-learning-based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that span all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques for prediction and classification in the general context of computing systems, with emphasis on multicore processors. The paper is far from comprehensive, but it will help the reader interested in employing prediction in the optimization of multicore processor systems.
Towards Energy-Proportional Computing for Enterprise-Class Server Workloads
Massive data centers housing thousands of computing nodes
have become commonplace in enterprise computing, and the
power consumption of such data centers is growing at an
unprecedented rate. Adding to the problem is the inability
of the servers to exhibit energy proportionality, i.e., provide
energy-efficient execution under all levels of utilization,
which diminishes the overall energy efficiency of the data
center. It is imperative that we realize effective strategies
to control the power consumption of the server and improve
the energy efficiency of data centers. With the advent of
Intel Sandy Bridge processors, we have the ability to specify
a limit on power consumption during runtime, which creates
opportunities to design new power-management techniques
for enterprise workloads and make the systems that they run
on more energy-proportional.
In this paper, we investigate whether it is possible to achieve
energy proportionality for an enterprise-class server workload,
namely the SPECpower_ssj2008 benchmark, by using Intel's
Running Average Power Limit (RAPL) interfaces. First,
we analyze the power consumption and characterize the instantaneous
power profile of the SPECpower benchmark at
the subsystem level using the on-chip energy meters exposed
via the RAPL interfaces. We then analyze the impact of
RAPL power limiting on the performance, per-transaction
response time, power consumption, and energy efficiency of
the benchmark under different load levels. Our observations
and results shed light on the efficacy of the RAPL interfaces
and provide guidance for designing power-management techniques
for enterprise-class workloads.
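The abstract above describes characterizing power from on-chip energy meters. As a rough illustration only (not the authors' tooling), the following sketch shows how average power could be derived from two readings of a cumulative energy counter. The function name, the microjoule units, the `wrap_uj` parameter, and the sample values are all assumptions made for this example, and a single counter wraparound between readings is assumed at most.

```python
def average_power_watts(e_start_uj, e_end_uj, elapsed_s, wrap_uj):
    """Average power between two cumulative energy readings.

    e_start_uj, e_end_uj: hypothetical counter readings in microjoules.
    elapsed_s: time between the two readings in seconds.
    wrap_uj: assumed value at which the counter wraps back to zero.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    delta_uj = e_end_uj - e_start_uj
    if delta_uj < 0:
        # The counter wrapped once between the two readings.
        delta_uj += wrap_uj
    # Convert microjoules to joules, then divide by time to get watts.
    return delta_uj / 1e6 / elapsed_s


# Hypothetical readings taken 2 seconds apart.
power = average_power_watts(1_000_000, 3_000_000, 2.0, 10_000_000)
```

Sampling such a counter at a fixed interval and applying this calculation to consecutive readings would give a coarse power profile over time.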
Lost in translation: Exposing hidden compiler optimization opportunities
Existing iterative compilation and machine-learning-based optimization
techniques have been proven very successful in achieving better optimizations
than the standard optimization levels of a compiler. However, they were not
engineered to support the tuning of a compiler's optimizer as part of the
compiler's daily development cycle. In this paper, we first establish the
required properties which a technique must exhibit to enable such tuning. We
then introduce an enhancement to the classic nightly routine testing of
compilers which exhibits all the required properties, and thus, is capable of
driving the improvement and tuning of the compiler's common optimizer. This is
achieved by leveraging resource usage and compilation information collected
while systematically exploiting prefixes of the transformations applied at
standard optimization levels. Experimental evaluation using the LLVM v6.0.1
compiler demonstrated that the new approach was able to reveal hidden
cross-architecture and architecture-dependent potential optimizations on two
popular processors: the Intel i5-6300U and the Arm Cortex-A53-based Broadcom
BCM2837 used in the Raspberry Pi 3B+. As a case study, we demonstrate how the
insights from our approach enabled us to identify and remove a significant
shortcoming of the CFG simplification pass of the LLVM v6.0.1 compiler.
Comment: 31 pages, 7 figures, 2 tables. arXiv admin note: text overlap with
arXiv:1802.0984
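The abstract above mentions systematically exploiting prefixes of the transformations applied at a standard optimization level. A minimal sketch of that enumeration step, assuming the pipeline is available as an ordered list of pass names, might look like the following. The function name and the three-pass list are hypothetical and do not reproduce any actual LLVM optimization level or the authors' implementation.

```python
def pass_prefixes(passes):
    """Return every prefix of an ordered pass list, from empty to full."""
    return [passes[:n] for n in range(len(passes) + 1)]


# Illustrative pipeline only; not the real pass order of any -O level.
pipeline = ["simplifycfg", "sroa", "instcombine"]

for prefix in pass_prefixes(pipeline):
    # Each prefix would be compiled and measured separately.
    print(prefix)
```

Comparing the measurements taken for consecutive prefixes would then attribute a change in resource usage to the single pass that was added.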