6,934 research outputs found
Dynamic cache reconfiguration based techniques for improving cache energy efficiency
Modern multicore processors are employing large last-level caches, for
example Intel's E7-8800 processor uses 24MB L3 cache. Further, with each CMOS
technology generation, leakage energy has been dramatically increasing and
hence, leakage energy is expected to become a major source of energy
dissipation, especially in last-level caches (LLCs). The conventional schemes
of cache energy saving either aim at saving dynamic energy or are based on
properties specific to first-level caches, and thus these schemes have limited
utility for last-level caches. Further, several other techniques require
offline profiling or per-application tuning and hence are not suitable for
product systems. In this research, we propose novel cache leakage energy saving
schemes for single-core and multicore systems; desktop, QoS, real-time and
server systems. We propose software-controlled, hardware-assisted techniques
which use dynamic cache reconfiguration to configure the cache to the most
energy efficient configuration while keeping the performance loss bounded. To
profile and test a large number of potential configurations, we utilize
low-overhead, micro-architecture components, which can be easily integrated
into modern processor chips. We adopt a system-wide approach to save energy to
ensure that cache reconfiguration does not increase energy consumption of other
components of the processor. We have compared our techniques with the
state-of-art techniques and have found that our techniques outperform them in
their energy efficiency. This research has important applications in improving
energy-efficiency of higher-end embedded, desktop, server processors and
multitasking systems. We have also proposed performance estimation approach for
efficient design space exploration and have implemented time-sampling based
simulation acceleration approach for full-system architectural simulators.Comment: PhD thesis, dynamic cache reconfiguratio
A review of High Performance Computing foundations for scientists
The increase of existing computational capabilities has made simulation
emerge as a third discipline of Science, lying midway between experimental and
purely theoretical branches [1, 2]. Simulation enables the evaluation of
quantities which otherwise would not be accessible, helps to improve
experiments and provides new insights on systems which are analysed [3-6].
Knowing the fundamentals of computation can be very useful for scientists, for
it can help them to improve the performance of their theoretical models and
simulations. This review includes some technical essentials that can be useful
to this end, and it is devised as a complement for researchers whose education
is focused on scientific issues and not on technological respects. In this
document we attempt to discuss the fundamentals of High Performance Computing
(HPC) [7] in a way which is easy to understand without much previous
background. We sketch the way standard computers and supercomputers work, as
well as discuss distributed computing and discuss essential aspects to take
into account when running scientific calculations in computers.Comment: 33 page
PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications
Energy efficiency is a major concern in modern high-performance computing system design. In the past few years, there has been mounting evidence that power usage limits system scale and computing density, and thus, ultimately system performance. However, despite the impact of power and energy on the computer systems community, few studies provide insight to where and how power is consumed on high-performance systems and applications. In previous work, we designed a framework called PowerPack that was the first tool to isolate the power consumption of devices including disks, memory, NICs, and processors in a high-performance cluster and correlate these measurements to application functions. In this work, we extend our framework to support systems with multicore, multiprocessor-based nodes, and then provide in-depth analyses of the energy consumption of parallel applications on clusters of these systems. These analyses include the impacts of chip multiprocessing on power and energy efficiency, and its interaction with application executions. In addition, we use PowerPack to study the power dynamics and energy efficiencies of dynamic voltage and frequency scaling (DVFS) techniques on clusters. Our experiments reveal conclusively how intelligent DVFS scheduling can enhance system energy efficiency while maintaining performance
Exploring performance and power properties of modern multicore chips via simple machine models
Modern multicore chips show complex behavior with respect to performance and
power. Starting with the Intel Sandy Bridge processor, it has become possible
to directly measure the power dissipation of a CPU chip and correlate this data
with the performance properties of the running code. Going beyond a simple
bottleneck analysis, we employ the recently published Execution-Cache-Memory
(ECM) model to describe the single- and multi-core performance of streaming
kernels. The model refines the well-known roofline model, since it can predict
the scaling and the saturation behavior of bandwidth-limited loop kernels on a
multicore chip. The saturation point is especially relevant for considerations
of energy consumption. From power dissipation measurements of benchmark
programs with vastly different requirements to the hardware, we derive a
simple, phenomenological power model for the Sandy Bridge processor. Together
with the ECM model, we are able to explain many peculiarities in the
performance and power behavior of multicore processors, and derive guidelines
for energy-efficient execution of parallel programs. Finally, we show that the
ECM and power models can be successfully used to describe the scaling and power
behavior of a lattice-Boltzmann flow solver code.Comment: 23 pages, 10 figures. Typos corrected, DOI adde
High-dimensional decoy-state quantum key distribution over 0.3 km of multicore telecommunication optical fibers
Multiplexing is a strategy to augment the transmission capacity of a
communication system. It consists of combining multiple signals over the same
data channel and it has been very successful in classical communications.
However, the use of enhanced channels has only reached limited practicality in
quantum communications (QC) as it requires the complex manipulation of quantum
systems of higher dimensions. Considerable effort is being made towards QC
using high-dimensional quantum systems encoded into the transverse momentum of
single photons but, so far, no approach has been proven to be fully compatible
with the existing telecommunication infrastructure. Here, we overcome such a
technological challenge and demonstrate a stable and secure high-dimensional
decoy-state quantum key distribution session over a 0.3 km long multicore
optical fiber. The high-dimensional quantum states are defined in terms of the
multiple core modes available for the photon transmission over the fiber, and
the decoy-state analysis demonstrates that our technique enables a positive
secret key generation rate up to 25 km of fiber propagation. Finally, we show
how our results build up towards a high-dimensional quantum network composed of
free-space and fiber based linksComment: Please see the complementary work arXiv:1610.01812 (2016
A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems
Recent technological advances have greatly improved the performance and
features of embedded systems. With the number of just mobile devices now
reaching nearly equal to the population of earth, embedded systems have truly
become ubiquitous. These trends, however, have also made the task of managing
their power consumption extremely challenging. In recent years, several
techniques have been proposed to address this issue. In this paper, we survey
the techniques for managing power consumption of embedded systems. We discuss
the need of power management and provide a classification of the techniques on
several important parameters to highlight their similarities and differences.
This paper is intended to help the researchers and application-developers in
gaining insights into the working of power management techniques and designing
even more efficient high-performance embedded systems of tomorrow
- …