HALLS: An Energy-Efficient Highly Adaptable Last Level STT-RAM Cache for Multicore Systems
Spin-Transfer Torque RAM (STT-RAM) is widely considered a promising
alternative to SRAM in the memory hierarchy due to STT-RAM's non-volatility,
low leakage power, high density, and fast read speed. The STT-RAM's small
feature size is particularly desirable for the last-level cache (LLC), which
typically consumes a large area of silicon die. However, long write latency and
high write energy remain challenges to implementing STT-RAM in CPU caches.
An increasingly popular method for addressing these challenges involves
trading off non-volatility for reduced write latency and write energy by
relaxing the STT-RAM's data retention time. However, in order to maximize
energy saving potential, the cache configurations, including STT-RAM's
retention time, must be dynamically adapted to executing applications' variable
memory needs. In this paper, we propose a highly adaptable last level STT-RAM
cache (HALLS) that allows the LLC configurations and retention time to be
adapted to applications' runtime execution requirements. We also propose
low-overhead runtime tuning algorithms to dynamically determine the best
(lowest energy) cache configurations and retention times for executing
applications. Compared to prior work, HALLS reduced the average energy
consumption by 60.57% in a quad-core system, while introducing marginal latency
overhead.
Comment: To appear in IEEE Transactions on Computers (TC)
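To make the idea of picking the "best (lowest energy)" configuration concrete, the following is a minimal sketch of a brute-force search over a small design space. It is not the HALLS tuning algorithm, which the abstract describes as low-overhead; exhaustive search is used here only as the simplest baseline. The design-space values, the `toy_energy` model, and all coefficients are assumptions made up for illustration.

```python
from itertools import product

# Hypothetical design space; these values are illustrative, not from the paper.
SIZES_KB = [512, 1024, 2048]
ASSOCIATIVITIES = [4, 8, 16]
RETENTION_MS = [10, 100, 1000]


def toy_energy(size_kb, assoc, retention_ms, misses_per_size, writes):
    """Toy energy model (an assumption, not a measured model).

    Miss energy falls as the cache grows, write energy grows with
    retention time, refresh energy grows as retention time shrinks,
    and leakage grows with size and associativity.
    """
    miss_energy = misses_per_size[size_kb] * 5.0
    write_energy = writes * (0.1 + 0.001 * retention_ms)
    refresh_energy = 100.0 / retention_ms
    leakage = 0.01 * size_kb + 0.5 * assoc
    return miss_energy + write_energy + refresh_energy + leakage


def best_configuration(misses_per_size, writes):
    """Evaluate every (size, associativity, retention) tuple; return the cheapest."""
    candidates = product(SIZES_KB, ASSOCIATIVITIES, RETENTION_MS)
    return min(candidates, key=lambda c: toy_energy(*c, misses_per_size, writes))


if __name__ == "__main__":
    # Hypothetical per-size miss counts for one application.
    misses = {512: 100, 1024: 10, 2048: 9}
    print(best_configuration(misses, writes=50))
```

With these made-up numbers, the search settles on a mid-sized cache and a mid-range retention time, because the largest cache adds more leakage than it saves in misses and the longest retention time makes writes too expensive. A practical runtime tuner would need to avoid evaluating every configuration.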
A Survey of Phase Classification Techniques for Characterizing Variable Application Behavior
Adaptable computing is an increasingly important paradigm that specializes
system resources to variable application requirements, environmental
conditions, or user requirements. Adapting computing resources to variable
application requirements (or application phases) is otherwise known as
phase-based optimization. Phase-based optimization takes advantage of
application phases, or execution intervals of an application, that behave
similarly, to enable effective and beneficial adaptability. In order for
phase-based optimization to be effective, the phases must first be classified
to determine when application phases begin and end, and ensure that system
resources are accurately specialized. In this paper, we present a survey of
phase classification techniques that have been proposed to exploit the
advantages of adaptable computing through phase-based optimization. We focus on
recent techniques and classify these techniques with respect to several factors
in order to highlight their similarities and differences. We divide the
techniques by their major defining characteristics---online/offline and
serial/parallel. In addition, we discuss other characteristics such as
prediction and detection techniques, the characteristics used for prediction,
interval type, etc. We also identify gaps in the state-of-the-art and discuss
future research directions to enable and fully exploit the benefits of
adaptable computing.
Comment: To appear in IEEE Transactions on Parallel and Distributed Systems (TPDS)
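As a rough illustration of what online phase classification can involve, the sketch below assigns each execution interval to a phase by comparing a normalized feature vector against the signatures of phases seen so far. It is a generic leader-follower scheme using Manhattan distance, not a specific technique taken from the survey; the feature vectors, the `threshold` value, and the function name are assumptions.

```python
def classify_intervals(intervals, threshold=0.2):
    """Label each interval with a phase ID.

    `intervals` is a list of per-interval feature vectors (for example,
    hypothetical counts of events per code region). An interval joins the
    closest known phase if the Manhattan distance between normalized
    vectors is within `threshold`; otherwise it starts a new phase.
    """
    signatures = []  # one normalized vector per discovered phase
    labels = []
    for vec in intervals:
        total = sum(vec) or 1  # guard against all-zero intervals
        norm = [v / total for v in vec]

        best_id, best_dist = None, None
        for phase_id, sig in enumerate(signatures):
            dist = sum(abs(a - b) for a, b in zip(norm, sig))
            if best_dist is None or dist < best_dist:
                best_id, best_dist = phase_id, dist

        if best_id is not None and best_dist <= threshold:
            labels.append(best_id)
        else:
            signatures.append(norm)
            labels.append(len(signatures) - 1)
    return labels


if __name__ == "__main__":
    # Five hypothetical intervals with two features each.
    print(classify_intervals([[90, 10], [88, 12], [10, 90], [12, 88], [91, 9]]))
```

A phase boundary is then any position where consecutive labels differ, which is the point at which an adaptable system would consider respecializing its resources.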
MirrorCache: An Energy-Efficient Relaxed Retention L1 STTRAM Cache
Spin-Transfer Torque RAM (STTRAM) is a promising alternative to SRAM in on-chip caches, due to several advantages, including non-volatility, low leakage, high integration density, and CMOS compatibility. However, STTRAM's wide adoption in resource-constrained systems is impeded, in part, by high write energy and latency. A popular approach to mitigating these overheads involves relaxing the STTRAM's retention time in order to reduce the write latency and energy. However, this approach usually requires a dynamic refresh scheme to maintain cache blocks' data integrity beyond the retention time, and typically requires an external refresh buffer. In this paper, we propose MirrorCache, an energy-efficient, buffer-free refresh scheme. MirrorCache leverages the STTRAM cell's compact feature size and uses an auxiliary segment of the same size as the logical cache to handle the refresh operations without the overheads of an external refresh buffer. Our experiments show that, compared to prior work, MirrorCache can reduce the average cache energy by at least 39.7% for a variety of systems.
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries.