6 research outputs found
Analysis and Design of Management Techniques for Way-Adaptable L2 Caches
A Study on Hardware/Software-Cooperative Cache Memory Systems for High-Energy-Efficiency Microprocessors
Tohoku University, Hiroaki Kobayashi
Way-Adaptable D-NUCA Caches
Abstract: Non-uniform cache architecture (NUCA) aims to limit the wire-delay problem typical
of large on-chip last-level caches. By partitioning a large cache into several banks, each with a
latency that depends on its physical location, and by employing a scalable on-chip network to
interconnect the banks with the cache controller, the average access latency can be reduced
with respect to a traditional cache. Adding a migration mechanism that moves the most
frequently accessed data toward the cache controller (D-NUCA) further improves the average
access latency.
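The migration idea described above can be illustrated with a small sketch. This is not the paper's implementation: the bank fill policy, the one-step "gradual promotion", and all names (`Bank`, `access`) are illustrative assumptions.

```python
# Hypothetical sketch of D-NUCA migration: banks are ordered by distance
# from the cache controller; on each hit, the block moves one bank closer,
# so frequently accessed blocks drift toward the low-latency banks.

class Bank:
    def __init__(self, latency):
        self.latency = latency      # access latency grows with distance
        self.blocks = set()         # block addresses resident in this bank

def access(banks, addr):
    """Return the access latency; promote the block one bank closer on a hit."""
    for i, bank in enumerate(banks):
        if addr in bank.blocks:
            if i > 0:               # swap one step toward the controller
                bank.blocks.remove(addr)
                banks[i - 1].blocks.add(addr)
            return bank.latency
    banks[-1].blocks.add(addr)      # simplistic fill into the farthest bank
    return banks[-1].latency

banks = [Bank(latency=2 + 2 * i) for i in range(4)]
access(banks, 0xA0)                 # miss: filled into the farthest bank
for _ in range(3):                  # repeated hits migrate it to bank 0
    access(banks, 0xA0)
```

After the repeated hits, the block sits in the closest bank and is served at the minimum latency, which is the effect the abstract attributes to the migration mechanism.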
In this work we propose a last-level cache design, based on the D-NUCA scheme, that
significantly limits its static power consumption by dynamically adapting to the needs of the
running application: the way-adaptable D-NUCA cache. This design leads to a fast and
power-efficient memory hierarchy, with an average 31.2% reduction in energy-delay product
(EDP) with respect to a traditional D-NUCA. We propose and discuss a methodology for tuning
the intrinsic parameters of our design, and we investigate the adoption of the way-adaptable
D-NUCA scheme as a shared L2 cache in a chip multiprocessor (CMP) system, obtaining a 24%
reduction in EDP.
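A minimal sketch of the way-adaptation idea follows, assuming an interval-based controller that watches how concentrated hits are across the active ways and powers ways up or down. The thresholds and counter layout are illustrative assumptions, not the paper's actual tuning parameters.

```python
# Hypothetical sketch of way adaptation: periodically inspect the hit
# distribution over LRU stack positions and resize the number of active
# ways. Thresholds lo/hi are made-up values for illustration only.

def adapt_ways(hits_per_lru_position, active_ways, lo=0.95, hi=0.99):
    """Return the number of active ways for the next interval.

    hits_per_lru_position[i] counts hits whose block was at LRU stack
    position i (0 = MRU). If nearly all hits land in the upper half of
    the active ways, the working set fits in fewer ways and one way can
    be powered down to save static power; otherwise, if hits spread
    across the whole stack, power one way back up.
    """
    total = sum(hits_per_lru_position[:active_ways]) or 1
    upper_half = sum(hits_per_lru_position[:max(1, active_ways // 2)])
    ratio = upper_half / total
    if ratio >= hi and active_ways > 1:
        return active_ways - 1      # working set fits: shed a way
    if ratio < lo and active_ways < len(hits_per_lru_position):
        return active_ways + 1      # pressure detected: restore capacity
    return active_ways
```

For example, if all hits in an interval land at the MRU position of an 8-way cache, the controller shrinks to 7 ways; if hits spread evenly across a 4-way configuration, it grows back to 5.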
High-Performance and Low-Power Magnetic Material Memory Based Cache Design
Magnetic memory technologies are very promising candidates for universal memory thanks to their good scalability, zero standby power, and radiation hardness. With a cell area much smaller than SRAM's, magnetic memory can be used to build much larger caches within the same die footprint, leading to significant improvements in overall system performance and power consumption, especially in the multi-core era. However, magnetic memories have their own drawbacks, such as slow writes, read disturbance, and scaling limitations, which make their use as caches challenging.
This dissertation comprehensively studies the two most popular magnetic memory technologies, STT-RAM and racetrack memory. Design exploration and optimization of the cache design across different design layers, including the memory devices, peripheral circuits, memory array structure, and micro-architecture, are presented. By leveraging device features, two major micro-architectures, a multi-retention cache hierarchy and a process-variation-aware cache, are presented to improve the write performance of STT-RAM. The enhancement in write performance degrades read operations in terms of both speed and data reliability, so this dissertation also presents an architecture to resolve the STT-RAM read-disturbance issue. Furthermore, the scaling of STT-RAM is hindered by the required size of its switching transistor. To break the cell-area limitation of STT-RAM, racetrack memory is studied to achieve an even higher memory density, better performance, and lower energy consumption. With careful design, a racetrack-memory-based cache can achieve significant area reduction and energy savings compared to an optimized STT-RAM cache.