
65 research outputs found

Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

Author: Gujarathi Hemal S
McDonald-Maier Klaus D
Qadri Muhammad Yasir
Publication venue: 'Academy Publisher'
Publication date: 01/01/2009

Technological evolution has significantly increased the number of transistors for a given die area and raised switching speeds from a few MHz to the GHz range. This combination of shrinking feature size and rising performance demands a lower supply voltage and effective management of power dissipation in chips with millions of transistors. This has triggered a substantial amount of research into power reduction techniques for almost every aspect of the chip, particularly the processor cores it contains. This paper presents an overview of techniques for achieving power efficiency, mainly at the processor core level but also in related domains such as buses and memories. Various processor parameters and features, such as supply voltage, clock frequency, cache and pipelining, can be optimized to reduce the power consumption of the processor, and this paper discusses ways of optimizing them. Emerging power-efficient processor architectures are also surveyed, and research activities are discussed that should help the reader identify how these factors contribute to power consumption. Some of these concepts are already established, whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

University of Essex Research Repository

CiteSeerX

Crossref

A survey of emerging architectural techniques for improving cache energy consumption

Author: Bhebhe Washington
Opoku Agyeman Michael
Publication venue: 'Foundation of Computer Science'
Publication date: 25/11/2016

The search continues for another groundbreaking way to reduce the ever-increasing disparity between CPU performance and storage performance. There have been encouraging breakthroughs in enhancing CPU performance through fabrication technologies and changes in chip design, but storage has not seen comparable progress, which materially degrades system performance. Much research effort has gone into techniques that improve the energy efficiency of cache architectures. This work surveys energy-saving techniques, grouped by whether they save dynamic energy, leakage energy or both. Its aim is to compile a quick reference guide to energy-saving techniques from 2013 to 2016 for engineers, researchers and students.

University of Northampton's Research Explorer

NECTAR

A hardware mechanism to reduce the energy consumption of the register file of in-order architectures

Author: Ayala Rodrigo José Luis
López Barrio Carlos Alberto
López Vallejo Marisa
Veidenbaum Alexander
Publication venue: 'Inderscience Publishers'
Publication date: 01/01/2008

This paper introduces an efficient hardware approach that reduces register file energy consumption by putting unused registers into a low-power state. Bypassing the register fields of the fetched instruction to the decode stage allows the registers required by the current instruction to be identified (instruction predecode) and allows the control logic to turn them back on. They are returned to the low-power state after the instruction has used them. This technique achieves an 85% energy reduction with no performance penalty.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Impact of partitioning cache schemes on the cache hierarchy of SMT processors

Author: Kenyon Samantha
López Sonia
Sahuquillo Borrás Julio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/08/2015

© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Power consumption is becoming an increasingly important component of processor design. As the technology node shrinks, both static and dynamic power become more relevant. This is particularly critical for the cache hierarchy. Previous implementations mainly focus on reducing only one kind of power in the cache, either static or dynamic. However, for a more robust approach that will remain relevant as technology continues to shrink, both aspects of power need to be addressed. Recent processors, e.g. Intel Core or IBM Power8, implement simultaneous multithreading (SMT) cores to hide high memory latencies. In these systems, the dynamic energy in the L1 cache is even more stressed, since this cache level is shared by several threads running on the same core. This paper proposes and evaluates the use of phase adaptive caches in all structures of a 3-level cache hierarchy of SMT cores. Compared to the use of conventional caches, our work results in significant dynamic and leakage energy savings with little performance impact.

This work was supported by the Spanish Ministerio de Economía y Competitividad (MINECO) and by FEDER funds under Grant TIN2012–38341–C04–01.

Kenyon, S.; López, S.; Sahuquillo Borrás, J. (2015). Impact of partitioning cache schemes on the cache hierarchy of SMT processors. IEEE. https://doi.org/10.1109/HPCC-CSS-ICESS.2015.127

Crossref

RiuNet

Software-Based Self-Test of Set-Associative Cache Memories

Author: Di Carlo Stefano
Prinetto Paolo Ernesto
Savino Alessandro
Publication venue: IEEE Computer Society
Publication date: 01/01/2011

Embedded microprocessor cache memories suffer from limited observability and controllability, creating problems during in-system tests. This paper presents a procedure to transform traditional march tests into software-based self-test programs for set-associative cache memories with LRU replacement. Among all the different cache blocks in a microprocessor, testing instruction caches represents a major challenge due to limitations in two areas: 1) test patterns, which must be composed of valid instruction opcodes, and 2) test result observability, since the results can only be observed through the results of executed instructions. For these reasons, the proposed methodology concentrates on the implementation of test programs for instruction caches. The main contribution of this work lies in the possibility of applying state-of-the-art memory test algorithms to embedded cache memories without introducing any hardware or performance overheads, while guaranteeing the detection of typical faults arising in nanometer CMOS technologies.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino
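
As a rough illustration of the "traditional march tests" mentioned in the abstract above, the following sketch runs the well-known March C- sequence over a toy bit-addressable memory. It is not the paper's cache self-test procedure: the `BitMemory` class, its stuck-at fault model and the function name `march_c_minus` are assumptions made here for illustration only.

```python
class BitMemory:
    """Toy one-bit-per-cell memory (hypothetical model, not from the paper)."""

    def __init__(self, size, stuck_at=None):
        # stuck_at maps a cell address to the value it is permanently stuck at.
        self.stuck_at = dict(stuck_at or {})
        self.cells = [self.stuck_at.get(addr, 0) for addr in range(size)]

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        # A stuck-at cell ignores writes and keeps its faulty value.
        self.cells[addr] = self.stuck_at.get(addr, value)


def march_c_minus(mem, size):
    """Run March C- and return the first failing address, or None if none fails."""
    up = range(size)
    down = range(size - 1, -1, -1)
    # Each march element: (address order, expected read value, value to write).
    elements = [
        (up, None, 0),    # any order: w0
        (up, 0, 1),       # ascending: r0, w1
        (up, 1, 0),       # ascending: r1, w0
        (down, 0, 1),     # descending: r0, w1
        (down, 1, 0),     # descending: r1, w0
        (up, 0, None),    # any order: r0
    ]
    for order, expected, to_write in elements:
        for addr in order:
            if expected is not None and mem.read(addr) != expected:
                return addr
            if to_write is not None:
                mem.write(addr, to_write)
    return None


healthy = BitMemory(16)
faulty = BitMemory(16, stuck_at={5: 1})
print(march_c_minus(healthy, 16), march_c_minus(faulty, 16))
```

In a software-based self-test, the reads and writes would have to be realised through ordinary load, store and instruction-fetch activity rather than direct cell access, which is the difficulty the abstract points to.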

Microarchitectural techniques to reduce energy consumption in the memory hierarchy

Author: Ghosh Mrinmoy
Publication venue: Georgia Institute of Technology
Publication date: 03/04/2009

This thesis states that dynamic profiling of the memory reference stream can improve energy and performance in the memory hierarchy. The research presented in this thesis provides multiple instances of using lightweight hardware structures to profile the memory reference stream. The objective of this research is to develop microarchitectural techniques to reduce energy consumption at different levels of the memory hierarchy. Several simple and implementable techniques were developed as part of this research. One of the techniques identifies and eliminates redundant refresh operations in DRAM and reduces DRAM refresh power. Another reduces leakage energy in L2 and higher-level caches for multiprocessor systems. The emphasis of this research has been on developing several techniques for obtaining energy savings in caches using a simple hardware structure called the counting Bloom filter (CBF). CBFs have been used to predict L2 cache misses and obtain energy savings by not accessing the L2 cache on a predicted miss. A simple extension of this technique allows CBFs to do way-estimation of set-associative caches to reduce energy in cache lookups. Another technique uses CBFs to track addresses in a Virtual Cache and reduce false synonym lookups. Finally, this thesis presents a technique to reduce dynamic power consumption in level-one caches using significance compression. The significant energy and performance improvements demonstrated by the techniques presented in this thesis suggest that this work will be of great value for designing memory hierarchies of future computing platforms.

Ph.D.

Committee Chair: Lee, Hsien-Hsin S.; Committee Member: Cahtterjee, Abhijit; Committee Member: Mukhopadhyay, Saibal; Committee Member: Pande, Santosh; Committee Member: Yalamanchili, Sudhaka

Scholarly Materials And Research @ Georgia Tech
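
The counting Bloom filter (CBF) miss predictor described in the thesis abstract above can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the thesis's hardware design: the class name, the counter count, the number of hash functions and the use of `hashlib.blake2b` as the hash are all choices made here for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Tracks which block addresses may currently reside in a cache (illustrative)."""

    def __init__(self, num_counters=1024, num_hashes=3):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, block_addr):
        # Derive num_hashes counter indexes from the block address.
        for i in range(self.num_hashes):
            data = f"{i}:{block_addr}".encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.num_counters

    def insert(self, block_addr):
        # Called when a block is filled into the cache.
        for idx in self._indexes(block_addr):
            self.counters[idx] += 1

    def remove(self, block_addr):
        # Called when a block is evicted; only valid for previously inserted blocks.
        for idx in self._indexes(block_addr):
            self.counters[idx] -= 1

    def may_contain(self, block_addr):
        # False means the block is definitely absent: a guaranteed miss.
        return all(self.counters[idx] > 0 for idx in self._indexes(block_addr))


def should_access_cache(cbf, block_addr):
    """Skip the cache lookup (and its energy) when the filter predicts a miss."""
    return cbf.may_contain(block_addr)


cbf = CountingBloomFilter()
cbf.insert(0x1000)
print(should_access_cache(cbf, 0x1000))
```

Because counters are decremented on eviction, the filter never reports a resident block as absent, so skipping the lookup on a predicted miss is safe; false positives only cost an unnecessary lookup.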

A low-power cache system for high-performance processors

Author: Ye Jiongyao
Publication date: 01/01/2011

System: New ; Report number: Kō (甲) No. 3439 ; Degree type: Doctor of Engineering ; Date conferred: 12-Sep-11 ; Waseda University degree certificate number: 新576

Waseda University Repository

Low Cost Dynamic Architecture Adaptation Schemes for Drowsy Cache Management

Author: C. Mani Krishna
Israel Koren
Nitin Prakash
Publication venue: 'American Scientific Publishers'

Crossref

Drowsy cache partitioning for reduced static and dynamic energy in the cache hierarchy

Author: Fitzgerald Brendan
Publication venue: RIT Scholar Works
Publication date: 01/06/2012

Power consumption in computing today has led the industry towards energy-efficient computing. As transistor technology shrinks, new techniques have to be developed to keep leakage current, the dominant portion of overall power consumption, to a minimum. Because a large number of transistors are devoted to the cache hierarchy, the cache provides an excellent avenue for dramatically reducing power usage. The inherent danger is that techniques that save power can negatively affect the primary reason for including the cache: performance. This thesis proposes a modification to the cache hierarchy that dramatically saves power with only a slight reduction in performance. By taking advantage of the overwhelming preference of memory accesses for the most recently used blocks, these blocks are placed into a small, fast-access A partition. The rest of the cache is put into a drowsy mode, a state-preserving technique that reduces leakage power within the remaining portion of the cache. This design was implemented within a private, second-level cache and achieved an average of almost 20% dynamic energy savings and an average of nearly 45% leakage energy savings. These savings were attained while incurring an average performance penalty of only 2%.

RIT Scholar Works

Joint Hardware-Software Leakage Minimization Approach for the Register File of VLIW Embedded Architectures

Author: Atienza David
Ayala José L.
Catthoor Francky
De Micheli Giovanni
Lopez-Vallejo Marisa
Raghavan Praveen
Verkest Diederik
Publication venue: 'Elsevier BV'
Publication date: 06/11/2007

New applications demand very high processing power when run on embedded systems. Very Long Instruction Word (VLIW) architectures have emerged as a promising alternative to provide such processing capabilities under the given energy budget. However, in these new VLIW-based architectures, the register file is a critical contributor to the overall power consumption, and new approaches have to be proposed to reduce its power while preserving system performance. In this paper, we propose a novel joint hardware–software approach that reduces the leakage energy in the register files of these embedded VLIW architectures. This approach relies upon an energy-aware register assignment method and hardware support that creates sub-banks in the global register file that can be switched on/off at run time. Our results indicate energy savings in the register file of up to 50%, after considering the overhead of the added hardware, for modern multimedia embedded applications without performance degradation. We illustrate this approach using real-life applications running on these processors. We also illustrate the tradeoff between the area overhead and the gains in leakage energy for the different strategies.

Infoscience - École polytechnique fédérale de Lausanne