Cache Size Selection for Performance, Energy and Reliability of Time-Constrained Systems

Author: Al-Hashimi Bashir
Cai Yuan
Ejlali Alireza
Reddy Sudhakar
Schmitz Marcus
Publication date: 01/01/2006

Southampton (e-Prints Soton)

Fine-grain CAM-tag cache resizing using miss tags

Author: Krste Asanović
Michael Zhang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2004
Crossref

A hardware mechanism to reduce the energy consumption of the register file of in-order architectures

Author: Ayala Rodrigo José Luis
López Barrio Carlos Alberto
López Vallejo Marisa
Veidenbaum Alexander
Publication venue: 'Inderscience Publishers'
Publication date: 01/01/2008
This paper introduces an efficient hardware approach that reduces register file energy consumption by putting unused registers into a low-power state. Bypassing the register fields of the fetched instruction to the decode stage identifies the registers required by the current instruction (instruction predecode) and lets the control logic turn them back on. The registers are returned to the low-power state once the instruction has used them. This technique achieves an 85% energy reduction with no performance penalty.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM
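
The register-gating idea in the abstract above can be illustrated with a toy accounting model. This is only a sketch under simplifying assumptions, not the paper's energy model: it counts "register-cycles" spent in the active state, assumes one instruction per cycle, and ignores leakage in the low-power state and the cost of state transitions. The function name and the instruction encoding (a tuple of register numbers per instruction) are hypothetical.

```python
def active_register_cycles(instructions, num_registers):
    """Compare active register-cycles with and without per-instruction gating.

    instructions: iterable of tuples, each holding the register numbers
                  named by one instruction (hypothetical encoding).
    num_registers: size of the register file.
    Returns (gated, baseline) register-cycle counts.
    """
    instructions = list(instructions)
    # With gating, only the registers named by the current instruction
    # are powered on for that cycle (duplicates count once).
    gated = sum(len(set(regs)) for regs in instructions)
    # Without gating, every register is active in every cycle.
    baseline = num_registers * len(instructions)
    return gated, baseline


if __name__ == "__main__":
    program = [(1, 2, 3), (3, 3, 4), (5,)]
    gated, baseline = active_register_cycles(program, num_registers=32)
    print(f"gated={gated} baseline={baseline}")
```

The ratio of the two counts gives only a rough upper bound on the possible saving; a real estimate would need per-state power figures that the abstract does not provide.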

Energy Wall for Exascale Supercomputing

Author: Chen Juan
Dong Yong
Tang Yuhua
Wang Zhiyuan
Xue Jingling
Zhou Yun
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 07/02/2017
"Sustainable development" is one of the major issues of the 21st century, and notions such as green computing and green development have appeared one after another. As large-scale parallel computing systems develop rapidly, their energy consumption is becoming very large, especially as system performance reaches Petascale (10^15 Flops) or even Exascale (10^18 Flops). This energy consumption raises the system temperature, which seriously undermines stability and reliability and limits the growth of system size. The effect of energy consumption on scalability is therefore a growing concern. Against this background, this paper proposes the concept of the "Energy Wall" to highlight the significance of achieving scalable performance in peta/exascale supercomputing by taking energy consumption into account. We quantify the effect of energy consumption on scalability by building an energy-efficiency speedup model, which integrates computing performance and system energy. We define the energy wall quantitatively, provide a theorem on the existence of the energy wall, and categorize large-scale parallel computers according to their energy consumption. In the context of several representative types of HPC applications, we analyze and extrapolate the existence of the energy wall for three kinds of topologies: 3D-Torus, binary n-cube and Fat tree. This analysis provides insights into how to mitigate the energy wall effect in system design and through hardware/software optimization in peta/exascale supercomputing.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

LowLEAC: Low leakage energy architecture for caches

Author: Girmal Rashmi Parisa
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2017
With ever-decreasing feature sizes, static power dissipation has become a concern in computing devices. On-chip memories are a major contributor to the processor's leakage power dissipation because of their large transistor count. We propose a Low Leakage Energy Architecture for Caches, called LowLEAC, to minimize static power dissipation in caches made of CMOS SRAM cells. The technique keeps only the k most recently used cache lines powered on and powers off the other lines to reduce leakage power dissipation. This control, however, increases dynamic power because data must be re-fetched. To overcome that, we implement the caches with a CMOS-compatible nonvolatile SRAM cell, called cNVSRAM. The cNVSRAM cell works as a conventional SRAM in regular mode and saves the data in a non-volatile backup when a cache line is turned off or put into sleep mode. The non-volatile backup mode helps improve the dependability of the cache and avoids the penalty incurred by losing data from inactive cache lines. With a small area penalty, LowLEAC achieves 18% energy savings with insignificant impact on performance. LowLEAC is a suitable cache architecture for mobile computing devices, where it minimizes battery power consumption and reduces heat.

Digital Repository @ Iowa State University (ISU)
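
The "keep only the k most recently used lines powered on" policy described in the LowLEAC abstract can be sketched as a small simulation. This is a hypothetical illustration, not the thesis's implementation: the function name, the access trace format (a list of cache line identifiers), and the assumption that all lines start powered off are choices made here for the example. It counts how often an access has to wake a powered-off line, which is the event that a non-volatile backup would make cheaper.

```python
from collections import OrderedDict


def simulate_k_mru(accesses, k):
    """Simulate a power-gating policy that keeps only the k most
    recently used cache lines powered on.

    accesses: sequence of cache line identifiers, in access order.
    k: number of lines allowed to stay powered on.
    Returns (wakeups, powered_lines), where powered_lines is ordered
    from least to most recently used.
    """
    powered = OrderedDict()  # line id -> None, oldest first
    wakeups = 0
    for line in accesses:
        if line in powered:
            # Line is already on: just refresh its recency.
            powered.move_to_end(line)
        else:
            # Line is off (all lines start off): wake it up.
            wakeups += 1
            powered[line] = None
            if len(powered) > k:
                # Power off the least recently used line.
                powered.popitem(last=False)
    return wakeups, list(powered)


if __name__ == "__main__":
    print(simulate_k_mru([0, 1, 0, 2, 0, 1], k=2))
```

Sweeping k over a real access trace would show the trade-off the abstract mentions: a smaller k leaves fewer lines leaking, but causes more wake-ups.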

Energy-Aware Compilation and Hardware Design for VLIW Embedded Systems

Author: Atienza David
Ayala Jose L.
Catthoor Francky
Lopez-Vallejo Marisa
Raghavan Praveen
Verkest Diederik
Publication venue: 'Inderscience Publishers'
Publication date: 03/01/2009
Tomorrow's embedded devices need to run multimedia applications that demand high computational power under low energy consumption constraints. In this context, the register file is a key source of power consumption, and its inappropriate design and management severely affect system power. In this paper, we present a new approach that reduces the energy of shared register files in forthcoming embedded VLIW processors running real-life applications by up to 60% without performance penalty. The approach relies on limited hardware extensions and a compiler-based, energy-aware register assignment algorithm to independently deactivate parts of the register file (i.e., sub-banks) at run time.

Infoscience - École polytechnique fédérale de Lausanne

Joint Hardware-Software Leakage Minimization Approach for the Register File of VLIW Embedded Architectures

Author: Atienza David
Ayala José L.
Catthoor Francky
De Micheli Giovanni
Lopez-Vallejo Marisa
Raghavan Praveen
Verkest Diederik
Publication venue: 'Elsevier BV'
Publication date: 06/11/2007
New applications demand very high processing power when run on embedded systems. Very Long Instruction Word (VLIW) architectures have emerged as a promising alternative to provide such processing capabilities within a given energy budget. However, in these new VLIW-based architectures, the register file is a critical contributor to overall power consumption, and new approaches are needed to reduce its power while preserving system performance. In this paper, we propose a novel joint hardware–software approach that reduces the leakage energy in the register files of these embedded VLIW architectures. The approach relies on an energy-aware register assignment method and on hardware support that creates sub-banks in the global register file that can be switched on and off at run time. Our results indicate energy savings in the register file of up to 50%, after accounting for the overhead of the added hardware, for modern multimedia embedded applications without performance degradation. We illustrate this approach using real-life applications running on these processors. We also illustrate the trade-off between the area overhead and the gains in leakage energy for the different strategies.

Infoscience - École polytechnique fédérale de Lausanne

CACHES D-NUCA TRIANGOLARI: MODELLAZIONE E VALUTAZIONE DEL CONSUMO DI POTENZA DINAMICO [Triangular D-NUCA Caches: Modeling and Evaluation of Dynamic Power Consumption]

Author: Taglione Emiliano
Publication venue: 'Pisa University Press'
Publication date: 24/10/2004
This thesis evaluates the dynamic power consumption of non-uniform cache access architectures (Dynamic NUCA). To this end, an architecture for D-NUCA and Triangular D-NUCA caches was modeled, parameters for estimating dynamic consumption were derived, and simulations were run to determine the dynamic consumption of the cache. The consumption parameters were obtained with CACTI, a tool for modeling cache memories. The simulations were run on a modified version of Sim-Alpha (a simulator of the ALPHA 21264 processor) adapted to use TD-NUCA caches. The chosen benchmarks, 176.gcc, 181.mcf, 256.bzip2 and 300.twolf, belong to the SpecInt 2000 suite. The results lead to the conclusion that Decreasing TD-NUCA caches were the best overall, in both power consumption and performance, on three of the four benchmarks; further refinement of this design technique is therefore desirable.

Electronic Thesis and Dissertation Archive - Università di Pisa