12 research outputs found

    Managing Static Leakage Energy in Microprocessor Functional Units

    Static energy due to subthreshold leakage current is projected to become a major component of the total energy in high-performance microprocessors. Many studies so far have examined and proposed techniques to reduce leakage in on-chip storage structures. In this study, static energy is reduced in the integer functional units by leveraging the unique qualities of dual-threshold-voltage domino logic. Domino logic has desirable properties that greatly reduce leakage current while providing fast propagation times. However, due to the energy cost of entering the low-leakage-current state (sleep mode), domino logic has thus far been used for leakage reduction only in long-term standby mode. We examine the utility of the sleep mode (while accounting for this cost) when idle times are relatively short, one to a few hundred cycles, as is often the case for functional units. Using an analytical energy model suitable for architecture-level analysis, we explore the interaction of application and technology, and the effect on energy and performance as the underlying parameters are varied, across a set of benchmarks. Our results show that if leakage reaches the magnitudes projected in the literature, an aggressive policy of activating the sleep mode at every idle period performs well even for idle intervals as short as ten cycles, and a more complex control strategy may not be warranted. We also propose a simple design, called Gradual Sleep, to reduce the energy impact of using the sleep mode for smaller idle periods.
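    As a rough illustration of the break-even reasoning behind such a policy, the sketch below compares the leakage saved during an idle period against the cost of entering and leaving sleep mode. The function names and parameter values are hypothetical placeholders, not the paper's analytical model.

```python
# Minimal sketch (not the paper's model) of when an "always sleep" policy pays
# off for a functional unit. All values are hypothetical placeholders in
# arbitrary energy units per cycle.

def sleep_saves_energy(idle_cycles: int,
                       leak_active: float = 1.0,   # leakage per idle cycle while awake
                       leak_sleep: float = 0.05,   # residual leakage per cycle in sleep
                       e_transition: float = 8.0   # energy to enter and exit sleep mode
                       ) -> bool:
    """True if sleeping through this idle period costs less than staying awake."""
    e_awake = idle_cycles * leak_active
    e_asleep = idle_cycles * leak_sleep + e_transition
    return e_asleep < e_awake

def break_even_cycles(leak_active=1.0, leak_sleep=0.05, e_transition=8.0) -> float:
    """Idle length above which sleeping wins: n*leak_active = n*leak_sleep + e_transition."""
    return e_transition / (leak_active - leak_sleep)

if __name__ == "__main__":
    print(break_even_cycles())        # ~8.4 cycles with these placeholder numbers
    print(sleep_saves_energy(10))     # True: a ten-cycle idle period already amortizes the cost
    print(sleep_saves_energy(5))      # False: too short to recover the transition energy
```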

    Dynamically Tuning Processor Resources with Adaptive Processing

    The productivity of modern society has become inextricably linked to its ability to produce energy-efficient computing technology. Increasingly sophisticated mobile computing systems, powered for hours solely by batteries, continue to proliferate rapidly throughout society, while battery technology improves at a much slower pace. In large data centers that handle everything from online orders for a dot-com company to sophisticated Web searches, row upon row of tightly packed computers may be warehoused in a city block. Microprocessor energy wastage in such a facility directly translates into higher electric bills. Simply receiving sufficient electricity from utilities to power such a center is no longer certain. Given this situation, energy efficiency has rapidly moved to the forefront of modern microprocessor design. The adaptive processing approach to improving microprocessor energy efficiency dynamically tunes major microprocessor resources—such as caches and hardware queues—during execution to better match varying application needs [1, 2]. This tuning usually involves reducing the size of a resource when its full capabilities are not needed, then restoring the disabled portions when they are needed again. Dynamically tailoring processor resources in active use contrasts sharply with techniques that simply turn off entire sections of a processor when they become idle. Presenting the application with the required amount of hardware—and nothing more—throughout its execution can achieve a potentially significant reduction in energy consumption. The challenges facing adaptive processing lie in achieving this greater efficiency with reasonable hardware and software overhead, and doing so without incurring undue performance loss. Unlike reconfigurable computing, which typically uses very different technology such as FPGAs, adaptive processing exploits the dynamic superscalar design approach that developers have used successfully in many generations of general-purpose processors. Whereas reconfigurable processors must demonstrate performance or energy savings large enough to overcome very large clock frequency and circuit density disadvantages, adaptive processors typically have baseline overheads of only a few percent.
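    A minimal sketch of the interval-based tuning idea described above, assuming a resizable hardware queue whose enabled size is halved or doubled based on average occupancy. The class, thresholds, and interval mechanism are illustrative assumptions, not the specific algorithms the article surveys.

```python
# Sketch of an interval-based adaptive controller for a resizable queue.
# Thresholds, interval handling, and sizes are hypothetical placeholders.

class AdaptiveQueue:
    def __init__(self, max_entries=64, min_entries=16):
        self.max_entries = max_entries
        self.min_entries = min_entries
        self.size = max_entries          # currently enabled entries
        self.occupancy_sum = 0
        self.samples = 0

    def sample(self, occupied_entries: int) -> None:
        """Called every cycle with the number of entries currently in use."""
        self.occupancy_sum += occupied_entries
        self.samples += 1

    def end_of_interval(self) -> None:
        """At each fixed interval boundary, resize toward the observed need."""
        avg = self.occupancy_sum / max(self.samples, 1)
        if avg < 0.5 * self.size and self.size > self.min_entries:
            self.size //= 2              # disable the upper half of the structure
        elif avg > 0.9 * self.size and self.size < self.max_entries:
            self.size *= 2               # performance pressure: re-enable disabled entries
        self.occupancy_sum = 0
        self.samples = 0
```

    Resizing in power-of-two steps keeps the indexing logic simple; real proposals differ in interval length, thresholds, and in which structures they target.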

    Enhancing Branch Prediction via On-Line Statistical Analysis

    To attain peak efficiency, high-performance processors must anticipate changes in the flow of control before they actually occur. Branch prediction is the method of determining the most likely path to be taken at branch decision points in the program. Many branch prediction mechanisms have been proposed. The most effective of these use a single tabular data structure in hardware to hold historical information about branch behavior. The table is a limited resource, and it is managed in a way that can result in multiple branches sharing table locations and conflicting with one another.
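    To make the sharing problem concrete, here is a minimal sketch of a conventional table-based predictor with two-bit saturating counters; it shows how two distinct branches can index the same entry and interfere. This is illustrative only and is not the on-line statistical technique the paper proposes.

```python
# Sketch of a table-based branch predictor with two-bit saturating counters,
# deliberately given a tiny table so that aliasing is easy to see.

class BimodalPredictor:
    def __init__(self, table_bits=4):
        self.mask = (1 << table_bits) - 1         # 16-entry table of 2-bit counters
        self.table = [1] * (1 << table_bits)      # start weakly not-taken

    def _index(self, pc: int) -> int:
        return (pc >> 2) & self.mask              # only low PC bits pick the entry

    def predict(self, pc: int) -> bool:
        return self.table[self._index(pc)] >= 2   # counter of 2 or 3 means "taken"

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        self.table[i] = min(3, self.table[i] + 1) if taken else max(0, self.table[i] - 1)

# Two branches whose addresses differ only above the index bits share one counter:
p = BimodalPredictor()
pc_a, pc_b = 0x1000, 0x1040                       # both map to table index 0
for _ in range(10):
    p.update(pc_a, True)                          # branch A is always taken
    p.update(pc_b, False)                         # branch B is always not taken
print(p.predict(pc_a), p.predict(pc_b))           # the shared counter thrashes between them
```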

    Miss Rate Prediction across All Program Inputs

    Improving cache performance requires understanding cache behavior. However, measuring cache performance for one or two data input sets provides little insight into how cache behavior varies across all data input sets. This paper uses our recently published locality analysis to generate a parameterized model of program cache behavior. Given a cache size and associativity, this model predicts the miss rate for arbitrary data input set sizes. This model also identifies critical data input sizes where cache behavior exhibits marked changes. Experiments show this technique is within 2% of the hit rate for set-associative caches on a set of integer and floating-point programs.
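    The sketch below illustrates the locality-analysis intuition such a model builds on: estimating the miss rate of an LRU cache from a reuse-distance histogram, where a reference misses when the number of distinct blocks touched since its last use is at least the cache capacity. This is a simplified, fully associative approximation, not the paper's parameterized, input-size-dependent model.

```python
# Reuse-distance-based miss-rate estimate for a fully associative LRU cache.

from collections import Counter

def reuse_distance_histogram(block_trace):
    """Count reuse distances (distinct blocks since last use) over a block trace."""
    stack, hist = [], Counter()                   # LRU stack: most recent block at the end
    for block in block_trace:
        if block in stack:
            hist[len(stack) - 1 - stack.index(block)] += 1
            stack.remove(block)
        else:
            hist[float("inf")] += 1               # cold miss: first touch of this block
        stack.append(block)
    return hist

def predicted_miss_rate(hist, cache_blocks):
    """Misses are the references whose reuse distance does not fit in the cache."""
    misses = sum(count for dist, count in hist.items() if dist >= cache_blocks)
    return misses / sum(hist.values())

trace = [0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3]         # toy block-address trace
hist = reuse_distance_histogram(trace)
print(predicted_miss_rate(hist, cache_blocks=4))  # only the four cold misses remain
print(predicted_miss_rate(hist, cache_blocks=2))  # the working set no longer fits
```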

    Miss rate prediction across program inputs and cache configurations

    Improving cache performance requires understanding cache behavior. However, measuring cache performance for one or two data input sets provides little insight into how cache behavior varies across all data input sets and all cache configurations. This paper uses locality analysis to generate a parameterized model of program cache behavior. Given a cache size and associativity, this model predicts the miss rate for arbitrary data input set sizes. This model also identifies critical data input sizes where cache behavior exhibits marked changes. Experiments show this technique is within 2 percent of the hit rate for set-associative caches on a set of floating-point and integer programs using array and pointer-based data structures. Building on the new model, this paper presents an interactive visualization tool that uses a three-dimensional plot to show miss rate changes across program data sizes and cache sizes, and demonstrates its use in evaluating compiler transformations. Other uses of this visualization tool include assisting machine and benchmark-set design. The tool can be accessed on the Web at ...
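    The kind of three-dimensional view described here can be sketched as a surface of predicted miss rate over data-input size and cache size. The miss-rate function below is a hypothetical placeholder standing in for the paper's model; the plot is not the Web-based tool itself.

```python
# Sketch of a 3-D miss-rate surface over data-input size and cache size.

import numpy as np
import matplotlib.pyplot as plt

def toy_miss_rate(data_kb, cache_kb):
    """Placeholder: misses fall away once the cache covers the working set."""
    return np.where(data_kb <= cache_kb, 0.01, 0.01 + 0.3 * (1.0 - cache_kb / data_kb))

data_kb = np.logspace(1, 5, 60)                   # data input sizes: 10 KB .. 100 MB
cache_kb = np.logspace(0, 4, 60)                  # cache sizes: 1 KB .. 10 MB
D, C = np.meshgrid(data_kb, cache_kb)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(np.log10(D), np.log10(C), toy_miss_rate(D, C), cmap="viridis")
ax.set_xlabel("log10 data size (KB)")
ax.set_ylabel("log10 cache size (KB)")
ax.set_zlabel("predicted miss rate")
plt.show()
```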

    Linguistic Support for Heterogeneous Parallel Processing: A Survey and an Approach

    Coding a highly parallel application to run on a heterogeneous suite of processors (both metacomputers and mixed-mode computers) with high efficiency, ease of implementation, and portability is a significant challenge. This paper first surveys recently proposed and existing parallel languages from the perspective of programming complex, heterogeneous systems. We then propose two essential features to be included in programming languages that are intended to support heterogeneity. Recent examples have shown the success of combining heterogeneous computing hardware to solve complex problems [1, 2]. When the architecture of the machine matches the structure of the problem, the algorithmic solution is often easier to develop and can execute more efficiently. The currently popular practice of networking existing machines is merely a beginning [3]. As hardware continues to diminish in size and cost, new possibilities are being created for systems that are heterogeneous by...
