2 research outputs found

    Resource Banking An Energy-efficient, Run-time Adaptive Processor Design Technique

    Get PDF
    From the earliest and simplest scalar computation engines to modern superscalar out-oforder processors, the evolution of computational machinery during the past century has largely been driven by a single goal: performance. In today’s world of cheap, billion-plus transistor count processors and with an exploding market in mobile computing, a design landscape has emerged where energy efficiency, arguably more than any other single metric, determines the viability of a processor for a given application. The historical emphasis on performance has left modern processors bloated and over provisioned for everyday tasks in the hope that during computationally intensive periods some performance improvement will be observed. This work explores an energy-efficient processor design technique that ensures even a highly over provisioned out-of-order processor has only as many of its computational resources active as it requires for efficient computation at any given time. Specifically, this paper examines the feasibility of a dynamically banked register file and reorder buffer with variable banking policies that enable unused rename registers or reorder buffer entries to be voltage gated (turned off) during execution to save power. The impact of bank placement, turn-off and turn-on policies as well as rail stabilization latencies for this approach are explored for high-performance desktop and server designs as well as low-power mobile processor

    Energy-efficient design of the reorder buffer

    No full text
    Abstract. Some of today’s superscalar processors, such as the Intel Pentium III, implement physical registers using the Reorder Buffer (ROB) slots. As much as 27% of the total CPU power is expended within the ROB in such designs, making the ROB a dominant source of power dissipation within the processor. This paper proposes three relatively independent techniques for the ROB power reduction with no or minimal impact on the performance. These techniques are: 1) dynamic ROB resizing; 2) the use of low–power comparators that dissipate energy mainly on a full match of the comparands and, 3) the use of zero–byte encoding. We validate our results by executing the complete suite of SPEC 95 benchmarks on a true cycle–by–cycle hardware–level simulator and using SPICE measurements for actual layouts of the ROB in 0.5 micron CMOS process. The total power savings achieved within the ROB using our approaches are in excess of 76 % with the average performance penalty of less than 3%.
    corecore