15 research outputs found

Implementation Of Less Area Inexact Speculative Adder Using Carry Look Ahead Adder

Author: MAYURI MIRIYALA
SALMA PATAN
Publication venue: International Journal of Innovative Technology and Research
Publication date: 01/12/2020

The Inexact Speculative Adder (ISA) improves adder performance by dividing the critical path into two or more shorter paths, reducing the magnitude of spurious errors and managing faults through an improved speculative path and a versatile bi-directional error compensation technique. Pipelining shortens the critical path at the expense of area. The general structure of speculative adders improves performance and allows precise control of accuracy. This paper presents a CLA-based ISA design that uses micropipelining with two logic gates along its critical path, thus enhancing replay activity. In addition, several stages of the ISA architecture are proposed, and the scheme minimizes clock power. As a modification, the CLA can be replaced with a Brent-Kung adder.

International Journal of Innovative Technology and Research (IJITR)

Efficient register renaming and recovery for high-performance processors

Author: López Rodríguez Pedro Juan
Petit Martí Salvador Vicente
Sahuquillo Borrás Julio
Ubal Tena Rafael
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2014

© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Modern superscalar processors implement register renaming using either random access memory (RAM) or content-addressable memories (CAM) tables. The design of these structures should address both access time and misprediction recovery penalty. Although direct-mapped RAMs provide faster access times, CAMs are more appropriate to avoid recovery penalties. The presence of associative ports in CAMs, however, prevents them from scaling with the number of physical registers and pipeline width, negatively impacting performance, area, and energy consumption at the rename stage. In this paper, we present a new hybrid RAM-CAM register renaming scheme, which combines the best of both approaches. In a steady state, a RAM provides fast and energy-efficient access to register mappings. On misspeculation, a low-complexity CAM enables immediate recovery. Experimental results show that in a four-way state-of-the-art superscalar processor, the new approach provides almost the same performance as an ideal CAM-based renaming scheme, while dissipating only between 17% and 26% of the original energy and, in some cases, consuming less energy than purely RAM-based renaming schemes. Overall, the silicon area required to implement the hybrid RAM-CAM scheme does not exceed the area required by conventional renaming mechanisms.

This work was supported in part by the Spanish MINECO under Grant TIN2012-38341-C04-01.

Petit Martí, SV.; Ubal Tena, R.; Sahuquillo Borrás, J.; López Rodríguez, PJ. (2014). Efficient register renaming and recovery for high-performance processors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 22(7):1506-1514. https://doi.org/10.1109/TVLSI.2013.2270001

RiuNet

Energy-Efficient Digital Design Through Inexact and Approximate Arithmetic Circuits

Author: Camus Vincent
Enz Christian
Schlachter Jérémy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/09/2015

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Energy-Efficient Inexact Speculative Adder with High Performance and Accuracy Control

Author: Camus Vincent
Enz Christian
Schlachter Jérémy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/08/2015

Inexact and approximate circuit design is a promising approach to improve performance and energy efficiency in technology-scaled and low-power digital systems. Such a strategy is suitable for error-tolerant applications involving perceptive or statistical outputs. This paper presents a novel architecture of an Inexact Speculative Adder with optimized hardware efficiency and an advanced compensation technique with either error correction or error reduction. This general topology of speculative adders improves performance and enables precise accuracy control. A brief design methodology and comparative study of this speculative adder are also presented herein, demonstrating power savings of up to 26% and energy-delay-area reductions of up to 60% at equivalent accuracy compared to the state of the art.
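
The carry speculation idea in the abstract above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' architecture: the function name `speculative_add` and the parameters `width`, `block`, and `spec` are hypothetical, and the error compensation stage described in the abstract is deliberately left out. Each block guesses its carry-in from only the `spec` bits directly below it, so the long carry chain is cut into short independent paths at the cost of occasional wrong results.

```python
def speculative_add(a, b, width=16, block=4, spec=2):
    """Behavioral sketch of a block-based speculative adder (no compensation).

    Hypothetical parameters: `block` is the bits per sub-adder and `spec`
    is the number of lower bits inspected to guess each block's carry-in.
    The result is truncated to `width` bits.
    """
    if width % block != 0 or not 0 < spec <= block:
        raise ValueError("width must be a multiple of block and 0 < spec <= block")
    block_mask = (1 << block) - 1
    spec_mask = (1 << spec) - 1
    result = 0
    for lo in range(0, width, block):
        if lo == 0:
            carry_in = 0  # the lowest block has no incoming carry
        else:
            # Speculate: add only the `spec` bits just below this block and
            # assume that no carry enters that small window.
            a_win = (a >> (lo - spec)) & spec_mask
            b_win = (b >> (lo - spec)) & spec_mask
            carry_in = (a_win + b_win) >> spec
        a_blk = (a >> lo) & block_mask
        b_blk = (b >> lo) & block_mask
        result |= ((a_blk + b_blk + carry_in) & block_mask) << lo
    return result


def exact_add(a, b, width=16):
    """Reference result, truncated to `width` bits."""
    return (a + b) & ((1 << width) - 1)


# Speculation succeeds when the carry is generated inside the window.
print(hex(speculative_add(0x000C, 0x0004)), hex(exact_add(0x000C, 0x0004)))
# Speculation fails when the carry has to propagate through the window.
print(hex(speculative_add(0x00FF, 0x0001)), hex(exact_add(0x00FF, 0x0001)))
```

In this model a larger `spec` lowers the error rate but lengthens each speculative path, which is the kind of accuracy versus performance trade-off the abstract refers to.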

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Near/Sub-Threshold Circuits and Approximate Computing: The Perfect Combination for Ultra-Low-Power Systems

Author: Camus Vincent
Enz Christian
Schlachter Jérémy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/11/2015

While sub/near-threshold design offers minimal power and energy consumption, such an approach strongly deteriorates circuit performance and robustness against PVT (process/voltage/temperature) variations, leading to gigantic speed penalties and large silicon areas. Inexact and approximate circuit design can address these issues by trading calculation accuracy for better silicon area, circuit speed, and even better power consumption. This paper reviews and proposes improvements for two approximate computing techniques applicable to arithmetic circuits: gate-level pruning and carry speculation. A critical study is then carried out considering several error metrics, and for the first time, those techniques are combined to produce approximate adders showing even higher gains at similar error levels. It is then shown that those techniques can be applied to a sub-threshold library to mitigate the large variability.

Infoscience - École polytechnique fédérale de Lausanne

Utilizing timing error detection and recovery to dynamically improve superscalar processor performance

Author: Bezdek Mikel Anton
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2006

To provide reliable execution, traditional design methodologies perform timing error avoidance. Worst-case parameters are assumed when determining a processor's operating frequency, allowing the maximum propagation delay through the system to be met. However, in practice the worst cases are rare, leading to a large amount of exploitable performance improvement if timing errors can be detected and recovered from. To this end, we propose a novel low-cost scheme which allows a superscalar processor to dynamically tune its frequency past the worst-case limit. When timing errors occur, they are detected and recovered from locally. Additionally, the number of errors that occur is monitored by one of several sampling methods. When the error rate becomes too high, leading to decreased performance, the frequency is scaled back. Experimental results show an average performance gain of 45% across all benchmark applications. The cost of implementing the error detection and recovery is kept modest by reusing the existing pipeline logic to detect the timing errors.
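
The feedback loop described in the abstract above (raise the clock, count timing errors over a sampling window, scale back when the error rate gets too high) can be sketched as a simple controller. This is an illustration only and not the scheme from the thesis: the function name `tune_frequency` and every threshold, step size, and frequency value below are hypothetical assumptions.

```python
def tune_frequency(freq_mhz, errors, cycles, *,
                   max_error_rate=0.01, step_mhz=10, floor_mhz=1000):
    """Return the clock frequency to use for the next sampling window.

    All parameters are hypothetical: `floor_mhz` stands for the worst-case
    safe frequency, `max_error_rate` for the error rate above which
    recovery cost is assumed to outweigh the speed gain.
    """
    error_rate = errors / cycles if cycles else 0.0
    if error_rate > max_error_rate:
        # Too many timing errors: scale back, but never below the safe limit.
        return max(floor_mhz, freq_mhz - step_mhz)
    # Errors are rare enough: keep pushing past the worst-case limit.
    return freq_mhz + step_mhz


# One window with no errors, then one window with a 5% error rate.
freq = tune_frequency(1000, errors=0, cycles=1000)
print(freq)
freq = tune_frequency(freq, errors=50, cycles=1000)
print(freq)
```

A real design would also have to account for the recovery penalty per error and for minimum path constraints, which this sketch ignores.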

Digital Repository @ Iowa State University (ISU)

Optimization for timing-speculated circuits by redundancy addition and removal

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date

Crossref

Superscalar Processor Performance Enhancement through Reliable Dynamic Clock Frequency Tuning

Author: Avirneni Naga
Bezdek Mikel
Somani Arun
Subramanian Viswanathan
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2007

Synchronous circuits are typically clocked considering worst-case timing paths so that timing errors are avoided under all circumstances. In the case of a pipelined processor, this has special implications since the operating frequency of the entire pipeline is limited by the slowest stage. Our goal, in this paper, is to achieve higher performance in superscalar processors by dynamically varying the operating frequency during run time past worst-case limits. The key objective is to see the effect of overclocking on superscalar processors for various benchmark applications, and analyze the associated overhead, in terms of extra hardware and error recovery penalty, when the clock frequency is adjusted dynamically. We tolerate timing errors occurring at speeds higher than what the circuit is designed to operate at by implementing an efficient error detection and recovery mechanism. We also study the limitations imposed by minimum path constraints on our technique. Experimental results show that an average performance gain of up to 57% across all benchmark applications is achievable.

Digital Repository @ Iowa State University (ISU)

Crossref

High Performance Reliable Variable Latency Carry Select Addition

Author: Du Kai
Publication venue
Publication date: 01/01/2012

This thesis describes the design and the optimization of a low-overhead, high-performance variable latency carry select adder. Previous researchers believed that the traditional adder has reached the theoretical speed bound. However, a considerable portion of the hardware resources of the traditional adder is only used in the worst case. Based on this observation, variable latency adders have been proposed to improve on the theoretical limit, but such adders incur significant area overhead. By combining previous variable latency adders with carry select addition, this work describes a novel variable latency carry select adder. Applying carry select addition in the variable latency adder design significantly reduces the area overhead and increases its performance. This variable latency adder is faster and smaller than previous variable latency adders. Furthermore, this variable latency adder can be optimized to be faster and smaller than the fastest adder generated by the Synopsys DesignWare building block IP.
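
To make the combination of carry select addition and variable latency in the abstract above more concrete, here is a minimal behavioral sketch. It is not the design from the thesis: the names `carry_select_blocks` and `variable_latency_add`, the one-block carry guess, and the one-cycle versus two-cycle latency model are all assumptions chosen for illustration. Each block precomputes its sum for both possible carry-ins (carry select); the fast path guesses each carry-in from the previous block alone, and an extra cycle is charged only when that guess turns out wrong.

```python
def carry_select_blocks(a, b, width, block):
    """For each block, precompute (sum, carry_out) for carry-in 0 and 1."""
    mask = (1 << block) - 1
    table = []
    for lo in range(0, width, block):
        x, y = (a >> lo) & mask, (b >> lo) & mask
        table.append([((x + y + c) & mask, (x + y + c) >> block) for c in (0, 1)])
    return table


def variable_latency_add(a, b, width=16, block=4):
    """Return (sum truncated to `width` bits, cycles) in a hypothetical latency model.

    Fast path assumption: a block's carry-in equals the previous block's
    carry-out computed with carry-in 0. If any guess differs from the true
    carry, the exact result is assumed to need a second cycle.
    """
    if width % block != 0:
        raise ValueError("width must be a multiple of block")
    table = carry_select_blocks(a, b, width, block)
    total, carry, cycles = 0, 0, 1
    for i, options in enumerate(table):
        guess = table[i - 1][0][1] if i > 0 else 0
        if guess != carry:
            cycles = 2  # misprediction detected: fall back to the slow path
        block_sum, carry = options[carry]  # select the precomputed result
        total |= block_sum << (i * block)
    return total, cycles


print(variable_latency_add(3, 5))            # short carry chain
print(variable_latency_add(0x00FF, 0x0001))  # carry ripples across blocks
```

Unlike an inexact adder, this model always returns the exact sum; only the latency varies with the operands.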

DSpace at Rice University