694 research outputs found

    A Novel VLSI Design On CSKA Of Binary Tree Adder With Compaq Area And High Throughput

    Get PDF
    Addition is one of the most basic operations performed in all computing units, including microprocessors and digital signal processors. It is also a basic unit utilized in various complicated algorithms of multiplication and division. Efficient implementation of an adder circuit usually revolves around reducing the cost to propagate the carry between successive bit positions. Multi-operand adders are important arithmetic design blocks especially in the addition of partial products of hardware multipliers. The multi-operand adders (MOAs) are widely used in the modern low-power and high-speed portable very-large-scale integration systems for image and signal processing applications such as digital filters, transforms, convolution neural network architecture. Hence, a new high-speed and area efficient adder architecture is proposed using pre-compute bitwise addition followed by carry prefix computation logic to perform the three-operand binary addition that consumes substantially less area, low power and drastically reduces the adder delay. Further, this project is enhanced by using Modified carry bypass adder to further reduce more density and latency constraints. Modified carry skip adder introduces simple and low complex carry skip logic to reduce parameters constraints. In this proposal work, designed binary tree adder (BTA) is analyzed to find the possibilities for area minimization. Based on the analysis, critical path of carry is taken into the new logic implementation and the corresponding design of CSKP are proposed for the BTA with AOI, OAI

    Power Efficient and High Speed Carry Skip Adder using Binary to Excess One Converter

    Get PDF
    The design of high-speed and low-power VLSI architectures need efficient arithmetic processing units, which are optimized for the performance parameters, namely, speed and power consumption. Adders are the key components in general purpose microprocessors and digital signal processors. As a result, it is very pertinent that its performance augers well for their speed performance. Additionally, the area is an essential factor which is to be taken into account in the design of fast adders. Towards this end, high-speed, low power and area efficient addition and multiplication have always been a fundamental requirement of high-performance processors and systems. The major speed limitation of adders arises from the huge carry propagation delay encountered in the conventional adder circuits, such as ripple carry adder and carry save adder. Observing that a carry may skip any addition stages on certain addend and augend bit values, researchers developed the carry-skip technique to speed up addition in the carry-ripple adder. Using a multilevel structure, carry-skip logic determines whether a carry entering one block may skip the next group of blocks. Because multilevel skip logic introduces longer delays, Therefore, in this paper we examine The basic idea of this work is to use Binary to Excess- 1 converter (BEC) instead of RCA with Cin=1 in conventional CSkA in order to reduce the area and power. BEC uses less number of logic gates than N-bit full adder

    High Speed and Low Power Consumption Carry Skip Adder using Binary to Excess-One Converter

    Get PDF
    Arithmetic and Logic Unit (ALU) is a vital component of any CPU. In ALU, adders play a major role not only in addition but also in performing many other basic arithmetic operations like subtraction, multiplication, etc. Thus realizing an efficient adder is required for better performance of an ALU and therefore the processor. For the optimization of speed in adders, the most important factor is carry generation. For the implementation of a fast adder, the generated carry should be driven to the output as fast as possible, thereby reducing the worst path delay which determines the ultimate speed of the digital structure. In conventional carry skip adder the multiplexer is used as a skip logic that provides a better performance and performs an efficient operation with the minimum circuitry. Even though, it affords a significant advantages there may be a large critical path delay revealed by the multiplexer that leads to increase of area usage and power consumption. The basic idea of this paper is to use Binary to Excess-1 Converters (BEC) to achieve lower area and power consumption

    Internet of Things Based Reconfigurable SIMD Processor for High-Speed End Devices in FPGA

    Get PDF
    This research article proposed the reconfigurable Single Instruction Multi Data (SIMD) processor design to speed up the accelerated computing task in IoT operations. Single Instruction Multi Data models leverage the parallel real source to speed up computing accelerated tasks. It proposes the utilization of reconfigurable Kogge Stone-dependent hybrid adder structures, now referred to as KS-CPA, in which reconfiguration occurs during the addition operation. The Least Significant Bits (LSB) are processed using a carry propagate adder, while the Most Significant Bits (MSB) are computed using the Kogge Stone adder. Depending on the data width and device-accessible energy resources, the hybrid configuration of the adder offers the 4-bit, 8-bit, and 16-bit addition. The adder form is identified by a shift in the configuration of its Carry Look-ahead and then by a Kogge Stone Adder (KSA). Throughout the activity, the KS-CLA crossbreed configuration is used to attain the fastest speed and low energy usage. The effectiveness, including its proposed hybrid adder, is evaluated by looking at the speed, energy, and area parameters, including a suitable area use during rapid applications in which both less delay and low power adders are required. Considering these, we are structuring an IoT processor that can be reconfigured to gain from SIMD. We have demonstrated that our hybrid adder-enhanced processor saves energy up to 13% and reduces 27% latency. The proposed 16 and 32-bit adders will boost time, power, and Area Delay Product (ADP) by almost 18-24% and 13-19% respectively

    Design of Static Segment Adder for Approximating Computing Applications

    Get PDF
    The digital VLSI design needs to attain high performance with desired reliability range. The high performance involves low power, area efficiency and high speed. This paper proposes a design of High speed energy efficient static segment adder (SSA) to enhance the overall performance based on approximation technique. Static segmentation includes both accurate and inaccurate part. The normal full adder performs accurate part and the carry select adder is used for inaccurate part. By using static segmentation the approximate computation is done. Approximate computing is a computation which generates “good enough” result rather than totally accurate result. Image processing is accomplished using SSA design. In this process 99.4% whole computational accuracy for 16 bit addition and also for 8 bit addition can be achieved

    4-Bit Adder Design and Simulation

    Get PDF
    The project is to design a 4-bit digital adder, while taking care of performance parameters: area, speed and power consumption, the team has chosen to design according to the cost function: Area*Delay*Power. The project is implemented in three phases: research phase, simulation phase, and evaluation/re-evaluation phase. The adder circuit implemented as Ripple-Carry Adder (RCA), the team added improvements to overcome the disadvantages of the RCA architecture, for instance the first 1-bit adder is a Half Adder, which is faster and more power-efficient, the team was also carefully choosing the gates to match the stated cost function. Gates are implemented using different logic families, according to each gate usage and functionality in the circuit in order to achieve the desired performance. Transistor sizes are also selected based upon simulation and optimization, to reach the needed performance according to the specified cost function. The team was able to reach a 4-bit ripple carry adder that has delay of 1.22 ns with 0.6 uW power consumption (measured at 10 MHz), with 109 transistors. In the re-evaluation phase, the team was able to further improve this to reach 0.99 ns delay with 0.25 uW power consumption (10 MHz) with 97 transistors only

    Design of ALU and Cache Memory for an 8 bit ALU

    Get PDF
    The design of an ALU and a Cache memory for use in a high performance processor was examined in this thesis. Advanced architectures employing increased parallelism were analyzed to minimize the number of execution cycles needed for 8 bit integer arithmetic operations. In addition to the arithmetic unit, an optimized SRAM memory cell was designed to be used as cache memory and as fast Look Up Table. The ALU consists of stand alone units for bit parallel computation of basic integer arithmetic operations. Addition and subtraction were performed using Kogge Stone parallel prefix hardware operating at 330MHz. A high performance multiplier was built using Radix 4 Modified Booth Encoder (MBE) and a Wallace Tree summation array. The multiplier requires single clock cycle for 8 bit integer multiplication and operates at a maximum frequency of 100MHz. Multiplicative division hardware was built for executing both integer division and square root. The division hardware computes 8-bit division and square root in 4 clock cycles. Multiplier forms the basic building block of all these functional units, making high level of resource sharing feasible with this architecture. The optimal operating frequency for the arithmetic unit is 70MHz. A 6T CMOS SRAM cell measuring 90 µm2 was designed using minimum size transistors. The layout allows for horizontal overlap resulting in effective area of 76 µm2 for an 8x8 array. By substituting equivalent bit line capacitance of P4 L1 Cache, the memory was simulated to have a read time of 3.27ns. An optimized set of test vectors were identified to enable high fault coverage without the need for any additional test circuitry. Sixteen test cases were identified that would toggle all the nodes and provide all possible inputs to the sub units of the multiplier. A correlation based semi automatic method was investigated to facilitate test case identification for large multipliers. This method of testability eliminates performance and area overhead associated with conventional testability hardware. Bottom up design methodology was employed for the design. The performance and area metrics are presented along with estimated power consumption. A set of Monte Carlo analysis was carried out to ensure the dependability of the design under process variations as well as fluctuations in operating conditions. The arithmetic unit was found to require a total die area of 2mm2 (approx.) in 0.35 micron process
    corecore