105 research outputs found

    Design and Analysis of Low Power Hybrid Braun Multiplier using Ladner Fischer Adder

    Get PDF
    Multiplier is important in many DSP systems and in many hardware blocks. Multiplier are used in various DSP application like digital filtering, digital communication. This needs parallel array multiplier to attain high speed for execution and better performance. A specific array multiplier is implemented known as Braun design. Braun multiplier is the one which is a kind of parallel multiplier. It contains different CSA count of AND gates. Braun multiplier employing Ripple Carry Adder is developed here having high speed PPA. It will reduce the delay and implemented using Tanner EDA tool

    DESIGN OF NEW HIGH SPEED MULTI OUTPUT CARRY LOOK-AHEAD ADDERS

    Get PDF
    The carry look-ahead adders are designed till now by using standard 4 bit Manchester carry chain. Due to its limited carry chain length, the carries of the adders are computed using 4 bit carry chain. This leads to slow down the operation. A high speed 8 bit (MCC) adder in multi output domino CMOS logic is designed in this thesis. Due to its limited carry chain length this high speed MCC uses 2 separate 4-bit MCC. The 2 MCC namely odd carry chain and even carry chain are computed in parallel to increase the speed of the operation. This technique has been applied for the design of 8 bit adders in multi output domino logic and the simulation results are verified. Results prove that 8 bit MCC produces less delay compared to conventional 4 bit delay. The reduced delay realizes better speed compared to the conventional designs. The existing design and the previous designs are designed and simulated using Mentor Graphics. The delay of these designs is compared with 8 bit input and with 50 nm technology file. Implementation results reveal that the high speed comparator has delay of 37.47% less compared to the conventional designs used for comparison when operated at 50 MHz

    Design of High-Speed Adders for Efficient Digital Design Blocks

    Get PDF

    Parallel-prefix structures for binary and modulo {2n - 1, 2n, 2n + 1} adders

    Get PDF
    Adders are the among the most essential arithmetic units within digital systems. Parallel-prefix structures are efficient for adders because of their regular topology and logarithmic delay. However, building parallel-prefix adders are barely discussed in literature. This work puts emphasis on how to build prefix trees and simple algorithms for building these architectures. One particular modification of adders is for use with modulo arithmetic. The most common type of modulo adders are modulo 2n -1 and modulo 2n + 1 adders because they have a common base that is a power of 2. In order to improve their speed, parallel-prefix structures can also be employed for modulo 2n +- 1 adders. This dissertation presents the formation of several binary and modulo prefix architectures and their modifications using Ling's algorithm. For all binary and modulo adders, both algorithmic and quantitative analysis are provided to compare the performance of different architectures. Furthermore, to see how process impact the design, three technologies, from deep submicron to nanometer range, are utilized to collect the quantitative data

    IEEE Compliant Double-Precision FPU and 64-bit ALU with Variable Latency Integer Divider

    Get PDF
    Together the arithmetic logic unit (ALU) and floating-point unit (FPU) perform all of the mathematical and logic operations of computer processors. Because they are used so prominently, they fall in the critical path of the central processing unit - often becoming the bottleneck, or limiting factor for performance. As such, the design of a high-speed ALU and FPU is vital to creating a processor capable of performing up to the demanding standards of today\u27s computer users. In this paper, both a 64-bit ALU and a 64-bit FPU are designed based on the reduced instruction set computer architecture. The ALU performs the four basic mathematical operations - addition, subtraction, multiplication and division - in both unsigned and two\u27s complement format, basic logic operations and shifting. The division algorithm is a novel approach, using a comparison multiples based SRT divider to create a variable latency integer divider. The floating-point unit performs the double-precision floating-point operations add, subtract, multiply and divide, in accordance with the IEEE 754 standard for number representation and rounding. The ALU and FPU were implemented in VHDL, simulated in ModelSim, and constrained and synthesized using Synopsys Design Compiler (2006.06). They were synthesized using TSMC 0.1 3nm CMOS technology. The timing, power and area synthesis results were recorded, and, where applicable, compared to those of the corresponding DesignWare components.The ALU synthesis reported an area of 122,215 gates, a power of 384 mW, and a delay of 2.89 ns - a frequency of 346 MHz. The FPU synthesis reported an area 84,440 gates, a delay of 2.82 ns and an operating frequency of 355 MHz. It has a maximum dynamic power of 153.9 mW

    Design of ALU and Cache Memory for an 8 bit ALU

    Get PDF
    The design of an ALU and a Cache memory for use in a high performance processor was examined in this thesis. Advanced architectures employing increased parallelism were analyzed to minimize the number of execution cycles needed for 8 bit integer arithmetic operations. In addition to the arithmetic unit, an optimized SRAM memory cell was designed to be used as cache memory and as fast Look Up Table. The ALU consists of stand alone units for bit parallel computation of basic integer arithmetic operations. Addition and subtraction were performed using Kogge Stone parallel prefix hardware operating at 330MHz. A high performance multiplier was built using Radix 4 Modified Booth Encoder (MBE) and a Wallace Tree summation array. The multiplier requires single clock cycle for 8 bit integer multiplication and operates at a maximum frequency of 100MHz. Multiplicative division hardware was built for executing both integer division and square root. The division hardware computes 8-bit division and square root in 4 clock cycles. Multiplier forms the basic building block of all these functional units, making high level of resource sharing feasible with this architecture. The optimal operating frequency for the arithmetic unit is 70MHz. A 6T CMOS SRAM cell measuring 90 µm2 was designed using minimum size transistors. The layout allows for horizontal overlap resulting in effective area of 76 µm2 for an 8x8 array. By substituting equivalent bit line capacitance of P4 L1 Cache, the memory was simulated to have a read time of 3.27ns. An optimized set of test vectors were identified to enable high fault coverage without the need for any additional test circuitry. Sixteen test cases were identified that would toggle all the nodes and provide all possible inputs to the sub units of the multiplier. A correlation based semi automatic method was investigated to facilitate test case identification for large multipliers. This method of testability eliminates performance and area overhead associated with conventional testability hardware. Bottom up design methodology was employed for the design. The performance and area metrics are presented along with estimated power consumption. A set of Monte Carlo analysis was carried out to ensure the dependability of the design under process variations as well as fluctuations in operating conditions. The arithmetic unit was found to require a total die area of 2mm2 (approx.) in 0.35 micron process
    corecore