1,496 research outputs found

    Low Latency Prefix Accumulation Driven Compound MAC Unit for Efficient FIR Filter Implementation

    Get PDF
    135–138This article presents hierarchical single compound adder-based MAC with assertion based error correction for speculation variations in the prefix addition for FIR filter design. The VLSI implementation of approximation in prefix adder results show a significant delay and complexity reductions, all this at the cost of latency measures when speculation fails during carry propagation, which is the main reason preventing the use of speculation in parallel-prefix adders in DSP applications. The speculative adder which is based on Han Carlson parallel prefix adder structure accomplishes better reduction in latency. Introducing a structured and efficient shift-add technique and explore latency reduction by incorporating approximation in addition. The improvements made in terms of reduction in latency and merits in performance by the proposed MAC unit are showed through the synthesis done by FPGA hardware. Results show that proposed method outpaces both formerly projected MAC designs using multiplication methods for attaining high speed

    DESIGN OF LOW POWER AND HIGH SPEED CARRY SELECT ADDER USING BRENT KUNG ADDER

    Get PDF
    In this paper, Carry Select Adder (CSA) architectures are proposed using parallel prefix adders. Instead of using dual Ripple Carry Adders (RCA), parallel prefix adder i.e., Brent Kung (BK) adder is used to design Regular Linear CSA. Adders are the basic building blocks in digital integrated circuit based designs. Ripple Carry Adder (RCA) gives the most compact design but takes longer computation time. The time critical applications use Carry Look-ahead scheme (CLA) to derive fast results but they lead to increase in area. Carry SelectAdder is a compromise between RCA and CLA in term of area and delay. Delayof RCA is large therefore we have replaced it with parallel prefix adder which gives fast results. In this paper, structures of 16-Bit Regular Linear Brent Kung CSA, Modified Linear BK CSA, Regular Square Root (SQRT) BK CSA andModified SQRT BK CSA are designed. Power and delayof all these adder architectures are calculated at different input voltages. The results depict that Modified SQRT BK CSA is better than all the other adder architectures in terms of power but with small speed penalty. The designs have been synthesized at45nm technology using Tanner EDA tool

    Low Latency Prefix Accumulation Driven Compound MAC Unit for Efficient FIR Filter Implementation

    Get PDF
    This article presents hierarchical single compound adder-based MAC with assertion based error correction for speculation variations in the prefix addition for FIR filter design. The VLSI implementation of approximation in prefix adder results show a significant delay and complexity reductions, all this at the cost of latency measures when speculation fails during carry propagation, which is the main reason preventing the use of speculation in parallel-prefix adders in DSP applications. The speculative adder which is based on Han Carlson parallel prefix adder structure accomplishes better reduction in latency. Introducing a structured and efficient shift-add technique and explore latency reduction by incorporating approximation in addition. The improvements made in terms of reduction in latency and merits in performance by the proposed MAC unit are showed through the synthesis done by FPGA hardware. Results show that proposed method outpaces both formerly projected MAC designs using multiplication methods for attaining high speed

    Pipelining Saturated Accumulation

    Get PDF
    Aggressive pipelining and spatial parallelism allow integrated circuits (e.g., custom VLSI, ASICs, and FPGAs) to achieve high throughput on many Digital Signal Processing applications. However, cyclic data dependencies in the computation can limit parallelism and reduce the efficiency and speed of an implementation. Saturated accumulation is an important example where such a cycle limits the throughput of signal processing applications. We show how to reformulate saturated addition as an associative operation so that we can use a parallel-prefix calculation to perform saturated accumulation at any data rate supported by the device. This allows us, for example, to design a 16-bit saturated accumulator which can operate at 280 MHz on a Xilinx Spartan-3(XC3S-5000-4) FPGA, the maximum frequency supported by the component's DCM

    Optimistic Parallelization of Floating-Point Accumulation

    Get PDF
    Floating-point arithmetic is notoriously non-associative due to the limited precision representation which demands intermediate values be rounded to fit in the available precision. The resulting cyclic dependency in floating-point accumulation inhibits parallelization of the computation, including efficient use of pipelining. In practice, however, we observe that floating-point operations are "mostly" associative. This observation can be exploited to parallelize floating-point accumulation using a form of optimistic concurrency. In this scheme, we first compute an optimistic associative approximation to the sum and then relax the computation by iteratively propagating errors until the correct sum is obtained. We map this computation to a network of 16 statically-scheduled, pipelined, double-precision floating-point adders on the Virtex-4 LX160 (-12) device where each floating-point adder runs at 296 MHz and has a pipeline depth of 10. On this 16 PE design, we demonstrate an average speedup of 6Ă— with randomly generated data and 3-7Ă— with summations extracted from Conjugate Gradient benchmarks

    Comparison of parallel prefix adder (PPA)

    Get PDF
    The parallel prefix adder is one of the fastest types of adder that had been created and developed. Two common types of parallel prefix adder are the Brent Kung and Kogge Stone adders. This project will compare and study the performances of these two adders in terms of propagation delay and design area. The comparison and study for both adders will be conducted for 8, 16 and 32 bits size. By using the Quartus II design software, the designs for both Brent Kung and Kogge Stone adders will be developed. From the designs of both adders, a vector waveform will be produced as a result of the designs’ simulations. The simulation result will also show the propagation delay for the adders. Hence, this project is significant in showing which of the two adders being tested perform better in terms of propagation delay and design area based on different sizes of bits

    Pipelined Two-Operand Modular Adders

    Get PDF
    Pipelined two-operand modular adder (TOMA) is one of basic components used in digital signal processing (DSP) systems that use the residue number system (RNS). Such modular adders are used in binary/residue and residue/binary converters, residue multipliers and scalers as well as within residue processing channels. The design of pipelined TOMAs is usually obtained by inserting an appriopriate number of latch layers inside a nonpipelined TOMA structure. Hence their area is also determined by the number of latches and the delay by the number of latch layers. In this paper we propose a new pipelined TOMA that is based on a new TOMA, that has the smaller area and smaller delay than other known structures. Comparisons are made using data from the very large scale of integration (VLSI) standard cell library
    • …
    corecore