Search CORE

4 research outputs found

Pipelining Saturated Accumulation

Author: Chan Stephanie
DeHon André
Kapre Nachiket
Papadantonakis Karl
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/04/2008
Field of study

Aggressive pipelining and spatial parallelism allow integrated circuits (e.g., custom VLSI, ASICs, and FPGAs) to achieve high throughput on many Digital Signal Processing applications. However, cyclic data dependencies in the computation can limit parallelism and reduce the efficiency and speed of an implementation. Saturated accumulation is an important example where such a cycle limits the throughput of signal processing applications. We show how to reformulate saturated addition as an associative operation so that we can use a parallel-prefix calculation to perform saturated accumulation at any data rate supported by the device. This allows us, for example, to design a 16-bit saturated accumulator which can operate at 280 MHz on a Xilinx Spartan-3(XC3S-5000-4) FPGA, the maximum frequency supported by the component's DCM

CiteSeerX

Caltech Authors

EXPERIMENTAL STUDIES ON MULTI-OPERAND ADDERS

Author: Saha P.
Sonowal M.
Thabah S. D.
Publication venue: 'Exeley, Inc.'
Publication date
Field of study

Exeley Inc.

Pipelining saturated accumulation

Author: Andre DeHon
Karl Papadantonakis
Nachiket Kapre
Stephanie Chan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Design alternatives for parallel saturating multioperand adders

Author: Jie Ruan
Michael J. Schulte
Pablo I. Balzola
Publication venue
Publication date
Field of study

Parallel saturating multioperand adders significantly improve the performance of GSM speech coders by giving compilers and assembly language programmers the ability to parallelize loops containing saturating dot products, while maintaining GSM compliant results. This paper presents four designs for parallel saturating multioperand adders. These designs have at most one carry-propagate adder on their critical delay path, yet produce the same results that would be obtained if the additions were performed serially with saturation after each addition. The four parallel designs offer tradeoffs in terms of area, worst case delay, and dot product latency. Compared to a 5-input serial design, the 5-input parallel designs have delays up to 3.51 times shorter. 1

CiteSeerX