602 research outputs found
Type-II/III DCT/DST algorithms with reduced number of arithmetic operations
We present algorithms for the discrete cosine transform (DCT) and discrete
sine transform (DST), of types II and III, that achieve a lower count of real
multiplications and additions than previously published algorithms, without
sacrificing numerical accuracy. Asymptotically, the operation count is reduced
from ~ 2N log_2 N to ~ (17/9) N log_2 N for a power-of-two transform size N.
Furthermore, we show that a further N multiplications may be saved by a certain
rescaling of the inputs or outputs, generalizing a well-known technique for N=8
by Arai et al. These results are derived by considering the DCT to be a special
case of a DFT of length 4N, with certain symmetries, and then pruning redundant
operations from a recent improved fast Fourier transform algorithm (based on a
recursive rescaling of the conjugate-pair split radix algorithm). The improved
algorithms for DCT-III, DST-II, and DST-III follow immediately from the
improved count for the DCT-II.Comment: 9 page
Fast Fourier Transform algorithm design and tradeoffs
The Fast Fourier Transform (FFT) is a mainstay of certain numerical techniques for solving fluid dynamics problems. The Connection Machine CM-2 is the target for an investigation into the design of multidimensional Single Instruction Stream/Multiple Data (SIMD) parallel FFT algorithms for high performance. Critical algorithm design issues are discussed, necessary machine performance measurements are identified and made, and the performance of the developed FFT programs are measured. Fast Fourier Transform programs are compared to the currently best Cray-2 FFT program
Non-power-of-Two FFTs: Exploring the Flexibility of the Montium TP
Coarse-grain reconfigurable architectures, like the Montium TP, have proven to be a very successful approach for low-power and high-performance computation of regular digital signal processing algorithms. This paper presents the implementation of a class of non-power-of-two FFTs to discover the limitations and Flexibility of the Montium TP for less regular algorithms. A non-power-of-two FFT is less regular compared to a traditional power-of-two FFT. The results of the implementation show the processing time, accuracy, energy consumption and Flexibility of the implementation
Efficient FPGA implementation of high-throughput mixed radix multipath delay commutator FFT processor for MIMO-OFDM
This article presents and evaluates pipelined architecture designs for an improved high-frequency Fast Fourier
Transform (FFT) processor implemented on Field Programmable Gate Arrays (FPGA) for Multiple Input Multiple Output
Orthogonal Frequency Division Multiplexing (MIMO-OFDM). The architecture presented is a Mixed-Radix Multipath Delay
Commutator. The presented parallel architecture utilizes fewer hardware resources compared to Radix-2 architecture,
while maintaining simple control and butterfly structures inherent to Radix-2 implementations. The high-frequency
design presented allows enhancing system throughput without requiring additional parallel data paths common in
other current approaches, the presented design can process two and four independent data streams in parallel
and is suitable for scaling to any power of two FFT size N. FPGA implementation of the architecture demonstrated
significant resource efficiency and high-throughput in comparison to relevant current approaches within
literature. The proposed architecture designs were realized with Xilinx System Generator (XSG) and evaluated
on both Virtex-5 and Virtex-7 FPGA devices. Post place and route results demonstrated maximum frequency
values over 400 MHz and 470 MHz for Virtex-5 and Virtex-7 FPGA devices respectively
Generating and Searching Families of FFT Algorithms
A fundamental question of longstanding theoretical interest is to prove the
lowest exact count of real additions and multiplications required to compute a
power-of-two discrete Fourier transform (DFT). For 35 years the split-radix
algorithm held the record by requiring just 4n log n - 6n + 8 arithmetic
operations on real numbers for a size-n DFT, and was widely believed to be the
best possible. Recent work by Van Buskirk et al. demonstrated improvements to
the split-radix operation count by using multiplier coefficients or "twiddle
factors" that are not n-th roots of unity for a size-n DFT. This paper presents
a Boolean Satisfiability-based proof of the lowest operation count for certain
classes of DFT algorithms. First, we present a novel way to choose new yet
valid twiddle factors for the nodes in flowgraphs generated by common
power-of-two fast Fourier transform algorithms, FFTs. With this new technique,
we can generate a large family of FFTs realizable by a fixed flowgraph. This
solution space of FFTs is cast as a Boolean Satisfiability problem, and a
modern Satisfiability Modulo Theory solver is applied to search for FFTs
requiring the fewest arithmetic operations. Surprisingly, we find that there
are FFTs requiring fewer operations than the split-radix even when all twiddle
factors are n-th roots of unity.Comment: Preprint submitted on March 28, 2011, to the Journal on
Satisfiability, Boolean Modeling and Computatio
A class of AM-QFT algorithms for power-of-two FFT
This paper proposes a class of power-of-two FFT (Fast Fourier Transform)
algorithms, called AM-QFT algorithms, that contains the improved QFT (Quick
Fourier Transform), an algorithm recently published, as a special case. The
main idea is to apply the Amplitude Modulation Double Sideband - Suppressed
Carrier (AM DSB-SC) to convert odd-indices signals into even-indices signals,
and to insert this elaboration into the improved QFT algorithm, substituting
the multiplication by secant function. The 8 variants of this class are
obtained by re-elaboration of the AM DSB-SC idea, and by means of duality. As a
result the 8 variants have both the same computational cost and the same memory
requirements than improved QFT. Differently, comparing this class of 8 variants
of AM-QFT algorithm with the split-radix 3add/3mul (one of the most performing
FFT approach appeared in the literature), we obtain the same number of
additions and multiplications, but employing half of the trigonometric
constants. This makes the proposed FFT algorithms interesting and useful for
fixed-point implementations. Some of these variants show advantages versus the
improved QFT. In fact one of this variant slightly enhances the numerical
accuracy of improved QFT, while other four variants use trigonometric constants
that are faster to compute in `on the fly' implementations
An efficient multiplierless approximation of the fast Fourier transform using sum-of-powers-of-two (SOPOT) coefficients
This letter proposes a new multiplierless approximation of the discrete Fourier transform (DFT) called the multiplierless fast Fourier transform-like (ML-FFT) transformation. It makes use of a novel factorization to parameterize the twiddle factors in the conventional radix-2 n or split-radix FFT algorithms as certain rotation-like matrices and approximates the associated parameters using the sum-of-powers-of-two (SOPOT) or canonical signed digits (CSD) representations. The ML-FFT converges to the DFT when the number of SOPOT terms used increases and has an arithmetic complexity of O(N log 2 N) additions, where N = 2 m is the transform length. Design results show that the ML-FFT offers flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT.published_or_final_versio
- …