
    Parallel Processing of the Fast Fourier Transform

    The first FFT algorithms were reported by Runge and König in 1924, and by Danielson and Lanczos in 1942. However, the FFT received little attention until Cooley and Tukey published their algorithm in 1965. The Cooley-Tukey algorithm is simple and widely used in many application software packages. Winograd developed his FFT in 1976, based on prime factor theory; it is typically faster than the Cooley-Tukey algorithm if the computer system has no multiplication instructions. According to the book prepared by the Digital Signal Processing Committee of the IEEE in 1979, the speed difference among these FFT algorithms is around 40%. My objective in this paper is to choose a proper algorithm, establish the appropriate programming techniques, and determine the sequence of steps required to implement an FFT both on a conventional IBM-PC and on a Vector Processor (VP) system. I will demonstrate how to vectorize an FFT so that the algorithm can be performed on a VP system. The analysis of data dependence in an algorithm is another important part of this paper. The paper includes the analysis of the Cooley-Tukey and Winograd FFT algorithms; the prime factor method is used in both. It will be seen that the Cooley-Tukey algorithm can be more easily implemented on a vector system and needs fewer memory locations. The details of the Winograd FFT algorithm can be found in the references. In addition, this paper has two Cooley-Tukey FFT programs and one DFT program written in assembly language. One of the two FFT programs has been tested and executed on a conventional IBM-PC with an Intel 8088 processor as the central processing unit and an Intel 8087 Numeric Data Processor. The 8087 is specially designed to perform real-number operations efficiently and quickly, and because of its architecture, single- or double-precision operands are easily processed. The tested program was compiled and linked with Microsoft Assembly Language version 5.0, and the required results for both the FFT and the inverse FFT were obtained.
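
    As a point of reference for the radix-2 splitting on which the Cooley-Tukey algorithm is built, here is a minimal recursive sketch in Python/NumPy. It only illustrates the recursion (even/odd decimation plus twiddle factors), not the paper's assembly-language or vectorized implementation; the function name fft_recursive and the use of NumPy are my own choices.

```python
import numpy as np

def fft_recursive(x):
    """Recursive radix-2 decimation-in-time Cooley-Tukey FFT (power-of-two lengths).

    A minimal reference sketch of the even/odd splitting, not the paper's
    assembly-language or vector-processor implementation.
    """
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    assert N % 2 == 0, "length must be a power of two"
    even = fft_recursive(x[0::2])              # DFT of the even-indexed samples
    odd = fft_recursive(x[1::2])               # DFT of the odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([even + twiddle * odd,
                           even - twiddle * odd])

x = np.random.default_rng(0).standard_normal(8) + 0j
print(np.allclose(fft_recursive(x), np.fft.fft(x)))   # True
```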

    Design and implementation of a fast Fourier transform architecture using twiddle factor based decomposition algorithm

    With the advent of signal-processing and wireless mobile platform devices, transforming data from one form to another has become unavoidable. One mathematical tool widely used to move between time-domain and frequency-domain representations of a signal is the Fourier transform, and the Fast Fourier Transform (FFT) is perhaps the fastest way to compute it. Many algorithms and architectures have been designed over the years to make FFT computation more efficient and to target many applications. The main objective of our work is to design, simulate, and implement an architecture based on the twiddle-factor-based decomposition FFT algorithm. The significant feature of the algorithm is its reduction of memory accesses, reported to be as much as 30% lower than that of conventional FFT algorithms. As a result of this memory reduction, the algorithm is expected to be more power efficient and to complete in far fewer clock cycles than other algorithms. The real focus of the design is to build an architecture that maps this efficient algorithm onto hardware while retaining as much of the algorithm's efficiency as possible. The complete design, simulation, and testing were done using Active-HDL, a VHDL design and simulation tool. The resulting architecture is found to retain the memory-saving capability of the algorithm, thus enabling power efficiency.
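
    The emphasis on reducing memory accesses centers on how twiddle factors are generated and reused. As a rough illustration only (not the paper's twiddle-factor-based decomposition or its hardware architecture), the Python sketch below shows a textbook iterative, in-place radix-2 FFT that reads all of its twiddle factors from one precomputed table, the kind of table-versus-recomputation trade-off such designs manipulate.

```python
import numpy as np

def fft_iterative(x, twiddles=None):
    """Iterative in-place radix-2 DIT FFT driven by a precomputed twiddle table.

    Illustrative only: a textbook radix-2 structure, not the paper's decomposition.
    `twiddles` holds W_N^k = exp(-2j*pi*k/N) for k = 0..N/2-1 and can be reused
    across calls, trading recomputation for table lookups.
    """
    x = np.asarray(x, dtype=complex).copy()
    N = len(x)
    assert N and (N & (N - 1)) == 0, "length must be a power of two"
    if twiddles is None:
        twiddles = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    # Bit-reversal permutation so in-place butterflies yield natural output order.
    j = 0
    for i in range(1, N):
        bit = N >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            x[i], x[j] = x[j], x[i]
    # log2(N) stages of butterflies; each stage strides through the twiddle table.
    size = 2
    while size <= N:
        half = size // 2
        step = N // size                      # twiddle-table stride for this stage
        for start in range(0, N, size):
            for k in range(half):
                w = twiddles[k * step]
                a = x[start + k]
                b = x[start + k + half] * w
                x[start + k] = a + b
                x[start + k + half] = a - b
        size *= 2
    return x

x = np.random.default_rng(0).standard_normal(16) + 0j
print(np.allclose(fft_iterative(x), np.fft.fft(x)))   # True
```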

    Faster Acquisition Technique for Software-defined GPS Receivers

    Acquisition is one of the most important and challenging tasks in designing a software-defined Global Positioning System (GPS) receiver: it identifies the visible satellites and estimates coarse values of the carrier frequency and code phase of their signals. This paper presents a new, simple, efficient and faster GPS acquisition method based on a sub-sampled fast Fourier transform (ssFFT). The proposed algorithm exploits the recently developed sparse FFT (or sparse IFFT), which computes in sub-linear time, and uses a basic property of Fourier transforms: aliasing a signal in the time domain corresponds to sub-sampling it in the frequency domain, and vice versa. The ssFFT is an FFT algorithm that computes a version of the spectrum sub-sampled by an integer factor d, and hence the computational complexity is reduced by a factor of d log d compared to conventional FFT-based algorithms for any length of the input GPS signal. Simulation results show that the proposed ssFFT-based GPS acquisition is 8.5571 times faster than conventional FFT-based acquisition. Implementing this method in an FPGA provides very fast processing of incoming GPS samples and satisfies real-time positioning requirements. Defence Science Journal, Vol. 65, No. 1, January 2015, pp. 5-11, DOI: http://dx.doi.org/10.14429/dsj.65.557
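
    The Fourier-transform property quoted above is easy to check numerically. The sketch below (NumPy, illustrative only, not the full ssFFT acquisition chain) folds a length-N signal into N/d aliased samples and confirms that its FFT equals every d-th bin of the full-length FFT.

```python
import numpy as np

# Property used by ssFFT-style acquisition: aliasing (folding) a length-N signal
# in time down to N/d samples sub-samples its DFT by the factor d.
rng = np.random.default_rng(0)
N, d = 4096, 4
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

folded = x.reshape(d, N // d).sum(axis=0)      # alias: add the d time-domain segments
spectrum_subsampled = np.fft.fft(folded)       # length-N/d FFT of the folded signal

# Every d-th bin of the full-length FFT matches the FFT of the folded signal.
print(np.allclose(spectrum_subsampled, np.fft.fft(x)[::d]))   # True
```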

    Type-IV DCT, DST, and MDCT algorithms with reduced numbers of arithmetic operations

    We present algorithms for the type-IV discrete cosine transform (DCT-IV) and discrete sine transform (DST-IV), as well as for the modified discrete cosine transform (MDCT) and its inverse, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~2N log_2 N to ~(17/9)N log_2 N for a power-of-two transform size N, and the exact count is strictly lowered for all N > 4. These results are derived by considering the DCT to be a special case of a DFT of length 8N, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split-radix algorithm). The improved algorithms for DST-IV and MDCT follow immediately from the improved count for the DCT-IV.
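
    The identification of the DCT-IV with a length-8N DFT of a symmetrically extended sequence can be demonstrated directly. The NumPy sketch below is illustrative only: the particular extension, the placement of samples at odd indices, and the scale factor of 4 are one standard working convention, not necessarily the exact construction used in the paper.

```python
import numpy as np

def dct4_direct(x):
    """Naive O(N^2) DCT-IV: C_k = sum_n x_n cos(pi*(2n+1)*(2k+1)/(4N))."""
    N = len(x)
    n = np.arange(N)
    k = n[:, None]
    return (x * np.cos(np.pi * (2 * n + 1) * (2 * k + 1) / (4 * N))).sum(axis=1)

def dct4_via_dft8n(x):
    """DCT-IV read off a length-8N DFT of a symmetrically extended sequence."""
    N = len(x)
    # Extension implied by DCT-IV boundary conditions:
    # even about n = -1/2, odd about n = N - 1/2 (antiperiodic with period 2N).
    v = np.concatenate([x, -x[::-1], -x, x[::-1]])       # length 4N
    y = np.zeros(8 * N, dtype=complex)
    y[1::2] = v                                          # samples sit at odd indices
    Y = np.fft.fft(y)
    return Y[1:2 * N:2].real / 4                         # DCT-IV appears at odd bins, scaled by 4

x = np.random.default_rng(0).standard_normal(16)
print(np.allclose(dct4_direct(x), dct4_via_dft8n(x)))    # True
```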

    Type-II/III DCT/DST algorithms with reduced number of arithmetic operations

    We present algorithms for the discrete cosine transform (DCT) and discrete sine transform (DST), of types II and III, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~2N log_2 N to ~(17/9)N log_2 N for a power-of-two transform size N. Furthermore, we show that a further N multiplications may be saved by a certain rescaling of the inputs or outputs, generalizing a well-known technique for N=8 by Arai et al. These results are derived by considering the DCT to be a special case of a DFT of length 4N, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split-radix algorithm). The improved algorithms for DCT-III, DST-II, and DST-III follow immediately from the improved count for the DCT-II.
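
    The analogous embedding for the DCT-II, as a length-4N DFT with even symmetry, can be checked the same way. Again this NumPy sketch is only a demonstration of the stated special-case relationship (using an unnormalized DCT-II and a scale factor of 2), not the paper's pruned algorithm.

```python
import numpy as np

def dct2_direct(x):
    """Naive O(N^2) unnormalized DCT-II: C_k = sum_n x_n cos(pi*(2n+1)*k/(2N))."""
    N = len(x)
    n = np.arange(N)
    k = n[:, None]
    return (x * np.cos(np.pi * (2 * n + 1) * k / (2 * N))).sum(axis=1)

def dct2_via_dft4n(x):
    """DCT-II read off a length-4N DFT of an evenly extended, odd-index-embedded signal."""
    N = len(x)
    y = np.zeros(4 * N, dtype=complex)
    y[1:2 * N:2] = x             # x_n placed at index 2n+1
    y[2 * N + 1::2] = x[::-1]    # mirrored copy at index 4N-1-2n
    Y = np.fft.fft(y)
    return Y[:N].real / 2        # DCT-II appears in the first N bins, scaled by 2

x = np.random.default_rng(0).standard_normal(8)
print(np.allclose(dct2_direct(x), dct2_via_dft4n(x)))    # True
```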

    Low Power Implementation of Non Power-of-Two FFTs on Coarse-Grain Reconfigurable Architectures

    The DRM standard for digital radio broadcast in the AM band requires integrated devices for radio receivers at very low power. A System on Chip (SoC) called DiMITRI was developed based on a dual ARM9 RISC core architecture. Analyses showed that most of the computation power is used in the Coded Orthogonal Frequency Division Multiplexing (COFDM) demodulation to compute Fast Fourier Transforms (FFT) and inverse transforms (IFFT) on complex samples. These FFTs have to be computed on non power-of-two numbers of samples, which is very uncommon in the signal processing world. The results obtained with this chip led to the objective of decreasing the power dissipated by the COFDM demodulation part using a coarse-grain reconfigurable structure as a coprocessor. This paper introduces two different coarse-grain architectures, PACT XPP technology and the Montium, developed by the University of Twente, and presents the implementation of a Fast Fourier Transform on 1920 complex samples. The implementation results on the Montium show a saving of a factor of 35 in processing time and a factor of 14 in power consumption compared to the RISC implementation, as well as a smaller area. As a conclusion, the paper presents the next steps of the development and some development issues.
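
    For a composite, non power-of-two length such as 1920 = 2^7 x 3 x 5, the generic mixed-radix Cooley-Tukey decomposition applies. The Python sketch below is a plain software illustration of that decomposition (recursing on the smallest prime factor and falling back to a naive DFT at prime lengths); it says nothing about the Montium or XPP mappings described in the paper.

```python
import numpy as np

def dft_naive(x):
    """Direct O(N^2) DFT, used only for prime-length leaves."""
    N = len(x)
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N) @ x

def smallest_factor(N):
    for p in range(2, int(N ** 0.5) + 1):
        if N % p == 0:
            return p
    return N

def fft_mixed_radix(x):
    """Generic mixed-radix Cooley-Tukey FFT for any composite length."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    N1 = smallest_factor(N)
    if N1 == N:                                   # prime length: naive DFT
        return dft_naive(x)
    N2 = N // N1
    # Inner length-N2 DFTs of the decimated subsequences x[n1::N1].
    inner = np.array([fft_mixed_radix(x[n1::N1]) for n1 in range(N1)])   # (N1, N2)
    # Twiddle factors W_N^(n1*k2).
    n1 = np.arange(N1)[:, None]
    k2 = np.arange(N2)[None, :]
    inner *= np.exp(-2j * np.pi * n1 * k2 / N)
    # Outer length-N1 DFTs across n1 for each k2.
    W1 = np.exp(-2j * np.pi * np.outer(np.arange(N1), np.arange(N1)) / N1)
    outer = W1 @ inner                            # row k1, column k2
    return outer.reshape(-1)                      # X[N2*k1 + k2], row-major

x = np.random.default_rng(1).standard_normal(1920) + 0j
print(np.allclose(fft_mixed_radix(x), np.fft.fft(x)))   # True
```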

    Generating and Searching Families of FFT Algorithms

    A fundamental question of longstanding theoretical interest is to prove the lowest exact count of real additions and multiplications required to compute a power-of-two discrete Fourier transform (DFT). For 35 years the split-radix algorithm held the record by requiring just 4n log_2 n - 6n + 8 arithmetic operations on real numbers for a size-n DFT, and was widely believed to be the best possible. Recent work by Van Buskirk et al. demonstrated improvements to the split-radix operation count by using multiplier coefficients or "twiddle factors" that are not n-th roots of unity for a size-n DFT. This paper presents a Boolean Satisfiability-based proof of the lowest operation count for certain classes of DFT algorithms. First, we present a novel way to choose new yet valid twiddle factors for the nodes in flowgraphs generated by common power-of-two fast Fourier transform (FFT) algorithms. With this new technique, we can generate a large family of FFTs realizable by a fixed flowgraph. This solution space of FFTs is cast as a Boolean Satisfiability problem, and a modern Satisfiability Modulo Theories solver is applied to search for FFTs requiring the fewest arithmetic operations. Surprisingly, we find that there are FFTs requiring fewer operations than the split radix even when all twiddle factors are n-th roots of unity. (Preprint submitted on March 28, 2011, to the Journal on Satisfiability, Boolean Modeling and Computation.)
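
    For context on the numbers involved, the split-radix count quoted above can be tabulated against the leading term of the improved counts that followed Van Buskirk's work, roughly (34/9) n log_2 n real operations. The short Python script below prints both; the lower-order terms of the improved exact count are deliberately omitted, so the second column is only an asymptotic guide.

```python
import math

def split_radix_flops(n):
    """Classic split-radix real-operation count for a size-n complex DFT (n a power of two)."""
    return 4 * n * math.log2(n) - 6 * n + 8

# Leading-order term of the improved counts discussed in the abstract:
# roughly (34/9) n log2 n real operations versus 4 n log2 n for split radix.
for n in [2 ** p for p in range(4, 11)]:
    classic = split_radix_flops(n)
    improved_leading = (34 / 9) * n * math.log2(n)
    print(f"n={n:5d}  split-radix={classic:10.0f}  (34/9)*n*log2(n) ~ {improved_leading:10.0f}")
```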

    Comparisons of the execution times and memory requirements for high-speed discrete Fourier transforms and fast Fourier transforms, for the measurement of AC power harmonics

    Conventional wisdom dictates that a Fast Fourier Transform (FFT) will be a more computationally effective method for measuring multiple harmonics than a Discrete Fourier Transform (DFT) approach. However, in this paper it is shown that carefully coded discrete transforms which distribute their computational load over many frames can be made to produce results in shorter execution times than the FFT approach, even for large numbers of harmonic measurement frequencies. This is because the execution time of the presented DFT actually rises with N rather than the classical N^2, while the execution time of the FFT rises with N log2 N.
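
    One well-known way to distribute DFT work across samples and frames, in the spirit of the approach described above (though not necessarily the paper's exact formulation), is a sliding DFT that updates only the harmonic bins of interest with a constant amount of work per incoming sample. The NumPy sketch below tracks a handful of bins this way and checks the result against a direct FFT of the final window.

```python
import numpy as np

def sliding_dft_bins(x, N, bins):
    """Track selected DFT bins of an N-sample sliding window, one update per sample.

    A standard sliding-DFT recurrence, shown as one way to spread harmonic
    measurement over successive samples; each sample costs O(len(bins))
    operations regardless of N.
    """
    x = np.asarray(x, dtype=complex)
    bins = np.asarray(bins)
    rot = np.exp(2j * np.pi * bins / N)        # per-bin rotation factors
    X = np.zeros(len(bins), dtype=complex)     # tracked DFT bins of the current window
    delay = np.zeros(N, dtype=complex)         # circular buffer of the last N samples
    out = []
    for n, sample in enumerate(x):
        oldest = delay[n % N]
        delay[n % N] = sample
        X = rot * (X + sample - oldest)        # slide the window by one sample
        out.append(X)
    return np.array(out)

# Check the recurrence against a direct FFT of the final full window.
rng = np.random.default_rng(0)
N, bins = 64, [1, 3, 5, 7]                     # e.g. fundamental plus odd harmonics
x = rng.standard_normal(4 * N)
tracked = sliding_dft_bins(x, N, bins)[-1]
reference = np.fft.fft(x[-N:])[bins]
print(np.allclose(tracked, reference))         # True
```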