Journal Staff
The fast Fourier transform (FFT) plays an important role in digital signal processing (DSP) applications, and its implementation involves a large number of computations. Many DSP designers have implemented FFT algorithms on different devices, such as the central processing unit (CPU), field-programmable gate array (FPGA), and graphics processing unit (GPU), in order to accelerate performance. We selected the GPU for our implementations of the FFT algorithm because GPU hardware has a highly parallel structure consisting of many hundreds of small parallel processing units. Such a parallel device can be programmed with CUDA (Compute Unified Device Architecture). In this thesis, we propose different implementations of the FFT algorithm on NVIDIA GPUs using CUDA. We study and analyze the different approaches and use different techniques to accelerate the FFT computations. We also discuss the results and compare the approaches and techniques. Finally, we compare our best results with the CUFFT library, a library dedicated to computing the FFT on NVIDIA GPUs.
Hardware schemes for fast Fourier transform, part 7.4A
Real-time fast Fourier transform (FFT) processing of MST radar data and cost-effective approaches to hardware FFT generation were studied. Previously devised hardware FFT configurations are described, including the estimated number of chips used and the time required to perform a 1024-point FFT. The remaining entries in the table correspond to original designs, which presuppose the availability of a microcomputer and a modestly complicated hardware peripheral. These original designs, all of which implement a radix-4 FFT with twiddle factors, are assigned model numbers for ease of reference.
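To make "radix-4 FFT with twiddle factors" concrete, here is a minimal software sketch in Python. It is not the hardware design from the report: the recursive decimation-in-time structure and the function name `fft_radix4` are assumptions chosen for illustration, and the input length must be a power of 4 (1024 = 4^5 qualifies).

```python
import cmath

def fft_radix4(x):
    """Recursive decimation-in-time radix-4 FFT (illustrative sketch).

    len(x) must be a power of 4. Returns a list of complex values.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")

    # Transform the four interleaved sub-sequences x[4m + r], r = 0..3.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n
    for k in range(q):
        # Twiddle factors W_n^(r*k) for r = 1, 2, 3.
        w1 = cmath.exp(-2j * cmath.pi * k / n)
        w2 = w1 * w1
        w3 = w2 * w1
        a = subs[0][k]
        b = w1 * subs[1][k]
        c = w2 * subs[2][k]
        d = w3 * subs[3][k]
        # Radix-4 butterfly: a 4-point DFT of (a, b, c, d).
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d
    return out
```

Calling `fft_radix4` on a 1024-sample list performs five levels of radix-4 butterflies, which is the property that makes radix-4 attractive for a 1024-point transform.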
A general purpose subroutine for fast Fourier transform on a distributed memory parallel machine
A central issue in developing a general purpose fast Fourier transform (FFT) subroutine for a distributed memory parallel machine is the data distribution. Different users may want to call the FFT routine with different data distributions, so FFT schemes for distributed memory parallel machines need to support a variety of them. An FFT implementation that works for a number of data distributions commonly encountered in scientific applications is presented. The problem of rearranging the data after computing the FFT is also addressed. The performance of the implementation is evaluated on the Intel iPSC/860, a distributed memory parallel machine.
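As an illustration of what "data distribution" means here, the sketch below lays out the global indices of an n-point input across p processors in two ways. The abstract does not say which distributions the subroutine supports; block and cyclic are assumed here only as common examples, and the function names are hypothetical.

```python
def block_layout(n, p):
    """Block distribution: processor r holds one contiguous chunk.

    Assumes n is divisible by p. Returns one index list per processor.
    """
    size = n // p
    return [list(range(r * size, (r + 1) * size)) for r in range(p)]

def cyclic_layout(n, p):
    """Cyclic distribution: indices are dealt out round-robin."""
    return [list(range(r, n, p)) for r in range(p)]

if __name__ == "__main__":
    # Print which global indices each of 2 processors owns for n = 8.
    print(block_layout(8, 2))
    print(cyclic_layout(8, 2))
```

An FFT routine that accepts either layout has to map each butterfly's operands to the processors that own them, which is why the choice of distribution affects communication.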
