Fast Fourier Transforms on Distributed Memory Parallel Machines

Abstract

One issue which is central in developing a general purpose subroutine on a distributed memory parallel machine is the data distribution. It is possible that users would like to use the subroutine with different data distributions. Thus there is a need to design algorithms on distributed memory parallel machines which can support a variety of data distributions. In this dissertation we have addressed the problem of developing such algorithms to compute the Discrete Fourier Transform (DFT) of real and complex data. The implementations given in this dissertation work for a class of data distributions commonly encountered in scientific applications, known as the block scattered data distributions. The implementations are targeted at distributed memory parallel machines. We have also addressed the problem of rearranging the data after computing the FFT. For computing the DFT of complex data, we use a standard Radix-2 FFT algorithm which has been studied extensively in parallel environment. There are two ways of computing the DFT of real data that are known to be efficient in serial environments: namely (i) the real fast Fourier transform (RFFT) algorithm, and (ii) the fast Hartley transform (FHT) algorithm. However, in distributed memory environments they have excessive communication overhead. We restructure the RFFT and FHT algorithms to reduce this overhead. The restructured RFFT and FHT algorithms are then used in the generalized implementations which work for block scattered data distributions. Experimental results are given for the restructured RFFT and the FHT algorithms on two parallel machines; NCUBE-7 which is a Hypercube MIMD machine and AMT DAP-510 which is a Mesh SIMD machine. The performances of the FFT, RFFT and FHT algorithms with block scattered data distribution were evaluated on Intel iPSC/860, a Hypercube MIMD machine

    Similar works