Fast Fourier Transforms for Finite Inverse Semigroups

Author: Malandro Martin
Publication venue: 'Elsevier BV'
Publication date: 27/11/2009

We extend the theory of fast Fourier transforms on finite groups to finite inverse semigroups. We use a general method for constructing the irreducible representations of a finite inverse semigroup to reduce the problem of computing its Fourier transform to the problems of computing Fourier transforms on its maximal subgroups and a fast zeta transform on its poset structure. We then exhibit explicit fast algorithms for particular inverse semigroups of interest--specifically, for the rook monoid and its wreath products by arbitrary finite groups. Comment: ver 3: Added improved upper and lower bounds for the memory required by the fast zeta transform on the rook monoid. ver 2: Corrected typos and (naive) bounds on memory requirements. 30 pages, 0 figures.

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Fast Fourier Transforms for the Rook Monoid

Author: Malandro Martin
Rockmore Daniel N.
Publication venue: 'American Mathematical Society (AMS)'
Publication date: 26/09/2007

We define the notion of the Fourier transform for the rook monoid (also called the symmetric inverse semigroup) and provide two efficient divide-and-conquer algorithms (fast Fourier transforms, or FFTs) for computing it. This paper marks the first extension of group FFTs to non-group semigroups.

arXiv.org e-Print Archive

Crossref

Fast Fourier Transforms: A Review

Author: Wolberg George
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1988

The purpose of this paper is to provide a detailed review of the Fast Fourier Transform. Some familiarity with the basic concepts of the Fourier Transform is assumed. The review begins with a definition of the discrete Fourier Transform (DFT) in section 1. Directly evaluating the DFT is demonstrated there to be an O(N^2) process. The efficient approach for evaluating the DFT is through the use of FFT algorithms. Their existence became generally known in the mid-1960s, stemming from the work of J. W. Cooley and J. W. Tukey. Although they pioneered new FFT algorithms, the original work was actually discovered over 20 years earlier by Danielson and Lanczos. Their formulation, known as the Danielson-Lanczos Lemma, is derived in section 2. Their recursive solution is shown to reduce the computational complexity to O(N log2 N). A modification of that method, the Cooley-Tukey algorithm, is given in section 3. Yet another variation, the Cooley-Sande algorithm, is described in section 4. These last two techniques are also known in the literature as the decimation-in-time and decimation-in-frequency algorithms, respectively. Finally, source code, written in C, is provided in the appendix.

Columbia University Academic Commons
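
The contrast between direct O(N^2) evaluation and the recursive even/odd splitting described in the abstract above can be illustrated with a minimal Python sketch. It is not the C code from the paper's appendix. The function names dft_naive and fft_recursive are hypothetical, and the forward sign convention exp(-2*pi*i*j*k/N) is an assumption.

```python
import cmath


def dft_naive(x):
    """Direct DFT: N outputs, each a sum over N inputs, hence O(N^2) work."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_recursive(x):
    """Radix-2 recursion: split into even- and odd-indexed halves.

    Assumes len(x) is a power of two. Each level does O(N) work and there
    are log2(N) levels, giving O(N log2 N) overall.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_recursive(x[0::2])
    odd = fft_recursive(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half before recombining.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Both functions should agree to rounding error on any power-of-two-length input; the recursive version simply reaches the result with fewer operations.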

Wafer-Scale Fast Fourier Transforms

Author: Chetlur Sharan
Jacquelin Mathias
Orenes-Vera Marcelo
Schreiber Robert
Sharapov Ilya
Vandermersch Philippe
Publication date: 29/09/2022

We have implemented fast Fourier transforms for one, two, and three-dimensional arrays on the Cerebras CS-2, a system whose memory and processing elements reside on a single silicon wafer. The wafer-scale engine (WSE) encompasses a two-dimensional mesh of roughly 850,000 processing elements (PEs) with fast local memory and equally fast nearest-neighbor interconnections. Our wafer-scale FFT (wsFFT) parallelizes an n^3 problem with up to n^2 PEs. At this point a PE processes only a single vector of the 3D domain (known as a pencil) per superstep, where each of the three supersteps performs FFT along one of the three axes of the input array. Between supersteps, wsFFT redistributes (transposes) the data to bring all elements of each one-dimensional pencil being transformed into the memory of a single PE. Each redistribution causes an all-to-all communication along one of the mesh dimensions. Given the level of parallelism, the size of the messages transmitted between pairs of PEs can be as small as a single word. In theory, a mesh is not ideal for all-to-all communication due to its limited bisection bandwidth. However, the mesh interconnecting PEs on the WSE lies entirely on-wafer and achieves nearly peak bandwidth even with tiny messages. This high efficiency on fine-grain communication allows wsFFT to achieve unprecedented levels of parallelism and performance. We analyse in detail computation and communication time, as well as the weak and strong scaling, using both FP16 and FP32 precision. With 32-bit arithmetic on the CS-2, we achieve 959 microseconds for 3D FFT of a 512^3 complex input array using a 512x512 subgrid of the on-wafer PEs. This is the largest ever parallelization for this problem size and the first implementation that breaks the millisecond barrier.

arXiv.org e-Print Archive
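
The three-superstep pencil structure described in the abstract above can be sketched serially in Python. This is only an illustration of the decomposition, not the wsFFT implementation: there is no parallelism or mesh communication here, gathering a pencil into a list merely stands in for the redistribution step, and the names dft and fft3d_by_pencils are hypothetical. A direct O(n^2) DFT is used per pencil to keep the sketch short.

```python
import cmath


def dft(v):
    """1D DFT of a single pencil (direct evaluation, for brevity)."""
    n = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft3d_by_pencils(a):
    """3D DFT of nested lists a[i][j][k] via three passes of 1D transforms."""
    n0, n1, n2 = len(a), len(a[0]), len(a[0][0])
    out = [[[complex(a[i][j][k]) for k in range(n2)]
            for j in range(n1)] for i in range(n0)]
    # Superstep 1: transform every pencil along the last axis.
    for i in range(n0):
        for j in range(n1):
            out[i][j] = dft(out[i][j])
    # Superstep 2: gather pencils along the middle axis, transform, scatter.
    for i in range(n0):
        for k in range(n2):
            pencil = dft([out[i][j][k] for j in range(n1)])
            for j in range(n1):
                out[i][j][k] = pencil[j]
    # Superstep 3: same for the first axis.
    for j in range(n1):
        for k in range(n2):
            pencil = dft([out[i][j][k] for i in range(n0)])
            for i in range(n0):
                out[i][j][k] = pencil[i]
    return out
```

In each superstep the pencils are independent of one another, which is what makes it possible in principle to assign one pencil to each of n^2 processing elements.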

Fast Quantum Fourier Transforms for a Class of Non-abelian Groups

Author: Beth Thomas
Pueschel Markus
Roetteler Martin
Publication date: 22/07/1998

An algorithm is presented allowing the construction of fast Fourier transforms for any solvable group on a classical computer. The special structure of the recursion formula at the core of this algorithm makes it a good starting point for systematically obtaining fast Fourier transforms for solvable groups on a quantum computer. The inherent structure of the Hilbert space imposed by the qubit architecture suggests considering groups of order 2^n first (where n is the number of qubits). As an example, fast quantum Fourier transforms for all 4 classes of non-abelian 2-groups with cyclic normal subgroup of index 2 are explicitly constructed in terms of quantum circuits. The (quantum) complexity of the Fourier transform for these groups of size 2^n is O(n^2) in all cases. Comment: 16 pages, LaTeX2

arXiv.org e-Print Archive

CiteSeerX

Some applications of fast Fourier transforms

Author: Kizilkaya Mustafa
Publication venue: Lehigh Preserve

Lehigh University: Lehigh Preserve

Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor

Author: Swarztrauber Paul N.
Tong Charles

Design alternatives for ordered Fast Fourier Transform (FFT) algorithms were examined on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication, which is known to dominate the overall computing time. To this end, the ordering and computational phases of the FFT were combined, and sequence-to-processor maps that reduce communication were used. The class of ordered transforms is expanded to include any FFT in which the order of the transform is the same as that of the input sequence. Two such orderings are examined, namely standard-order and A-order, which can be implemented with equal ease on the Connection Machine, where orderings are determined by geometries and priorities. If the sequence has N = 2^r elements and the hypercube has P = 2^d processors, then a standard-order FFT can be implemented with d + r/2 + 1 parallel transmissions. An A-order sequence can be transformed with 2d - r/2 parallel transmissions, which is r - d + 1 fewer than the standard order. A parallel method for computing the trigonometric coefficients is presented that does not use trigonometric functions or interprocessor communication. A performance of 0.9 GFLOPS was obtained for an A-order transform on the Connection Machine.

NASA Technical Reports Server
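
The transmission counts quoted in the abstract above can be checked with a few lines of Python. The function names are hypothetical, and the sample values r = 20 and d = 16 are illustrative assumptions, not figures from the report.

```python
def standard_order_transmissions(r, d):
    """Parallel transmissions for a standard-order FFT: d + r/2 + 1."""
    return d + r / 2 + 1


def a_order_transmissions(r, d):
    """Parallel transmissions for an A-order FFT: 2d - r/2."""
    return 2 * d - r / 2


def savings(r, d):
    """Difference between the two counts; algebraically equal to r - d + 1."""
    return standard_order_transmissions(r, d) - a_order_transmissions(r, d)
```

With the assumed values N = 2^20 elements and P = 2^16 processors, the formulas give 27 transmissions for standard order and 22 for A-order, a saving of 5, which matches r - d + 1.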

Fast computation of magnetostatic fields by Non-uniform Fast Fourier Transforms

Author: Braess D.
Evaggelos Kritsikis
Helga Szambolics
Jean-Christophe Toussaint
Liliana Buda-Prejbeanu
Olivier Fruchart
Publication venue: 'AIP Publishing'
Publication date: 15/07/2008

The bottleneck of micromagnetic simulations is the computation of the long-ranged magnetostatic fields. This can be tackled on regular N-node grids with Fast Fourier Transforms in time N log N, whereas the geometrically more versatile finite element methods (FEM) are bound to N^(4/3) in the best case. We report the implementation of a Non-uniform Fast Fourier Transform algorithm which brings N log N scaling to FEM, with no loss of accuracy in the results.

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

HAL-CEA