8 research outputs found

    FFT for the APE Parallel Computer

    We present a parallel FFT algorithm for SIMD systems following the 'Transpose Algorithm' approach. The method is based on the assignment of the data field onto a 1-dimensional ring of systolic cells. The systolic array can be universally mapped onto any parallel system. In particular, for systems with next-neighbour connectivity, our method has the potential to improve the efficiency of matrix transposition by use of hyper-systolic communication. We have realized a scalable parallel FFT on the APE100/Quadrics massively parallel computer, where our implementation is part of a 2-dimensional hydrodynamics code for turbulence studies. A possible generalization to 4-dimensional FFT is presented, having in mind QCD applications.
    Comment: 17 pages, 13 figures, figures include
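
    The 'Transpose Algorithm' approach named in the abstract can be illustrated with a minimal serial sketch. This is not the APE100/Quadrics implementation: it assumes a small 2-dimensional field held in memory, uses a naive O(n^2) DFT in place of a real FFT kernel, and replaces the ring communication with an in-memory transpose. All function names (`dft`, `transpose`, `dft2_transpose`, `dft2_direct`) are hypothetical.

```python
import cmath

def dft(row):
    """Naive 1-D DFT, a stand-in for a real FFT kernel (assumption)."""
    n = len(row)
    return [sum(row[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def transpose(matrix):
    """Swap rows and columns; on a parallel machine this is the communication step."""
    return [list(col) for col in zip(*matrix)]

def dft2_transpose(field):
    """2-D transform built from row transforms and two transposes."""
    stage = [dft(row) for row in field]   # transform along rows (local work)
    stage = transpose(stage)              # columns become rows
    stage = [dft(row) for row in stage]   # transform along the original columns
    return transpose(stage)               # restore the original layout

def dft2_direct(field):
    """Direct 2-D DFT from the definition, used only as a reference."""
    rows, cols = len(field), len(field[0])
    return [[sum(field[r][c] * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                 for r in range(rows) for c in range(cols))
             for v in range(cols)]
            for u in range(rows)]

# Small deterministic test field (4 x 4, complex-valued).
field = [[complex(4 * r + c, r - c) for c in range(4)] for r in range(4)]
result = dft2_transpose(field)
reference = dft2_direct(field)
```

    The point of the sketch is that every transform acts on data local to one row, so the transposes carry all of the communication. That is the step the abstract proposes to speed up with hyper-systolic communication.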

    Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures

    Hyper-systolic matrix multiplication

    A novel parallel algorithm for matrix multiplication is presented. It is based on a 1-D hyper-systolic processor abstraction. The procedure can be implemented on all types of parallel systems
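
    As background for the 1-D processor abstraction mentioned above, here is a minimal sketch of ordinary systolic matrix multiplication on a ring, simulated serially. It is not the hyper-systolic algorithm of the paper, which is not described in this listing. The sketch assumes one row of A and one row of B per cell, and the function name `ring_matmul` is hypothetical.

```python
def ring_matmul(a, b):
    """Multiply a (p x p) by b (p x m) on a simulated ring of p cells.

    Cell i keeps row i of `a` fixed and accumulates row i of the product,
    while the rows of `b` circulate around the ring one hop per step.
    """
    p = len(a)
    width = len(b[0])
    held = [list(row) for row in b]   # cell i starts with row i of b
    owner = list(range(p))            # which row of b each cell currently holds
    c = [[0] * width for _ in range(p)]
    for _ in range(p):
        for i in range(p):
            k = owner[i]
            # Local update: c[i] += a[i][k] * (row k of b)
            for j in range(width):
                c[i][j] += a[i][k] * held[i][j]
        # Ring shift: cell i receives the row held by cell i + 1
        held = held[1:] + held[:1]
        owner = owner[1:] + owner[:1]
    return c

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[1, 0], [0, 1], [1, 1]]
product = ring_matmul(a, b)
```

    After p shifts every cell has seen every row of B exactly once, so each cell holds one complete row of the product.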