8 research outputs found

On the efficient parallel computation of Legendre transforms

Author: Bisseling R.H.
Inda M.A.
Maslen D.K.
Publication date: 01/01/2001

In this article, we discuss a parallel implementation of efficient algorithms for computation of Legendre polynomial transforms and other orthogonal polynomial transforms. We develop an approach to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the accuracy, efficiency, and scalability of our implementation. The algorithms were implemented in ANSI C using the BSPlib communications library. We also present a new algorithm for computing the cosine transform of two vectors at the same time.

Utrecht University Repository

Parallel Spherical Harmonic Transforms on heterogeneous architectures (GPUs/multi-core CPUs)

Author: Esterie Pierre
Falcou Joel
Grigori Laura
Stompor R.
Szydlarski Mikolaj
Publication date: 15/05/2012

Spherical Harmonic Transforms (SHT) are at the heart of many scientific and practical applications ranging from climate modelling to cosmological observations. In many of these areas new, cutting-edge science goals have recently been proposed, requiring simulations and analyses of experimental or observational data at very high resolutions and of unprecedented volumes. Both these aspects pose a formidable challenge for the currently existing implementations of the transforms. This paper describes parallel algorithms for computing SHT with two variants of intra-node parallelism appropriate for novel supercomputer architectures: multi-core processors and Graphics Processing Units (GPUs). It also discusses their performance, alone and embedded within a top-level, MPI-based parallelisation layer ported from the S2HAT library, in terms of their accuracy, overall efficiency and scalability. We show that our inverse SHT run on GeForce 400 Series GPUs equipped with the latest CUDA architecture ("Fermi") outperforms the state-of-the-art implementation for a multi-core processor executed on a current Intel Core i7-2600K. Furthermore, we show that an MPI/CUDA version of the inverse transform run on a cluster of 128 Nvidia Tesla S1070 is as much as 3 times faster than the hybrid MPI/OpenMP version executed on the same number of Intel Nehalem quad-core processors for problem sizes motivated by our target applications. Performance of the direct transforms is, however, found to be at best comparable in these cases. We discuss in detail the algorithmic solutions devised for the major steps involved in the transforms calculation, emphasising those with a major impact on their overall performance, and elucidate the sources of the dichotomy between the direct and the inverse operations.

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL - Lille 3

HAL-IN2P3

INRIA a CCSD electronic archive server

On the Efficient Parallel Computation of Legendre Transforms

Author: David K. Maslen
Márcia A. Inda
Rob H. Bisseling
Publication date: 01/01/1999

CiteSeerX

Utrecht University Repository

On the efficient parallel computation of Legendre transforms

Author: Bisseling R.H.
Inda M.A.
Maslen D.K.
Publication date: 01/01/1999

In this article, we discuss a parallel implementation of efficient algorithms for computation of Legendre polynomial transforms and other orthogonal polynomial transforms. We develop an approach to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the accuracy, efficiency, and scalability of our implementation. The algorithms were implemented in ANSI C using the BSPlib communications library. We also present a new algorithm for computing the cosine transform of two vectors at the same time.

On The Efficient Parallel Computation Of Legendre Transforms

Author: David K. Maslen
Márcia A. Inda
Rob H. Bisseling
CiteSeerX