Learning from the Success of MPI
The Message Passing Interface (MPI) has been extremely successful as a
portable way to program high-performance parallel computers. This success has
occurred in spite of the view of many that message passing is difficult and
that other approaches, including automatic parallelization and directive-based
parallelism, are easier to use. This paper argues that MPI has succeeded
because it addresses all of the important issues in providing a parallel
programming model.
Comment: 12 pages, 1 figure
CoreTSAR: Task Scheduling for Accelerator-aware Runtimes
Heterogeneous supercomputers that incorporate computational accelerators
such as GPUs are increasingly popular due to their high
peak performance, energy efficiency and comparatively low cost.
Unfortunately, the programming models and frameworks designed
to extract performance from all computational units still lack the
flexibility of their CPU-only counterparts. Accelerated OpenMP
improves this situation by supporting natural migration of OpenMP
code from CPUs to a GPU. However, these implementations currently
lose one of OpenMP’s best features, its flexibility: typical
OpenMP applications can run on any number of CPUs. GPU implementations
do not transparently employ multiple GPUs on a node
or a mix of GPUs and CPUs. To address these shortcomings, we
present CoreTSAR, our runtime library for dynamically scheduling
tasks across heterogeneous resources, and propose straightforward
extensions that incorporate this functionality into Accelerated
OpenMP. We show that our approach can provide nearly linear
speedup to four GPUs over only using CPUs or one GPU while
increasing the overall flexibility of Accelerated OpenMP.
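The abstract does not say how CoreTSAR divides work, so the following is only a generic sketch of one way a runtime could split the iterations of a parallel loop across devices in proportion to their measured throughput. The function name `split_iterations` and the throughput figures in the usage note are hypothetical; none of this is part of CoreTSAR or Accelerated OpenMP.

```python
from fractions import Fraction


def split_iterations(total, throughputs):
    """Divide `total` loop iterations among devices in proportion to throughput.

    `throughputs` holds one positive number per device (for example,
    iterations per second measured on an earlier pass). Returns a list of
    integer iteration counts that sums exactly to `total`.
    Hypothetical helper, not CoreTSAR's actual algorithm.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("need at least one device, all with positive throughput")

    # Exact rational arithmetic avoids float rounding when taking the floor.
    rates = [Fraction(t) for t in throughputs]
    rate_sum = sum(rates)
    exact = [total * r / rate_sum for r in rates]

    # Give every device the floor of its ideal share ...
    counts = [int(e) for e in exact]  # int() truncates; shares are non-negative
    leftover = total - sum(counts)

    # ... then hand the remaining iterations to the largest fractional parts.
    by_fraction = sorted(
        range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts
```

For example, `split_iterations(1000, [1.0, 4.0, 4.0])` would give a CPU pool rated at 1 unit roughly a ninth of the loop and each of two accelerators rated at 4 units roughly four ninths. A dynamic scheduler could re-measure throughput after each pass and call the function again.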
Libpsht - algorithms for efficient spherical harmonic transforms
Libpsht (or "library for Performant Spherical Harmonic Transforms") is a
collection of algorithms for efficient conversion between spatial-domain and
spectral-domain representations of data defined on the sphere. The package
supports transforms of scalars as well as spin-1 and spin-2 quantities, and can
be used for a wide range of pixelisations (including HEALPix, GLESP and ECP).
It will take advantage of hardware features like multiple processor cores and
floating-point vector operations, if available. Even without this additional
acceleration, the employed algorithms are among the most efficient (in terms of
CPU time as well as memory consumption) currently being used in the
astronomical community.
The library is written in strictly standard-conforming C90, ensuring
portability to many different hard- and software platforms, and allowing
straightforward integration with codes written in various programming languages
like C, C++, Fortran, Python etc.
Libpsht is distributed under the terms of the GNU General Public License
(GPL) version 2 and can be downloaded from
http://sourceforge.net/projects/libpsht.
Comment: 9 pages, 8 figures, accepted by A&
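To illustrate what conversion between spatial-domain and spectral-domain representations means, here is a deliberately simplified sketch restricted to zonal (m = 0) scalar functions, expanded in unnormalised Legendre polynomials of x = cos(theta). It uses a plain midpoint rule for the integral. It is not libpsht's algorithm or API, and the function names are hypothetical.

```python
def legendre_values(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] via the three-term recurrence."""
    values = [1.0]
    if lmax >= 1:
        values.append(x)
    for n in range(1, lmax):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values


def synthesize(coeffs, x):
    """Spectral -> spatial: f(x) = sum_l a_l P_l(x), where x = cos(theta)."""
    p = legendre_values(len(coeffs) - 1, x)
    return sum(a * pl for a, pl in zip(coeffs, p))


def analyze(f, lmax, nodes=2000):
    """Spatial -> spectral: a_l = (2l + 1)/2 * integral of f(x) P_l(x) over [-1, 1].

    The integral is approximated with a midpoint rule on `nodes` equal
    intervals, so the result is approximate.
    """
    h = 2.0 / nodes
    sums = [0.0] * (lmax + 1)
    for k in range(nodes):
        x = -1.0 + (k + 0.5) * h
        fx = f(x)
        for l, pl in enumerate(legendre_values(lmax, x)):
            sums[l] += fx * pl * h
    return [(2 * l + 1) / 2.0 * s for l, s in enumerate(sums)]
```

Calling `analyze(lambda x: synthesize(coeffs, x), lmax)` should recover `coeffs` up to quadrature error. A full transform also handles m ≠ 0, spin-weighted quantities and specific pixelisations, all of which this sketch omits.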
Libsharp - spherical harmonic transforms revisited
We present libsharp, a code library for spherical harmonic transforms (SHTs),
which evolved from the libpsht library. It addresses several of its predecessor's
shortcomings by adding MPI support for distributed-memory systems and SHTs of
fields with arbitrary spin, and it supports new developments in CPU instruction
sets such as the Advanced Vector Extensions (AVX) and fused multiply-accumulate
(FMA) instructions. The library is implemented in portable C99 and provides an
interface that can be easily accessed from other programming languages such as
C++, Fortran, Python etc. Generally, libsharp's performance is at least on par
with that of its predecessor; however, significant improvements were made to
the algorithms for scalar SHTs, which are roughly twice as fast when using the
same CPU capabilities. The library is available at
http://sourceforge.net/projects/libsharp/ under the terms of the GNU General
Public License