A GPU-Parallelized Interpolation-Based Fast Multipole Method for the Relativistic Space-Charge Field Calculation
The fast multipole method (FMM) has received growing attention in beam
physics simulation. In this study, we formulate an interpolation-based FMM for
the computation of the relativistic space-charge field. Unlike the
quasi-electrostatic model, our FMM is formulated in the lab frame and can be
applied without the assistance of the Lorentz transformation. In particular, we
derive a modified admissibility condition which can effectively control the
interpolation error of the proposed FMM. The algorithms and their GPU
parallelization are discussed in detail. A package containing serial and
GPU-parallelized solvers is implemented in the Julia programming language. The
GPU-parallelized solver can reach a speedup of more than a factor of one hundred
compared to execution on a single CPU core.
Comment: 30 pages, 10 figures
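To illustrate what "interpolation-based" means here, the following is a minimal sketch of a far-field approximation built from Chebyshev interpolation, together with a textbook admissibility test. It is not the paper's method: it is one-dimensional, it uses an illustrative kernel 1/|x - y| instead of the relativistic lab-frame kernel, and the admissibility test is the standard geometric one, not the modified condition derived in the paper. All function names and the parameters p and eta are assumptions made for this example.

```python
import numpy as np

def chebyshev_nodes(a, b, p):
    """Return p Chebyshev nodes of the first kind mapped to [a, b]."""
    k = np.arange(p)
    x = np.cos((2 * k + 1) * np.pi / (2 * p))
    return 0.5 * (a + b) + 0.5 * (b - a) * x

def lagrange_matrix(nodes, x):
    """L[i, m] = m-th Lagrange basis polynomial on `nodes`, evaluated at x[i]."""
    L = np.ones((x.size, nodes.size))
    for m in range(nodes.size):
        for j in range(nodes.size):
            if j != m:
                L[:, m] *= (x - nodes[j]) / (nodes[m] - nodes[j])
    return L

def kernel(x, y):
    """Illustrative kernel 1/|x - y| (an assumption, not the paper's kernel)."""
    return 1.0 / np.abs(x[:, None] - y[None, :])

def is_admissible(box_a, box_b, eta=1.0):
    """Textbook test: largest box diameter <= eta * gap between the boxes."""
    diam = max(box_a[1] - box_a[0], box_b[1] - box_b[0])
    dist = max(box_b[0] - box_a[1], box_a[0] - box_b[1], 0.0)
    return diam <= eta * dist

def far_field(targets, sources, charges, tbox, sbox, p=8):
    """Approximate sum_j K(x_i, y_j) q_j by interpolating K on p x p nodes."""
    tn = chebyshev_nodes(tbox[0], tbox[1], p)
    sn = chebyshev_nodes(sbox[0], sbox[1], p)
    w = lagrange_matrix(sn, sources).T @ charges   # equivalent node charges
    node_field = kernel(tn, sn) @ w                # node-to-node interaction
    return lagrange_matrix(tn, targets) @ node_field  # interpolate to targets

rng = np.random.default_rng(0)
tbox, sbox = (0.0, 1.0), (3.0, 4.0)
targets = rng.uniform(tbox[0], tbox[1], 200)
sources = rng.uniform(sbox[0], sbox[1], 300)
charges = rng.uniform(-1.0, 1.0, 300)

exact = kernel(targets, sources) @ charges
approx = far_field(targets, sources, charges, tbox, sbox)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

The direct sum costs O(N M) kernel evaluations, while the interpolated version needs only p x p kernel evaluations per admissible box pair, which is what makes the scheme attractive for GPU batching. The error depends on how well separated the boxes are, which is why an admissibility condition is needed to control it.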
Quantum Monte Carlo for large chemical systems: Implementing efficient strategies for petascale platforms and beyond
Various strategies for efficiently implementing QMC simulations for large
chemical systems are presented. These include: i.) the introduction of an
efficient algorithm to calculate the computationally expensive Slater matrices.
This novel scheme is based on the use of the highly localized character of
atomic Gaussian basis functions (not the molecular orbitals as usually done),
ii.) the possibility of keeping the memory footprint minimal, iii.) the
important enhancement of single-core performance when efficient optimization
tools are employed, and iv.) the definition of a universal, dynamic,
fault-tolerant, and load-balanced computational framework adapted to all kinds
of computational platforms (massively parallel machines, clusters, or
distributed grids). These strategies have been implemented in the QMC=Chem code
developed at Toulouse and illustrated with numerical applications on small
peptides of increasing sizes (158, 434, 1056 and 1731 electrons). Using 10k-80k
computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC=Chem has been
shown to be capable of running at the petascale level, thus demonstrating that
for this machine a large part of the peak performance can be achieved.
Implementation of large-scale QMC simulations for future exascale platforms
with a comparable level of efficiency is expected to be feasible.
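Strategy (i) above rests on the fact that an atomic Gaussian basis function is negligible far from its center, so only a few basis functions contribute at any given electron position. The sketch below shows this idea under strong simplifying assumptions: unnormalized s-type Gaussians exp(-alpha |r - R|^2), a random coefficient matrix C, and a hypothetical screening threshold eps. It is not the algorithm implemented in QMC=Chem, and all names are invented for the example.

```python
import numpy as np

def ao_values_screened(r, centers, alphas, eps=1e-8):
    """Evaluate only the Gaussians whose value at r exceeds eps."""
    d2 = np.sum((centers - r) ** 2, axis=1)
    # exp(-alpha * d2) > eps  is equivalent to  alpha * d2 < -ln(eps)
    active = np.flatnonzero(alphas * d2 < -np.log(eps))
    return active, np.exp(-alphas[active] * d2[active])

def slater_matrix(electrons, centers, alphas, C, eps=1e-8):
    """S[i, j] = value of molecular orbital j at electron i, using screening.

    C has shape (n_mo, n_ao): row j holds the expansion of orbital j
    over the atomic basis functions.
    """
    S = np.empty((electrons.shape[0], C.shape[0]))
    n_active = 0
    for i, r in enumerate(electrons):
        active, chi = ao_values_screened(r, centers, alphas, eps)
        S[i] = C[:, active] @ chi      # contract only the active columns
        n_active += active.size
    return S, n_active / electrons.shape[0]

rng = np.random.default_rng(1)
n_ao, n_el = 40, 10
# Hypothetical linear chain of basis-function centers along x
centers = np.column_stack([np.linspace(0.0, 50.0, n_ao),
                           np.zeros(n_ao), np.zeros(n_ao)])
alphas = rng.uniform(0.5, 2.0, n_ao)
C = rng.uniform(-1.0, 1.0, (n_el, n_ao))
electrons = centers[rng.choice(n_ao, n_el)] + rng.normal(0.0, 0.5, (n_el, 3))

S_screened, mean_active = slater_matrix(electrons, centers, alphas, C)

# Dense reference: evaluate every basis function at every electron
d2_all = np.sum((electrons[:, None, :] - centers[None, :, :]) ** 2, axis=2)
S_dense = np.exp(-alphas[None, :] * d2_all) @ C.T
max_diff = np.max(np.abs(S_screened - S_dense))
```

In this toy setup each electron touches only a fraction of the basis functions, and the screened matrix differs from the dense one by at most about n_ao * eps, since every dropped term is below eps and the coefficients are bounded by one in magnitude.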
- …