422 research outputs found

    A GPU-Parallelized Interpolation-Based Fast Multipole Method for the Relativistic Space-Charge Field Calculation

    The fast multipole method (FMM) has received growing attention in beam physics simulations. In this study, we formulate an interpolation-based FMM for computing the relativistic space-charge field. Unlike the quasi-electrostatic model, our FMM is formulated in the lab frame and can be applied without a Lorentz transformation. In particular, we derive a modified admissibility condition that effectively controls the interpolation error of the proposed FMM. The algorithms and their GPU parallelization are discussed in detail. A package containing serial and GPU-parallelized solvers is implemented in the Julia programming language. The GPU-parallelized solver can reach a speedup of more than a hundredfold over execution on a single CPU core. Comment: 30 pages, 10 figures
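
    For orientation, the sketch below shows one common form of the standard FMM admissibility test, in which two axis-aligned boxes count as well separated when the larger box diameter is at most `eta` times the distance between them. This is not the modified condition derived in the paper, whose details the abstract does not give. The function names, the box representation as `(lower_corner, upper_corner)` tuples, and the default `eta` are all assumptions made for illustration.

```python
import math


def box_diameter(lo, hi):
    """Length of the main diagonal of an axis-aligned box."""
    return math.dist(lo, hi)


def box_distance(lo_a, hi_a, lo_b, hi_b):
    """Euclidean distance between two axis-aligned boxes (0 if they touch or overlap)."""
    # Per-axis gap: positive only when the boxes are disjoint along that axis.
    gaps = [
        max(0.0, la - hb, lb - ha)
        for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b)
    ]
    return math.hypot(*gaps)


def is_admissible(box_a, box_b, eta=1.0):
    """Standard (non-relativistic) admissibility test: max diameter <= eta * distance.

    Hypothetical helper; the paper's modified condition is NOT reproduced here.
    """
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    diameter = max(box_diameter(lo_a, hi_a), box_diameter(lo_b, hi_b))
    return diameter <= eta * box_distance(lo_a, hi_a, lo_b, hi_b)
```

    Box pairs that pass such a test are typically handled through the interpolated far-field approximation, while the remaining pairs are evaluated directly. A smaller `eta` demands more separation and generally reduces the interpolation error at the cost of more direct evaluations.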

    Quantum Monte Carlo for large chemical systems: Implementing efficient strategies for petascale platforms and beyond

    Various strategies for efficiently implementing QMC simulations for large chemical systems are presented. These include: (i) an efficient algorithm for calculating the computationally expensive Slater matrices, a novel scheme that exploits the highly localized character of atomic Gaussian basis functions (rather than the molecular orbitals, as is usually done); (ii) the possibility of keeping the memory footprint minimal; (iii) the significant enhancement of single-core performance when efficient optimization tools are employed; and (iv) the definition of a universal, dynamic, fault-tolerant, and load-balanced computational framework adapted to all kinds of computational platforms (massively parallel machines, clusters, or distributed grids). These strategies have been implemented in the QMC=Chem code developed at Toulouse and illustrated with numerical applications on small peptides of increasing size (158, 434, 1056, and 1731 electrons). Using 10k-80k computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC=Chem has been shown to be capable of running at the petascale level, thus demonstrating that a large part of this machine's peak performance can be achieved. Implementing large-scale QMC simulations on future exascale platforms with a comparable level of efficiency is expected to be feasible.
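
    To illustrate the idea behind strategy (i), the sketch below builds a Slater matrix with elements D[i][j] = sum over k of C[i][k] * chi_k(r_j), skipping every atomic basis function chi_k whose value at electron position r_j is negligible. It is a minimal illustration under stated assumptions, not the QMC=Chem algorithm: the basis functions are unnormalized s-type Gaussians exp(-a * |r - R|^2), and the function name, argument layout, and `threshold` parameter are hypothetical.

```python
import math


def slater_matrix(coeffs, centers, exponents, electrons, threshold=1e-12):
    """Return D with D[i][j] = sum_k coeffs[i][k] * chi_k(electrons[j]).

    chi_k is assumed to be an unnormalized s-type Gaussian centered at
    centers[k] with exponent exponents[k]. Basis functions whose value at
    an electron would fall below `threshold` are skipped entirely.
    """
    n_orbitals = len(coeffs)
    matrix = [[0.0] * len(electrons) for _ in range(n_orbitals)]
    # exp(-a * d2) < threshold  <=>  d2 > -ln(threshold) / a
    cutoffs_sq = [-math.log(threshold) / a for a in exponents]
    for j, position in enumerate(electrons):
        for k, (center, a) in enumerate(zip(centers, exponents)):
            d2 = sum((x - c) ** 2 for x, c in zip(position, center))
            if d2 > cutoffs_sq[k]:
                continue  # localized function is negligible at this electron
            chi = math.exp(-a * d2)
            for i in range(n_orbitals):
                matrix[i][j] += coeffs[i][k] * chi
    return matrix
```

    Because each Gaussian decays rapidly away from its center, most (electron, basis function) pairs in a large molecule fail the distance check, so the work per electron depends on the number of nearby basis functions and not on the total basis size.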