1,167 research outputs found

    An efficient mixed-precision, hybrid CPU-GPU implementation of a fully implicit particle-in-cell algorithm

    Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been proposed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230,18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver, capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle-orbit computations from the field solver, while remaining fully self-consistent. This paper describes a very efficient, mixed-precision hybrid CPU-GPU implementation of the implicit PIC algorithm exploiting this feature. The JFNK solver is kept on the CPU in double precision (DP), while the implicit, charge-conserving, and adaptive particle mover is implemented on a GPU (graphics processing unit) using CUDA in single precision (SP). Performance-oriented optimizations are introduced with the aid of the roofline model. The implicit particle mover algorithm is shown to achieve up to 400 GOp/s on an Nvidia GeForce GTX580. This corresponds to 25% absolute GPU efficiency against the peak theoretical performance, and is about 300 times faster than an equivalent serial CPU (Intel Xeon X5460) execution. For the test case chosen, the mixed-precision hybrid CPU-GPU solver is shown to outperform the DP CPU-only serial version by a factor of ~100, without apparent loss of robustness or accuracy in a challenging long-timescale ion acoustic wave simulation. Comment: 25 pages, 6 figures, submitted to J. Comput. Phys.
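
    As a rough illustration of the roofline reasoning mentioned in the abstract above, the sketch below bounds attainable throughput by the smaller of peak compute and bandwidth times arithmetic intensity. Only the 400 GOp/s and 25% figures come from the abstract; the function name, the bandwidth value, and the intensity values are assumptions chosen for illustration, not measurements from the paper.

```python
def roofline(peak_gops: float, bandwidth_gbs: float, intensity_ops_per_byte: float) -> float:
    """Attainable throughput in GOp/s under the basic roofline model."""
    return min(peak_gops, bandwidth_gbs * intensity_ops_per_byte)


# Figures quoted in the abstract.
achieved_gops = 400.0
efficiency = 0.25

# Peak throughput implied by those two figures.
implied_peak_gops = achieved_gops / efficiency

# ASSUMPTION: hypothetical memory bandwidth in GB/s, not taken from the abstract.
assumed_bandwidth_gbs = 200.0

# Arithmetic intensity at which the kernel stops being bandwidth-bound.
ridge_intensity = implied_peak_gops / assumed_bandwidth_gbs

if __name__ == "__main__":
    print("implied peak (GOp/s):", implied_peak_gops)
    print("ridge point (ops/byte):", ridge_intensity)
    for intensity in (1.0, 2.0, 8.0, 16.0):
        print(intensity, roofline(implied_peak_gops, assumed_bandwidth_gbs, intensity))
```

    Under these assumed numbers, a kernel below the ridge point is limited by memory traffic, which is why optimizations guided by the roofline model typically aim to raise arithmetic intensity.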

    Current-Driven Filament Instabilities in Relativistic Plasmas. Final report

    Performance of a second order electrostatic particle-in-cell algorithm on modern many-core architectures

    In this paper we present the outline of a novel electrostatic, second order Particle-in-Cell (PIC) algorithm that makes use of 'ghost particles' located around true particle positions in order to represent a charge distribution. We implement our algorithm within EMPIRE-PIC, a PIC code developed at Sandia National Laboratories. We test the performance of our algorithm on a variety of many-core architectures including NVIDIA GPUs, conventional CPUs, and Intel's Knights Landing. Our preliminary results show the viability of second order methods for PIC applications on these architectures when compared to previous generations of many-core hardware. Specifically, we see an order of magnitude improvement in performance for second order methods between the Tesla K20 and Tesla P100 GPU devices, despite only a 4× improvement in the theoretical peak performance between the devices. Although these initial results show a large increase in runtime over first order methods, we hope to be able to show improved scaling behaviour and increased simulation accuracy in the future.

    Apar-T: code, validation, and physical interpretation of particle-in-cell results

    We present the parallel particle-in-cell (PIC) code Apar-T and, more importantly, address the fundamental question of the relations between the PIC model, the Vlasov-Maxwell theory, and real plasmas. First, we present four validation tests: spectra from simulations of thermal plasmas, linear growth rates of the relativistic tearing instability and of the filamentation instability, and non-linear filamentation merging phase. For the filamentation instability we show that the effective growth rates measured on the total energy can differ by more than 50% from the linear cold predictions and from the fastest modes of the simulation. Second, we detail a new method for initial loading of Maxwell-Jüttner particle distributions with relativistic bulk velocity and relativistic temperature, and explain why the traditional method with individual particle boosting fails. Third, we scrutinize the question of what description of physical plasmas is obtained by PIC models. These models rely on two building blocks: coarse-graining, i.e., grouping of the order of p~10^10 real particles into a single computer superparticle, and field storage on a grid with its subsequent finite superparticle size. We introduce the notion of coarse-graining dependent quantities, i.e., quantities depending on p. They derive from the PIC plasma parameter Lambda^{PIC}, which we show to scale as 1/p. We explore two implications. One is that PIC collision- and fluctuation-induced thermalization times are expected to scale with the number of superparticles per grid cell, and thus to be a factor p~10^10 smaller than in real plasmas. The other is that the level of electric field fluctuations scales as 1/Lambda^{PIC} ~ p. We provide a corresponding exact expression. Fourth, we compare the Vlasov-Maxwell theory, which describes a phase-space fluid with infinite Lambda, to the PIC model and its relatively small Lambda. Comment: 24 pages, 14 figures, accepted in Astronomy & Astrophysics

    Efficient Strict-Binning Particle-in-Cell Algorithm for Multi-Core SIMD Processors

    Particle-in-Cell (PIC) codes are widely used for plasma simulations. On recent multi-core hardware, performance of these codes is often limited by memory bandwidth. We describe a multi-core PIC algorithm that achieves a close-to-minimal number of memory transfers with the main memory, while at the same time exploiting SIMD instructions for numerical computations and exhibiting a high degree of OpenMP-level parallelism. Our algorithm keeps particles sorted by cell at every time step, and represents particles from the same cell using a linked list of fixed-capacity arrays, called chunks. Chunks support either sequential or atomic insertions, the latter being used to handle fast-moving particles. To validate our code, called Pic-Vert, we consider a 3d electrostatic Landau-damping simulation as well as a 2d3v transverse instability of magnetized electron holes. Performance results on 24-core Intel Skylake hardware confirm the effectiveness of our algorithm, in particular its high throughput and its ability to cope with fast-moving particles.
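
    The chunk-based storage described in the abstract above can be sketched roughly as follows. This is a minimal, single-threaded illustration, not the Pic-Vert implementation: the names `Chunk`, `CellBag`, `step`, and `CHUNK_CAPACITY`, the 1d1v particle layout, and the periodic domain are all assumptions, and only sequential insertion is shown (the atomic insertion path for fast-moving particles is omitted).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

CHUNK_CAPACITY = 4  # ASSUMPTION: illustrative value; a real code would tune this

Particle = Tuple[float, float]  # ASSUMPTION: (position, velocity) in a 1d1v layout


@dataclass
class Chunk:
    """Fixed-capacity array of particles, linked to the next chunk of the same cell."""
    items: List[Particle] = field(default_factory=list)
    next: Optional["Chunk"] = None


class CellBag:
    """All particles of one cell, stored as a linked list of chunks."""

    def __init__(self) -> None:
        self.head = Chunk()
        self.size = 0

    def push(self, p: Particle) -> None:
        # Sequential insertion: open a new chunk when the head chunk is full.
        if len(self.head.items) == CHUNK_CAPACITY:
            self.head = Chunk(next=self.head)
        self.head.items.append(p)
        self.size += 1

    def __iter__(self) -> Iterator[Particle]:
        chunk: Optional[Chunk] = self.head
        while chunk is not None:
            yield from chunk.items
            chunk = chunk.next


def step(bags: List[CellBag], dt: float, dx: float) -> List[CellBag]:
    """Advance particles ballistically and re-bin them, so they stay sorted by cell."""
    n = len(bags)
    length = n * dx
    new_bags = [CellBag() for _ in range(n)]
    for bag in bags:
        for x, v in bag:
            x = (x + v * dt) % length  # periodic wrap (assumed boundary condition)
            cell = min(int(x // dx), n - 1)  # guard against rounding at the upper edge
            new_bags[cell].push((x, v))
    return new_bags


if __name__ == "__main__":
    bags = [CellBag() for _ in range(4)]
    for i in range(10):
        bags[0].push((0.05 * i + 0.01, 1.0))
    bags = step(bags, dt=1.0, dx=1.0)
    print([b.size for b in bags])
```

    Because every cell owns its own chunks, a pass over one cell touches contiguous fixed-size arrays, which is the property the abstract relies on for SIMD processing and low memory traffic.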

    Numerical and Analytical Methods for Laser-Plasma Acceleration Physics

    Theories and numerical modeling are fundamental tools for understanding, optimizing and designing present and future laser-plasma accelerators (LPAs). Laser evolution and plasma wave excitation in an LPA driven by a weakly relativistically intense, short-pulse laser propagating in a preformed parabolic plasma channel are studied analytically in 3D, including the effects of pulse steepening and energy depletion. At higher laser intensities, the process of electron self-injection in the nonlinear bubble wake regime is studied by means of fully self-consistent Particle-in-Cell simulations. Considering a non-evolving laser driver propagating with a prescribed velocity, the geometrical properties of the non-evolving bubble wake are studied. For a range of parameters of interest for laser plasma acceleration, the dependence of the threshold for self-injection in the non-evolving wake on laser intensity and wake velocity is characterized. Due to the nonlinear and complex nature of the physics involved, computationally challenging numerical simulations are required to model laser-plasma accelerators operating at relativistic laser intensities. The numerical and computational optimizations that, combined in the codes INF&RNO and INF&RNO/quasi-static, make it possible to accurately model multi-GeV laser wakefield acceleration stages with present supercomputing architectures are discussed. The PIC code jasmine, capable of efficiently running laser-plasma simulations on Graphics Processing Unit (GPU) clusters, is presented. GPUs deliver exceptional performance to PIC codes, but the core algorithms had to be redesigned to satisfy the constraints imposed by the intrinsic parallelism of the architecture. The simulation campaigns run with the code jasmine to model the recent LPA experiments with the INFN-FLAME and CNR-ILIL laser systems are also presented.

    Characterization of short-pulse laser-produced fast electrons by 3D hybrid particle-in-cell modeling of angularly resolved bremsstrahlung

    The interaction of an intense short-pulse laser with a solid target efficiently generates energetic (fast) electrons with energies above 1 megaelectronvolt (MeV). Characterization of such high-energy electrons is critical for numerous applications, such as the generation of secondary particle sources, the creation of warm dense matter (WDM), advanced fusion concepts, and intense x-ray radiation for probing complex high areal density objects and inertial confinement fusion (ICF) cores. However, determining laser-driven fast electron characteristics, specifically electron energy distribution, divergence angle, and laser-to-electron conversion efficiency, has been challenging, partly due to complex electron trajectories caused by the electric sheath potential, known as electron recirculation. This thesis reports on developing a novel fast electron characterization technique by modeling angularly resolved bremsstrahlung radiation with a three-dimensional (3D) hybrid particle-in-cell (PIC) code. An experiment using a 50-TW Leopard laser (15 J, 0.35 ps, 2×10^19 W/cm2) was carried out to measure bremsstrahlung radiation at two angular positions and escaped fast electrons along the laser axis for two types of targets: a 100-μm-thick Cu foil and the same Cu target with a CH backing (Cu-CH target). A 3D hybrid-PIC code, Large Scale Plasma (LSP), is extensively used in this work to simulate the electron transport within the solid target, including electron recirculation around the target, and the x-ray generation of absolute photon yields. The measurements were fitted with a series of simulations by varying all three electron parameters. Fitting results based on chi-squared analyses show good agreement for both target types when an electron slope temperature of 0.8 MeV, a divergence angle of 70 degrees, and an electron beam energy of 1.3 J are used.
    Furthermore, the effects of electron recirculation on bremsstrahlung generation and the enhancement of short-pulse laser-produced x-ray intensity in various foil thicknesses are numerically studied. These results provide insight into designing and optimizing an x-ray source target for broadband x-ray radiography of a magnetically compressed aluminum rod at the Zebra pulsed power laboratory.