10 research outputs found

    First principles gyrokinetic analysis of electromagnetic plasma instabilities

    A two-fold analysis of electromagnetic core tokamak instabilities in the framework of the gyrokinetic theory is presented. First-principles theoretical foundations of the gyrokinetic theory are used to explain and justify the numerical results obtained with the global electromagnetic particle-in-cell code Orb5, whose model is derived from the Lagrangian formalism. The energy conservation law corresponding to the Orb5 model is derived from the Noether theorem and implemented in the code as a diagnostic for energy balance and conservation verification. An additional Noether-theorem-based diagnostic is implemented in order to analyse destabilising mechanisms for the electrostatic and the electromagnetic Ion Temperature Gradient (ITG) instabilities in the core region of the tokamak. The transition towards the Kinetic Ballooning Modes (KBM) at high electromagnetic β is also investigated. Comment: 22 pages, 10 figures, material from the ICPP conference 2018, invite

    Hybrid OpenMP/MPI parallelization of the charge deposition step in the global gyrokinetic Particle-In-Cell code ORB5

    Gyrokinetic simulations are computationally extremely demanding due to the high dimensionality of the physical phase space and the interplay between plasma particles and electromagnetic fields. It is thus essential to make full use of the available numerical resources to be able to simulate more complex physical problems. With the aim of optimizing the gyrokinetic Particle-In-Cell code ORB5 towards exascale computing, a particle sorting method is implemented to increase data locality. Furthermore, different algorithms are used to improve vectorization, and the MPI parallelization is complemented with OpenMP. More specifically, we focus on the particle-to-grid operations involved in the PIC charge deposition step. This step is delicate to parallelize using a shared memory paradigm because of the scatter operations involved. We present the different algorithms and parallelization schemes implemented in the ORB5 charge deposition step and how they affect the speedup compared to the base MPI case.
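
    The scatter operation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a 1D periodic grid and linear (cloud-in-cell) weights, and every name in it (`deposit_charge`, `sort_by_cell`, and their parameters) is hypothetical; it is not the ORB5 implementation, which works on a finite element representation in tokamak geometry. Several particles may write to the same grid point, so the accumulation has to handle repeated indices, which is the same hazard that makes the shared-memory version delicate.

```python
import numpy as np


def deposit_charge(x, q, n_cells, length):
    """Deposit particle charges on a periodic 1D grid with linear weights.

    Illustrative only: a 1D cloud-in-cell scheme, not ORB5's deposition.
    """
    dx = length / n_cells
    s = (x % length) / dx                 # particle position in cell units
    base = np.floor(s)
    left = base.astype(int) % n_cells     # grid point to the left
    right = (left + 1) % n_cells          # grid point to the right (periodic)
    frac = s - base                       # fractional distance from left point

    rho = np.zeros(n_cells)
    # np.add.at accumulates correctly when several particles hit the same
    # index; a plain `rho[left] += ...` would silently drop contributions.
    np.add.at(rho, left, q * (1.0 - frac))
    np.add.at(rho, right, q * frac)
    return rho


def sort_by_cell(x, q, n_cells, length):
    """Reorder particles by cell index so neighbours in memory share cells."""
    cell = np.floor((x % length) / (length / n_cells)).astype(int) % n_cells
    order = np.argsort(cell, kind="stable")
    return x[order], q[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, 1000)
    q = np.ones(1000)
    rho = deposit_charge(x, q, n_cells=16, length=1.0)
    xs, qs = sort_by_cell(x, q, n_cells=16, length=1.0)
    rho_sorted = deposit_charge(xs, qs, n_cells=16, length=1.0)
    print(rho.sum(), np.allclose(rho, rho_sorted))
```

    Sorting does not change the deposited charge; it only makes consecutive particles touch nearby grid points, which is the data-locality benefit the abstract refers to.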

    Porting a Legacy Global Lagrangian PIC Code on Many-Core and GPU-Accelerated Architectures

    Modern supercomputer architectures are evolving towards embedding more and more cores per compute node, often making use of accelerators such as GPUs, in which thousands of threads can be executed concurrently. To make legacy codes profit efficiently from such resources usually requires a major refactoring effort. I will present the strategy that we adopted for the production code ORB5, a global gyrokinetic Particle-In-Cell (PIC) code for studying turbulence in tokamak plasmas, developed by many physicists over a period of 20 years, which clearly exceeds the timescale of HPC architecture evolution. Among others, the code now includes multiple kinetic species, electromagnetic effects, and collisions. The present refactoring work includes the restructuring of the main kernels, changing the data structure, multithreading with OpenMP on CPUs or OpenACC on GPUs, and optimization on different architectures. The modularity of the resulting code makes it more "future-proof", i.e. extensible to new physics features or computing architectures, and easier to maintain and develop in a collaborative fashion

    Using an antenna as a tool for studying microturbulence and zonal structures in tokamaks with a global gyrokinetic GPU-enabled particle-in-cell code

    In order to get a better understanding of the interaction of plasma microinstabilities and associated turbulence with specific modes, an antenna is implemented in the global gyrokinetic Particle-In-Cell (PIC) code ORB5. It consists of applying an external perturbation to the plasma to excite various types of modes and study their coupling with the rest of the system. This antenna can, for example, be used to apply shear flows on individual microinstabilities or on fully developed turbulence. In other studies, the antenna can also be used to directly excite specific linear eigenmodes. The contributions of the antenna and plasma perturbed fields are considered separately and, optionally, the plasma response can be linearized by neglecting the perturbed plasma field contribution on particle orbits. As a first proof of principle, stationary ExB shear flows are applied to Ion Temperature Gradient (ITG) and Trapped Electron Mode (TEM) driven instabilities. Their well-known linear stabilizing effect is successfully recovered. In cases where linear eigenmodes have a finite ballooning angle, the possibility of shear flows with a destabilizing effect is shown. Time-dependent shear flows are then applied to quantify their loss of effectiveness with frequency. When going to the nonlinear regime, applied shear flows prove to be unable to mitigate the heat flux level due to opposite zonal flows self-generated by the plasma. A reverse interaction, which is the generation of zonal structures by microinstabilities, is also investigated by exciting non-zonal modes and studying their decay into zonal modes. Future applications include the excitation of Alfvén eigenmodes with an electromagnetic antenna, for which the formalism is derived in this thesis. With a view to conducting the aforementioned studies, the ORB5 code had to be completely refactored.
    Indeed, the implementation choices of this legacy code were no longer suited to making the most of cutting-edge supercomputer performance. Data structures have been re-designed to ensure efficient memory access, enhancing data locality. The MPI parallelization scheme has been complemented by OpenMP multithreading to benefit from the shared memory of many- and multi-core devices. As more and more High Performance Computing (HPC) facilities provide GPU-equipped systems, ORB5 has also been ported to GPUs using OpenACC directives. As a result, the same source code can be run efficiently on different HPC architectures. Performance studies are performed on the Summit machine at ORNL (U.S.A.), Piz Daint at CSCS (Switzerland), and Marconi at CINECA (Italy). ORB5 performance is shown to scale up to the full machine size, which represents more than 24000 GPUs on Summit. The use of GPUs on Piz Daint brings about a factor 4 speed-up with respect to the best CPU-only performance, which is itself about 2 times faster than the pre-refactored version of the code. An alternative to the standard PIC approach is also proposed: the Particle-In-Fourier (PIF) method. It is shown to be particularly attractive for reducing the cost of simulations involving a low number of Fourier modes, such as linear studies or basic nonlinear mode coupling processes. However, the PIC method remains more efficient when the full nonlinear spectrum is involved.
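
    The Particle-In-Fourier idea mentioned at the end of this abstract can be sketched as follows. This is a generic 1D illustration under stated assumptions, not the formulation derived in the thesis: the function name `pif_deposit`, the periodic domain of length `length`, and the 1/length normalization are all hypothetical choices. Each particle contributes directly to a chosen set of Fourier coefficients instead of being deposited on a grid.

```python
import numpy as np


def pif_deposit(x, q, modes, length):
    """Project particle charges directly onto selected Fourier modes.

    Illustrative only: rho_m = (1/L) * sum_p q_p * exp(-i k_m x_p),
    with k_m = 2*pi*m/L on a periodic 1D domain of length L.
    """
    k = 2.0 * np.pi * np.asarray(modes, dtype=float) / length
    phase = np.exp(-1j * np.outer(k, x))   # shape (n_modes, n_particles)
    return phase @ q / length


if __name__ == "__main__":
    # Eight equally spaced unit charges: only the mean (m = 0) survives
    # among the low modes.
    x = np.arange(8) / 8.0
    q = np.ones(8)
    print(np.round(pif_deposit(x, q, modes=[0, 1, 2], length=1.0), 12))
```

    The work in this sketch scales with the number of particles times the number of retained modes, which is consistent with the abstract's remark that the approach pays off when only a few Fourier modes are needed.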

    An optimisation of allreduce communication in message-passing systems

    Collective communication, specifically the allreduce pattern in message-passing systems, is optimised based on measurements taken at the installation time of the library. The algorithms used are set up in an initialisation phase of the communication, as so-called persistent collective communication, introduced in the message-passing interface (MPI) standard. The patterns reduce_scatter and allgatherv, which form part of our allreduce algorithms, are also considered standalone. For allreduce with short messages, the existing cyclic shift algorithm (Bruck’s algorithm) is applied with a prefix operation. For allreduce with long messages, our algorithm is based on reduce_scatter and allgatherv, where the cyclic shift algorithm is applied with a flexible number of communication ports per node. The algorithms for equal message sizes are used with non-equal message sizes together with a heuristic for rank reordering. Medium message sizes are communicated with an incomplete reduce_scatter followed by allgatherv. Furthermore, an optional recursive application of the cyclic shift algorithm is applied. All algorithms are applied at the node level. The data is gathered and scattered by the cores within the node and the communication algorithms are applied across the nodes. In general, our approach outperforms the non-persistent counterpart in established MPI libraries by up to one order of magnitude or shows equal performance, with a few exceptions for certain numbers of nodes and message sizes. ISSN: 0167-8191, 1872-733
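
    The decomposition of allreduce into reduce_scatter followed by allgatherv can be sketched without any MPI library by simulating the ranks as a Python list. This is only a data-flow illustration under that assumption: the function names are hypothetical, the communication schedule (cyclic shift, multiple ports, rank reordering) described in the abstract is not modelled, and the reduction is fixed to a sum.

```python
import numpy as np


def reduce_scatter(buffers):
    """Each simulated rank r ends up with the sum of chunk r over all ranks."""
    p = len(buffers)
    # array_split allows unequal chunk sizes, as in the "v" variants.
    chunks = [np.array_split(b, p) for b in buffers]
    return [sum(chunks[src][r] for src in range(p)) for r in range(p)]


def allgatherv(pieces):
    """Every simulated rank receives the concatenation of all pieces."""
    full = np.concatenate(pieces)
    return [full.copy() for _ in pieces]


def allreduce(buffers):
    """Sum-allreduce expressed as reduce_scatter followed by allgatherv."""
    return allgatherv(reduce_scatter(buffers))


if __name__ == "__main__":
    # Four simulated ranks, each holding a vector of length 10.
    buffers = [np.arange(10.0) * (rank + 1) for rank in range(4)]
    results = allreduce(buffers)
    expected = sum(buffers)
    print(all(np.allclose(r, expected) for r in results))
```

    After reduce_scatter each rank owns only its reduced slice, so the reduction work is divided across ranks before allgatherv redistributes the full result.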

    Gyrokinetic simulations on many- and multi-core architectures with the global electromagnetic Particle-In-Cell Code ORB5

    Gyrokinetic codes in plasma physics need outstanding computational resources to solve increasingly complex problems, requiring the effective exploitation of cutting-edge HPC architectures. This paper focuses on enabling ORB5, a state-of-the-art, first-principles-based gyrokinetic code, on modern parallel hybrid multi-core, multi-GPU systems. ORB5 is a Lagrangian, Particle-In-Cell (PIC), finite element, global, electromagnetic code, originally implementing distributed MPI-based parallelism through domain decomposition and domain cloning. In order to support multi- and many-core devices, the code has been completely refactored. Data structures have been re-designed to ensure efficient memory access, enhancing data locality. Multi-threading has been introduced through OpenMP on the CPU, and OpenACC has been adopted to support GPU acceleration. MPI is further used in combination with the two approaches. The performance results obtained using the full production ORB5 code on the Summit system at ORNL, on Piz Daint at CSCS and on the Marconi system at CINECA are presented, showing the effectiveness and performance portability of the adopted solutions: the same source code version was used to produce all results on all architectures.

    First principles gyrokinetic analysis of electromagnetic plasma instabilities

    A two-fold analysis of electromagnetic core tokamak instabilities in the framework of the gyrokinetic theory is presented. First-principles theoretical foundations of the gyrokinetic theory are used to explain and justify the numerical results obtained with the global electromagnetic particle-in-cell code ORB5, whose model is derived from the Lagrangian formalism. The energy conservation law corresponding to the ORB5 model is derived from the Noether theorem and implemented in the code as a diagnostic for energy balance and conservation verification. An additional Noether-theorem-based diagnostic is implemented in order to analyse destabilising mechanisms for the electrostatic and the electromagnetic ion temperature gradient instabilities in the core region of the tokamak. The transition towards the Kinetic Ballooning Modes at high electromagnetic β is also investigated.

    Performance in the IGT.

    A. Mean proportion of safe choices for five consecutive blocks of 20 choices each. Results are shown separately for each priming condition. B. Mean proportion of deck choices (A, B, C, D) for five consecutive blocks of 20 choices each. The proportions are averaged over the three priming conditions. The figure shows that deck A was avoided (irrespective of priming condition; see the results section, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0152297#sec009). In both cases, vertical bars denote one standard error of the mean.

    Screenshot during the online BART.

    On the balloon, the number of button presses (without the balloon exploding) during the ongoing trial was indicated. The size of the balloon increased with an increasing number of points (i.e. number of mouse clicks on the balloon). Below (“collecter 6 points”), the participant could see the number of points s/he would win if stopping the trial now (clicking on “collecter 6 points”). On the right, the participant could see the column rising depending on how many points s/he collected across trials. To promote competitiveness, participants saw the three best scores of “previous” players on the same column (1ier, 2ième and 3ième). Unknown to participants, these top scores had been set by us and were the same for each participant.