96 research outputs found

    Tuning and optimization for a variety of many-core architectures without changing a single line of implementation code using the Alpaka library

    We present an analysis of optimizing the performance of a single C++11 source code using the Alpaka hardware abstraction library. For this we use the general matrix multiplication (GEMM) algorithm in order to show that compilers can optimize Alpaka code effectively when tuning key parameters of the algorithm. We do not intend to rival existing, highly optimized DGEMM versions, but merely choose this example to prove that Alpaka allows for platform-specific tuning with a single source code. In addition we analyze the optimization potential available with vendor-specific compilers when confronted with the heavily templated abstractions of Alpaka. We specifically test the code for bleeding edge architectures such as Nvidia's Tesla P100, Intel's Knights Landing (KNL) and Haswell architecture as well as IBM's Power8 system. On some of these we are able to reach almost 50% of the peak floating point operation performance using the aforementioned means. When adding compiler-specific #pragmas we are able to reach 5 TFLOP/s on a P100 and over 1 TFLOP/s on a KNL system.
    Comment: Accepted paper for the P^3MA workshop at the ISC 2017 in Frankfurt
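To illustrate what "tuning key parameters of the algorithm" can mean for GEMM, here is a minimal sketch of a blocked (tiled) matrix multiplication in plain Python. It does not use Alpaka or reproduce the paper's kernel; the function name `tiled_gemm` and the `tile` parameter are hypothetical. The point is only that the tile size changes the loop structure, and hence cache behaviour on real hardware, without changing the result.

```python
def tiled_gemm(a, b, tile=2):
    """Return C = A * B for list-of-lists matrices, computed tile by tile.

    `tile` is the tunable block edge length (a hypothetical tuning knob);
    any positive value yields the same product.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles of C and of the shared dimension.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Inner loops do the multiply-accumulate inside one tile.
                for i in range(i0, min(i0 + tile, n)):
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + tile, m)):
                            c[i][j] += a_ip * b[p][j]
    return c


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
results = [tiled_gemm(A, B, tile=t) for t in (1, 2, 3)]
```

In a real many-core setting the best tile size would presumably differ per architecture, which is the kind of parameter the abstract describes tuning while keeping one source.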

    Spectral Control via Multi-Species Effects in PW-Class Laser-Ion Acceleration

    Laser-ion acceleration with ultra-short pulse, PW-class lasers is dominated by non-thermal, intra-pulse plasma dynamics. The presence of multiple ion species or multiple charge states in targets leads to characteristic modulations and even mono-energetic features, depending on the choice of target material. As spectral signatures of generated ion beams are frequently used to characterize underlying acceleration mechanisms, thermal, multi-fluid descriptions require a revision for predictive capabilities and control in next-generation particle beam sources. We present an analytical model with explicit inter-species interactions, supported by extensive ab initio simulations. This enables us to derive important ensemble properties from the spectral distribution resulting from those multi-species effects for arbitrary mixtures. We further propose a potential experimental implementation with a novel cryogenic target, delivering jets with variable mixtures of hydrogen and deuterium. Free from contaminants and without strong influence of hardly controllable processes such as ionization dynamics, this would allow a systematic realization of our predictions for the multi-species effect.
    Comment: 4 pages plus appendix, 11 figures, paper submitted to a journal of the American Physical Society

    Quantitatively consistent computation of coherent and incoherent radiation in particle-in-cell codes - a general form factor formalism for macro-particles

    Quantitative predictions from synthetic radiation diagnostics often have to consider all accelerated particles. For particle-in-cell (PIC) codes, this not only means including all macro-particles but also taking into account the discrete electron distribution associated with them. This paper presents a general form factor formalism that allows determining the radiation from this discrete electron distribution in order to compute the coherent and incoherent radiation self-consistently. Furthermore, we discuss a memory-efficient implementation that allows PIC simulations with billions of macro-particles. The impact on the radiation spectra is demonstrated on a large-scale LWFA simulation.
    Comment: Proceedings of the EAAC 2017. This manuscript version is made available under the CC-BY-NC-ND 4.0 license
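As a rough illustration of how a form factor interpolates between incoherent and coherent emission, the sketch below uses the textbook expression for N electrons radiating together, I = I_1 [N + N(N-1)|F(ω)|²]. This is an assumption chosen for illustration and not necessarily the formalism derived in the paper; the Gaussian temporal profile and both function names are likewise hypothetical.

```python
import math


def gaussian_form_factor_sq(omega, sigma_t):
    """|F(omega)|^2 for an assumed Gaussian profile of RMS duration sigma_t."""
    return math.exp(-((omega * sigma_t) ** 2))


def macro_particle_intensity(single_electron_intensity, weight, form_factor_sq):
    """Intensity radiated by `weight` electrons sharing one trajectory.

    The first term (proportional to N) is the incoherent part, the second
    (proportional to N(N-1)|F|^2) the coherent part.
    """
    n = weight
    return single_electron_intensity * (n + n * (n - 1) * form_factor_sq)


# Long wavelengths (|F|^2 close to 1) scale as N^2, short ones as N.
fully_coherent = macro_particle_intensity(1.0, 1000, 1.0)
fully_incoherent = macro_particle_intensity(1.0, 1000, 0.0)
```

With |F|² = 1 the result scales as N², and with |F|² = 0 it scales as N, which are the two limits a macro-particle form factor has to connect.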

    On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective

    We implement and benchmark parallel I/O methods for the fully-manycore driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction. Consequently, we propose, implement and verify multi-threaded data-transformations for the I/O library ADIOS as a feasible way to trade underutilized host-side compute potential on heterogeneous systems for reduced I/O latency.
    Comment: 15 pages, 5 figures, accepted for DRBSD-1 in conjunction with ISC'1

    A Laser-Plasma Ion Beam Booster Based on Hollow-Channel Magnetic Vortex Acceleration

    Laser-driven ion acceleration can provide ultra-short, high-charge, low-emittance beams. Although undergoing extensive research, demonstrated maximum energies for laser-ion sources are non-relativistic, complicating injection into high-β accelerator elements and stopping short of desirable energies for pivotal applications, such as proton tumor therapy. In this work, we decouple the efforts towards relativistic beam energies from a single laser-plasma source via a proof-of-principle concept, boosting the beam into this regime through only a few plasma stages. We employ full 3D particle-in-cell simulations to demonstrate the capability for capture of high-charge beams as produced by laser-driven sources, where both source and booster stages utilize readily available laser pulse parameters.
    Comment: 4 pages, 4 figures, submitted for peer review

    Particle-in-Cell Simulations of Relativistic Magnetic Reconnection with Advanced Maxwell Solver Algorithms

    Relativistic magnetic reconnection is a non-ideal plasma process that is a source of non-thermal particle acceleration in many high-energy astrophysical systems. Particle-in-cell (PIC) methods are commonly used for simulating reconnection from first principles. While much progress has been made in understanding the physics of reconnection, especially in 2D, the adoption of advanced algorithms and numerical techniques for efficiently modeling such systems has been limited. With the GPU-accelerated PIC code WarpX, we explore the accuracy and potential performance benefits of two advanced Maxwell solver algorithms: a non-standard finite difference scheme (CKC) and an ultrahigh-order pseudo-spectral method (PSATD). We find that for the relativistic reconnection problem, CKC and PSATD qualitatively and quantitatively match the standard Yee-grid finite-difference method. CKC and PSATD both admit a time step that is 40% longer than Yee, resulting in a ~40% faster time to solution for CKC, but no performance benefit for PSATD when using a current deposition scheme that satisfies Gauss's law. Relaxing this constraint maintains accuracy and yields a 30% speedup. Unlike Yee and CKC, PSATD is numerically stable at any time step, allowing for a larger time step than with the finite-difference methods. We found that increasing the time step 2.4-3 times over the standard Yee step still yields accurate results, but only translates to modest performance improvements over CKC due to the current deposition scheme used with PSATD. Further optimization of this scheme will likely improve the effective performance of PSATD.
    Comment: 19 pages, 10 figures. Submitted to Ap
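One plausible reading of the "40% longer" time step can be sketched with the standard Courant (CFL) limit of the Yee scheme, c·Δt ≤ 1/sqrt(Σ 1/Δx_i²). The sketch below assumes a 2D grid with square cells and assumes that the CKC scheme is stable up to c·Δt = Δx on such a grid. Neither assumption is stated in the abstract, and the function name `yee_max_dt` is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def yee_max_dt(*cell_sizes):
    """Largest stable Yee time step for the given cell sizes in metres."""
    return 1.0 / (C * math.sqrt(sum(1.0 / d**2 for d in cell_sizes)))


dx = 1.0e-6                      # illustrative cell size, not from the paper
dt_yee_2d = yee_max_dt(dx, dx)   # equals dx / (c * sqrt(2))
dt_ckc_assumed = dx / C          # ASSUMPTION: CKC limit for square cells
ratio = dt_ckc_assumed / dt_yee_2d
```

Under these assumptions the ratio is sqrt(2) ≈ 1.41, which would be consistent with a roughly 40% longer step than Yee.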

    Efficient laser-driven proton acceleration from cylindrical and planar cryogenic hydrogen jets.

    We report on recent experimental results deploying a continuous cryogenic hydrogen jet as a debris-free, renewable laser-driven source of pure proton beams generated at the 150 TW ultrashort pulse laser Draco. Efficient proton acceleration reaching cut-off energies of up to 20 MeV with particle numbers exceeding 10^9 particles per MeV per steradian is demonstrated, showing for the first time that the acceleration performance is comparable to solid foil targets with thicknesses in the micrometer range. Two different target geometries are presented and their proton beam deliverance characterized: cylindrical (∅ 5 μm) and planar (20 μm × 2 μm). In both cases typical Target Normal Sheath Acceleration emission patterns with exponential proton energy spectra are detected. Significantly higher proton numbers in laser-forward direction are observed when deploying the planar jet as compared to the cylindrical jet case. This is confirmed by two-dimensional Particle-in-Cell (2D3V PIC) simulations, which demonstrate that the planar jet proves favorable as its geometry leads to more optimized acceleration conditions.

    Exascale and ML Models for Accelerator Simulations

    Computational modeling is essential to the exploration and design of advanced particle accelerators. The modeling of laser-plasma acceleration and interaction can achieve predictive quality for experiments if adequate resolution, full geometry and physical effects are included. Here, we report on the significant evolution in fully relativistic full-3D modeling of conventional and advanced accelerators in the WarpX and ImpactX codes with the introduction of Exascale supercomputing and AI/ML models. We will cover the first PIC simulations on an Exascale machine, the need for and evolution of open standards, and based on our fully open community codes, the connection of time and space scales from plasma to conventional beamlines with data-driven machine-learning models.