96 research outputs found
Tuning and optimization for a variety of many-core architectures without changing a single line of implementation code using the Alpaka library
We present an analysis on optimizing performance of a single C++11 source
code using the Alpaka hardware abstraction library. For this we use the general
matrix multiplication (GEMM) algorithm in order to show that compilers can
optimize Alpaka code effectively when tuning key parameters of the algorithm.
We do not intend to rival existing, highly optimized DGEMM versions, but merely
choose this example to prove that Alpaka allows for platform-specific tuning
with a single source code. In addition we analyze the optimization potential
available with vendor-specific compilers when confronted with the heavily
templated abstractions of Alpaka. We specifically test the code for bleeding
edge architectures such as Nvidia's Tesla P100, Intel's Knights Landing (KNL)
and Haswell architecture as well as IBM's Power8 system. On some of these we
are able to reach almost 50% of the peak floating point operation performance
using the aforementioned means. When adding compiler-specific #pragmas we are
able to reach 5 TFLOP/s on a P100 and over 1 TFLOP/s on a KNL system.
Comment: Accepted paper for the P^3MA workshop at the ISC 2017 in Frankfurt
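As a rough illustration of what "tuning key parameters" of a GEMM can mean, the sketch below exposes the tile size as a compile-time template parameter, so the same source can be instantiated with different values per platform. It is plain C++11 and does not use the Alpaka API; the name `gemm_tiled` and the parameter `TileSize` are hypothetical and not taken from the paper.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Tiled GEMM sketch: C += A * B for row-major n x n matrices.
// TileSize is a hypothetical compile-time tuning parameter; the loop nest
// itself stays unchanged when it is varied per architecture.
template <std::size_t TileSize>
void gemm_tiled(std::size_t n,
                const std::vector<double>& a,
                const std::vector<double>& b,
                std::vector<double>& c)
{
    for (std::size_t i0 = 0; i0 < n; i0 += TileSize) {
        for (std::size_t k0 = 0; k0 < n; k0 += TileSize) {
            for (std::size_t j0 = 0; j0 < n; j0 += TileSize) {
                // Clamp tile bounds so n need not be a multiple of TileSize.
                const std::size_t iEnd = std::min(i0 + TileSize, n);
                const std::size_t kEnd = std::min(k0 + TileSize, n);
                const std::size_t jEnd = std::min(j0 + TileSize, n);
                for (std::size_t i = i0; i < iEnd; ++i) {
                    for (std::size_t k = k0; k < kEnd; ++k) {
                        const double aik = a[i * n + k];
                        for (std::size_t j = j0; j < jEnd; ++j) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Calling `gemm_tiled<16>(n, a, b, c)` on one platform and `gemm_tiled<64>(n, a, b, c)` on another changes only the template argument, which is the kind of single-source tuning the abstract describes.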
Spectral Control via Multi-Species Effects in PW-Class Laser-Ion Acceleration
Laser-ion acceleration with ultra-short pulse, PW-class lasers is dominated
by non-thermal, intra-pulse plasma dynamics. The presence of multiple ion
species or multiple charge states in targets leads to characteristic
modulations and even mono-energetic features, depending on the choice of target
material. As spectral signatures of generated ion beams are frequently used to
characterize underlying acceleration mechanisms, thermal, multi-fluid
descriptions require a revision for predictive capabilities and control in
next-generation particle beam sources. We present an analytical model with
explicit inter-species interactions, supported by extensive ab initio
simulations. This enables us to derive important ensemble properties from the
spectral distribution resulting from those multi-species effects for arbitrary
mixtures. We further propose a potential experimental implementation with a
novel cryogenic target, delivering jets with variable mixtures of hydrogen and
deuterium. Free from contaminants and without strong influence of hardly
controllable processes such as ionization dynamics, this would allow a
systematic realization of our predictions for the multi-species effect.
Comment: 4 pages plus appendix, 11 figures, paper submitted to a journal of
the American Physical Society
Quantitatively consistent computation of coherent and incoherent radiation in particle-in-cell codes - a general form factor formalism for macro-particles
Quantitative predictions from synthetic radiation diagnostics often have to
consider all accelerated particles. For particle-in-cell (PIC) codes, this not
only means including all macro-particles but also taking into account the
discrete electron distribution associated with them. This paper presents a
general form factor formalism that allows determining the radiation from this
discrete electron distribution in order to compute the coherent and incoherent
radiation self-consistently. Furthermore, we discuss a memory-efficient
implementation that allows PIC simulations with billions of macro-particles.
The impact on the radiation spectra is demonstrated on a large scale LWFA
simulation.
Comment: Proceedings of the EAAC 2017. This manuscript version is made
available under the CC-BY-NC-ND 4.0 license
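For orientation, the textbook form factor relation for a bunch of electrons is sketched below. It assumes a macro-particle representing N electrons with a normalized spatial distribution whose Fourier transform is F; the symbols are illustrative and need not match the paper's notation or its generalized formalism.

```latex
\frac{\mathrm{d}^2 I}{\mathrm{d}\omega\,\mathrm{d}\Omega}
  = \Bigl[\, \underbrace{N}_{\text{incoherent}}
  + \underbrace{N(N-1)\,\lvert F(\vec{k}) \rvert^2}_{\text{coherent}} \,\Bigr]
  \left.\frac{\mathrm{d}^2 I}{\mathrm{d}\omega\,\mathrm{d}\Omega}\right|_{\text{single electron}}
```

For wavelengths much longer than the extent of the distribution, |F| approaches 1 and the emission scales as N^2; for much shorter wavelengths, |F| approaches 0 and it scales as N.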
On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective
We implement and benchmark parallel I/O methods for the fully-manycore driven
particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as
a major challenge for applications on today's and future HPC systems, we
present a scaling law characterizing performance bottlenecks in
state-of-the-art approaches for data reduction. Consequently, we propose,
implement and verify multi-threaded data-transformations for the I/O library
ADIOS as a feasible way to trade underutilized host-side compute potential on
heterogeneous systems for reduced I/O latency.
Comment: 15 pages, 5 figures, accepted for DRBSD-1 in conjunction with ISC'1
A Laser-Plasma Ion Beam Booster Based on Hollow-Channel Magnetic Vortex Acceleration
Laser-driven ion acceleration can provide ultra-short, high-charge,
low-emittance beams. Although undergoing extensive research, demonstrated
maximum energies for laser-ion sources are non-relativistic, complicating
injection into high-β accelerator elements and stopping short of
desirable energies for pivotal applications, such as proton tumor therapy. In
this work, we decouple the efforts towards relativistic beam energies from a
single laser-plasma source via a proof-of-principle concept, boosting the beam
into this regime through only a few plasma stages. We employ full 3D
particle-in-cell simulations to demonstrate the capability for capture of
high-charge beams as produced by laser-driven sources, where both source and
booster stages utilize readily available laser pulse parameters.
Comment: 4 pages, 4 figures, submitted for peer review
Deliverable D4.4 Simulated coherent scattering data from plasma and non–plasma samples
Deliverable D4.4 of work package 4 (SIMEX) in EUCALL
Particle-in-Cell Simulations of Relativistic Magnetic Reconnection with Advanced Maxwell Solver Algorithms
Relativistic magnetic reconnection is a non-ideal plasma process that is a
source of non-thermal particle acceleration in many high-energy astrophysical
systems. Particle-in-cell (PIC) methods are commonly used for simulating
reconnection from first principles. While much progress has been made in
understanding the physics of reconnection, especially in 2D, the adoption of
advanced algorithms and numerical techniques for efficiently modeling such
systems has been limited. With the GPU-accelerated PIC code WarpX, we explore
the accuracy and potential performance benefits of two advanced Maxwell solver
algorithms: a non-standard finite difference scheme (CKC) and an
ultrahigh-order pseudo-spectral method (PSATD). We find that for the
relativistic reconnection problem, CKC and PSATD qualitatively and
quantitatively match the standard Yee-grid finite-difference method. CKC and
PSATD both admit a time step that is 40% longer than Yee, resulting in a ~40%
faster time to solution for CKC, but no performance benefit for PSATD when
using a current deposition scheme that satisfies Gauss's law. Relaxing this
constraint maintains accuracy and yields a 30% speedup. Unlike Yee and CKC,
PSATD is numerically stable at any time step, allowing for a larger time step
than with the finite-difference methods. We found that increasing the time step
2.4-3 times over the standard Yee step still yields accurate results, but only
translates to modest performance improvements over CKC due to the current
deposition scheme used with PSATD. Further optimization of this scheme will
likely improve the effective performance of PSATD.
Comment: 19 pages, 10 figures. Submitted to Ap
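The quoted 40% figure is consistent with the usual stability limits of the two finite-difference schemes, as the short worked estimate below suggests. It assumes a 2D grid with square cells of size Δx, which the abstract does not state explicitly, and uses the commonly quoted limits for the Yee and CKC solvers.

```latex
% Yee scheme in d dimensions with equal cell size \Delta x:
c\,\Delta t_{\mathrm{Yee}} \le \frac{\Delta x}{\sqrt{d}}
% CKC scheme with equal cell size \Delta x:
c\,\Delta t_{\mathrm{CKC}} \le \Delta x
% Ratio of the maximum time steps for d = 2:
\frac{\Delta t_{\mathrm{CKC}}}{\Delta t_{\mathrm{Yee}}} = \sqrt{2} \approx 1.41
```

Under these assumptions the CKC step is about 41% longer than the Yee step, in line with the roughly 40% stated in the abstract.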
Efficient laser-driven proton acceleration from cylindrical and planar cryogenic hydrogen jets
We report on recent experimental results deploying a continuous cryogenic hydrogen jet as a debris-free, renewable laser-driven source of pure proton beams generated at the 150 TW ultrashort pulse laser Draco. Efficient proton acceleration reaching cut-off energies of up to 20 MeV with particle numbers exceeding 10^9 particles per MeV per steradian is demonstrated, showing for the first time that the acceleration performance is comparable to solid foil targets with thicknesses in the micrometer range. Two different target geometries are presented and their proton beam deliverance characterized: cylindrical (∅ 5 μm) and planar (20 μm × 2 μm). In both cases typical Target Normal Sheath Acceleration emission patterns with exponential proton energy spectra are detected. Significantly higher proton numbers in laser-forward direction are observed when deploying the planar jet as compared to the cylindrical jet case. This is confirmed by two-dimensional Particle-in-Cell (2D3V PIC) simulations, which demonstrate that the planar jet proves favorable as its geometry leads to more optimized acceleration conditions.
Exascale and ML Models for Accelerator Simulations
Computational modeling is essential to the exploration and design of advanced particle accelerators. The modeling of laser-plasma acceleration and interaction can achieve predictive quality for experiments if adequate resolution, full geometry and physical effects are included.
Here, we report on the significant evolution in fully relativistic full-3D modeling of conventional and advanced accelerators in the WarpX and ImpactX codes with the introduction of Exascale supercomputing and AI/ML models. We will cover the first PIC simulations on an Exascale machine, the need for and evolution of open standards, and, based on our fully open community codes, the connection of time and space scales from plasma to conventional beamlines with data-driven machine-learning models.
- …