Tuning the Performance of a Computational Persistent Homology Package
In recent years, persistent homology has become an attractive method for data analysis. It captures topological features, such as connected components, holes, and voids, from point cloud data and summarizes the way in which these features appear and disappear in a filtration sequence. In this project, we focus on improving the performance of Eirene, a computational package for persistent homology. Eirene is a 5000-line open-source software library implemented in the dynamic programming language Julia. We use the Julia profiling tools to identify performance bottlenecks and develop novel methods to manage them, including the parallelization of some time-consuming functions on multicore/manycore hardware. Empirical results show that performance can be greatly improved.
Markovian Monte Carlo program EvolFMC v.2 for solving QCD evolution equations
We present the program EvolFMC v.2 that solves the evolution equations in QCD
for the parton momentum distributions by means of the Monte Carlo technique
based on the Markovian process. The program solves the DGLAP-type evolution as
well as modified-DGLAP ones. In both cases the evolution can be performed in
the LO or NLO approximation. The quarks are treated as massless. The overall
technical precision of the code has been established at the 0.05% level.
This way, for the first time ever, we demonstrate that with the Monte Carlo
method one can solve the evolution equations with precision comparable to the
other numerical methods.
Comment: 38 pages, 9 Postscript figures
PHOTOS Interface in C++; Technical and Physics Documentation
For five years now, PHOTOS Monte Carlo for bremsstrahlung in the decay of
particles and resonances has been available with an interface to the C++ HepMC
event record. The main purpose of the present paper is to document the
technical aspects of the PHOTOS Monte Carlo installation and the use of the
present version. A multitude of test results and examples are distributed together with the
program code.
The physics precision of PHOTOS C++ is better than that of its FORTRAN predecessor, and
more convenient steering options are also available. An algorithm for the event
record interface necessary for process-dependent photon emission kernels is
implemented. It is used in Z and W decays for kernels of complete first order
matrix elements of the decays. Additional emission of final state lepton pairs
is also available.
Physics assumptions used in the program and properties of the solution are
reviewed. In particular, it is explained how the second order matrix elements
were used in design and validation of the program iteration procedure. Also, it
is explained that the phase space parametrization used in the program is exact.
Comment: Updated version; for the program as of April 201
Paraiso : An Automated Tuning Framework for Explicit Solvers of Partial Differential Equations
We propose Paraiso, a domain-specific language embedded in the functional
programming language Haskell, for automated tuning of explicit solvers of
partial differential equations (PDEs) on GPUs as well as multicore CPUs. In
Paraiso, one can describe PDE solving algorithms succinctly using tensor
equations notation. Hydrodynamic properties, interpolation methods and other
building blocks are described in abstract, modular, re-usable and combinable
forms, which lets us generate versatile solvers from a small set of Paraiso
source code.
We demonstrate Paraiso by implementing a compressive hydrodynamics solver. A
single source file of fewer than 500 lines can be used to generate solvers of
arbitrary dimensions, for both multicore CPUs and GPUs. We demonstrate both
manual annotation-based tuning and evolutionary computing-based automated
tuning of the program.
Comment: 52 pages, 14 figures, accepted for publication in Computational
Science and Discovery
A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials
We introduce PVSC-DTM (Parallel Vectorized Stencil Code for Dirac and
Topological Materials), a library and code generator based on a domain-specific
language tailored to implement the specific stencil-like algorithms that can
describe Dirac and topological materials such as graphene and topological
insulators in a matrix-free way. The generated hybrid-parallel (MPI+OpenMP)
code is fully vectorized using Single Instruction Multiple Data (SIMD)
extensions. It is significantly faster than matrix-based approaches on the node
level and performs in accordance with the roofline model. We demonstrate the
chip-level performance and distributed-memory scalability of basic building
blocks such as sparse matrix-(multiple-) vector multiplication on modern
multicore CPUs. As an application example, we use the PVSC-DTM scheme to (i)
explore the scattering of a Dirac wave on an array of gate-defined quantum
dots, to (ii) calculate a bunch of interior eigenvalues for strong topological
insulators, and to (iii) discuss the photoemission spectra of a disordered Weyl
semimetal.
Comment: 16 pages, 2 tables, 11 figures
GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP
Full detector simulation was among the largest CPU consumers in all CERN
experiment software stacks for the first two runs of the Large Hadron Collider
(LHC). In the early 2010s, the projections were that simulation demands would
scale linearly with luminosity increase, compensated only partially by an
increase of computing resources. The extension of fast simulation approaches to
more use cases, covering a larger fraction of the simulation budget, is only
part of the solution due to intrinsic precision limitations. The remainder
corresponds to speeding-up the simulation software by several factors, which is
out of reach using simple optimizations on the current code base. In this
context, the GeantV R&D project was launched, aiming to redesign the legacy
particle transport codes in order to make them benefit from fine-grained
parallelism features such as vectorization, but also from increased code and
data locality. This paper extensively presents the results and achievements of
this R&D, as well as the conclusions and lessons learnt from the beta
prototype.
Comment: 34 pages, 26 figures, 24 tables