Reconstruction for Liquid Argon TPC Neutrino Detectors Using Parallel Architectures
Neutrinos are particles that interact only rarely, so identifying them requires
large detectors that produce large volumes of data. Processing these data with the
available computing power is becoming more difficult as the detectors increase
in size to reach their physics goals. In liquid argon time projection chambers
(TPCs) the charged particles from neutrino interactions produce ionization
electrons which drift in an electric field towards a series of collection
wires, and the signal on the wires is used to reconstruct the interaction. The
MicroBooNE detector currently collecting data at Fermilab has 8000 wires, and
planned future experiments like DUNE will have 100 times more, which means that
the time required to reconstruct an event will scale accordingly. Modernization
of liquid argon TPC reconstruction code, including vectorization,
parallelization and code portability to GPUs, will help to mitigate these
challenges. The liquid argon TPC hit finding algorithm within the LArSoft
framework, which is used across multiple experiments, has been vectorized and
parallelized. This speeds up the algorithm by roughly a factor of ten in a
standalone version on Intel architectures. The new version has been incorporated
back into LArSoft so that it can be used generally. These methods will also be
applied to other low-level reconstruction algorithms for the wire signals, such
as the deconvolution. The applications and performance of this modernized liquid
argon TPC wire reconstruction will be presented.
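As a rough illustration of the kind of restructuring involved, the sketch below shows a simplified, threaded and vectorized peak scan over wire waveforms. The types and the findHits function are hypothetical placeholders, not the actual LArSoft hit finder; a full hit finder would typically follow the scan with a pulse-shape (e.g. Gaussian) fit.

```cpp
#include <cstddef>
#include <vector>

struct Hit { std::size_t wire; std::size_t tick; float peakADC; };
using Waveform = std::vector<float>;  // deconvolved ADC samples on one wire

// Hypothetical hit-finding pass: threads work on independent wires, and the
// per-wire scan over time samples is written so the compiler can vectorize it.
std::vector<Hit> findHits(const std::vector<Waveform>& wires, float threshold) {
  std::vector<Hit> hits;
  #pragma omp parallel
  {
    std::vector<Hit> local;  // per-thread buffer avoids contention on 'hits'
    #pragma omp for schedule(dynamic)
    for (std::size_t w = 0; w < wires.size(); ++w) {
      const Waveform& adc = wires[w];
      if (adc.size() < 3) continue;
      std::vector<char> isPeak(adc.size(), 0);
      // vectorizable scan: flag samples that are local maxima above threshold
      #pragma omp simd
      for (std::size_t t = 1; t + 1 < adc.size(); ++t)
        isPeak[t] = (adc[t] > threshold && adc[t] >= adc[t - 1] && adc[t] > adc[t + 1]);
      // scalar pass: turn flagged samples into hit candidates
      for (std::size_t t = 1; t + 1 < adc.size(); ++t)
        if (isPeak[t]) local.push_back({w, t, adc[t]});
    }
    #pragma omp critical
    hits.insert(hits.end(), local.begin(), local.end());
  }
  return hits;
}
```

The per-wire independence is what makes this kind of algorithm a good candidate for both threading (across wires) and vectorization (across time samples).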
Parallelized and Vectorized Tracking Using Kalman Filters with CMS Detector Geometry and Events
The High-Luminosity Large Hadron Collider at CERN will be characterized by
greater pileup of events and higher occupancy, making the track reconstruction
even more computationally demanding. Existing algorithms at the LHC are based
on Kalman filter techniques with proven excellent physics performance under a
variety of conditions. Starting in 2014, we have been developing
Kalman-filter-based methods for track finding and fitting adapted for many-core
SIMD processors that are becoming dominant in high-performance systems.
This paper summarizes the latest extensions to our software that allow it to
run on the realistic CMS-2017 tracker geometry using CMSSW-generated events,
including pileup. The reconstructed tracks can be validated against either the
CMSSW simulation that generated the hits, or the CMSSW reconstruction of the
tracks. In general, the code's computational performance has continued to
improve while the above capabilities were being added. We demonstrate that the
present Kalman filter implementation is able to reconstruct events with
comparable physics performance to CMSSW, while providing generally better
computational performance. Further plans for advancing the software are
discussed.
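For orientation, the sketch below shows a single Kalman-filter measurement update for a track state, using Eigen for the small-matrix algebra. The six-parameter state, the 3D hit measurement, and the function itself are illustrative assumptions, not the mkFit or CMSSW implementation.

```cpp
#include <Eigen/Dense>

using State    = Eigen::Matrix<float, 6, 1>;  // e.g. (x, y, z, px, py, pz)
using StateCov = Eigen::Matrix<float, 6, 6>;
using Meas     = Eigen::Matrix<float, 3, 1>;  // measured hit position
using MeasCov  = Eigen::Matrix<float, 3, 3>;
using HMat     = Eigen::Matrix<float, 3, 6>;  // projection from state to measurement space

// One Kalman measurement update: blend the predicted state with a new hit.
void kalmanUpdate(State& x, StateCov& C,
                  const Meas& m, const MeasCov& V, const HMat& H) {
  const Eigen::Matrix<float, 3, 1> r = m - H * x;                       // residual
  const MeasCov S = H * C * H.transpose() + V;                          // residual covariance
  const Eigen::Matrix<float, 6, 3> K = C * H.transpose() * S.inverse(); // Kalman gain
  x += K * r;                                                           // updated state
  C = (StateCov::Identity() - K * H) * C;                               // updated covariance
}
```

The gain matrix K weights the residual against the current state uncertainty; track finding repeats this update layer by layer as hits are attached to a candidate.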
Generalizing mkFit and its Application to HL-LHC
mkFit is an implementation of the Kalman filter-based track reconstruction
algorithm that exploits both thread- and data-level parallelism. In the past
few years the project transitioned from the R&D phase to deployment in the
Run-3 offline workflow of the CMS experiment. The CMS tracking performs a
series of iterations, targeting reconstruction of tracks of increasing
difficulty after removing hits associated to tracks found in previous
iterations. mkFit has been adopted for several of the tracking iterations,
which contribute to the majority of reconstructed tracks. When tested under
standard conditions for production jobs, the speedups in track pattern
recognition average about 3.5x for the iterations where mkFit is used (3-7x
depending on the iteration).
Multiple factors contribute to the observed speedups, including vectorization
and a lightweight geometry description, as well as improved memory management
and single precision. Efficient vectorization is achieved with both the icc and
the gcc (default in CMSSW) compilers and relies on a dedicated library for
small matrix operations, Matriplex, which has recently been released in a
public repository. While the mkFit geometry description already featured levels
of abstraction from the actual Phase-1 CMS tracker, several components of the
implementation were still tied to that specific geometry. We have further
generalized the geometry description and the configuration of the run-time
parameters, in order to enable support for the Phase-2 upgraded tracker
geometry for the HL-LHC and potentially other detector configurations. The
implementation strategy and high-level code changes required for the HL-LHC
geometry are presented. The speedups in track building from mkFit imply that
track fitting becomes a comparably time-consuming step of the tracking chain.
Prospects for an mkFit implementation of the track fit are also discussed.
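A minimal sketch of the "matriplex" storage idea follows, assuming a toy class of our own rather than the released Matriplex library's API: element (i, j) of N small matrices is kept contiguous so that the same operation applied to all N matrices vectorizes over the plex dimension.

```cpp
#include <array>

// Toy matriplex: fArray[i*D2 + j][n] holds element (i, j) of matrix n, so the
// innermost index n is unit-stride across the N matrices in the plex.
template <typename T, int D1, int D2, int N>
struct Matriplex {
  alignas(64) std::array<std::array<T, N>, D1 * D2> fArray{};

  T&       at(int i, int j, int n)       { return fArray[i * D2 + j][n]; }
  const T& at(int i, int j, int n) const { return fArray[i * D2 + j][n]; }
};

// C = A * B for all N matrices at once; the loop over n vectorizes easily.
template <typename T, int D, int N>
void multiply(const Matriplex<T, D, D, N>& A, const Matriplex<T, D, D, N>& B,
              Matriplex<T, D, D, N>& C) {
  C.fArray = {};  // zero the accumulators
  for (int i = 0; i < D; ++i)
    for (int j = 0; j < D; ++j)
      for (int k = 0; k < D; ++k) {
        #pragma omp simd
        for (int n = 0; n < N; ++n)
          C.at(i, j, n) += A.at(i, k, n) * B.at(k, j, n);
      }
}
```

The payoff is that the vector width is decoupled from the matrix size: with N chosen to match the SIMD width, even 3x3 or 6x6 operations keep full vector registers busy.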
Reconstruction of Charged Particle Tracks in Realistic Detector Geometry Using a Vectorized and Parallelized Kalman Filter Algorithm
One of the most computationally challenging problems expected for the
High-Luminosity Large Hadron Collider (HL-LHC) is finding and fitting particle
tracks during event reconstruction. Algorithms used at the LHC today rely on
Kalman filtering, which builds physical trajectories incrementally while
incorporating material effects and error estimation. Recognizing the need for
faster computational throughput, we have adapted Kalman-filter-based methods
for highly parallel, many-core SIMD and SIMT architectures that are now
prevalent in high-performance hardware. Previously we observed significant
parallel speedups, with physics performance comparable to CMS standard
tracking, on Intel Xeon, Intel Xeon Phi, and (to a limited extent) NVIDIA GPUs.
While early tests were based on artificial events occurring inside an idealized
barrel detector, we showed subsequently that our mkFit software builds tracks
successfully from complex simulated events (including detector pileup)
occurring inside a geometrically accurate representation of the CMS-2017
tracker. Here, we report on advances in both the computational and physics
performance of mkFit, as well as progress toward integration with CMS
production software. Recently we have improved the overall efficiency of the
algorithm by preserving short track candidates at a relatively early stage
rather than attempting to extend them over many layers. Moreover, mkFit
formerly produced an excess of duplicate tracks; these are now explicitly
removed in an additional processing step. We demonstrate that with these
enhancements, mkFit becomes a suitable choice for the first iteration of CMS
tracking, and eventually for later iterations as well. We plan to test this
capability in the CMS High Level Trigger during Run 3 of the LHC, with an
ultimate goal of using it in both the CMS HLT and offline reconstruction for
the HL-LHC CMS tracker.
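One common way to implement such a duplicate-removal step is to keep only the better-ranked of any pair of tracks that share too many hits. The sketch below assumes a shared-hit-fraction criterion; the Track type, the scoring, and the threshold are illustrative and not the exact mkFit logic.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

struct Track {
  std::vector<int> hitIds;  // indices of hits assigned to this track
  float quality;            // e.g. a chi2-based score; larger is better here
};

// Drop a candidate if it shares more than maxSharedFrac of its hits with an
// already-kept, better-ranked track.
std::vector<Track> removeDuplicates(std::vector<Track> tracks,
                                    float maxSharedFrac = 0.5f) {
  std::sort(tracks.begin(), tracks.end(),
            [](const Track& a, const Track& b) { return a.quality > b.quality; });
  std::vector<Track> kept;
  for (const Track& t : tracks) {
    bool duplicate = false;
    for (const Track& k : kept) {
      const std::unordered_set<int> hits(k.hitIds.begin(), k.hitIds.end());
      const auto shared = std::count_if(t.hitIds.begin(), t.hitIds.end(),
                                        [&](int h) { return hits.count(h) > 0; });
      if (!t.hitIds.empty() &&
          static_cast<float>(shared) / static_cast<float>(t.hitIds.size()) > maxSharedFrac) {
        duplicate = true;
        break;
      }
    }
    if (!duplicate) kept.push_back(t);
  }
  return kept;
}
```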
Application of performance portability solutions for GPUs and many-core CPUs to track reconstruction kernels
Next-generation High-Energy Physics (HEP) experiments face significant computational challenges, both in terms of data volume and processing power. Using compute accelerators, such as GPUs, is one of the promising ways to provide the necessary computational power to meet the challenge. The current programming models for compute accelerators often involve architecture-specific programming languages promoted by the hardware vendors, and hence limit the set of platforms the code can run on. Developing software with such platform restrictions is especially infeasible for HEP communities, as it takes significant effort to convert typical HEP algorithms into ones that are efficient on compute accelerators. Multiple performance portability solutions have recently emerged and provide an alternative path for using compute accelerators, allowing the code to be executed on hardware from different vendors. We apply several portability solutions, such as Kokkos, SYCL, C++17 std::execution::par, Alpaka, and OpenMP/OpenACC, to two mini-apps extracted from the mkFit project: p2z and p2r. These apps include basic kernels for a Kalman filter track fit, such as propagation and update of track parameters, for detectors at a fixed z or fixed r position, respectively. The two mini-apps explore different memory layout formats.
We report on the development experience with the different portability solutions, as well as their performance, measured as kernel throughput, on GPUs and many-core CPUs from different vendors, including NVIDIA, AMD, and Intel.
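As a flavor of the std::execution::par approach, the toy kernel below propagates a batch of track states to a fixed z plane with a C++17 parallel algorithm. The TrackState type and the field-free straight-line transport are simplified placeholders rather than the actual p2z/p2r kernels.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

struct TrackState { float x, y, z, px, py, pz; };

// Propagate every track to the plane z = zTarget with straight-line transport
// (no magnetic field); each element is independent, so the loop parallelizes.
void propagateToZ(std::vector<TrackState>& tracks, float zTarget) {
  std::for_each(std::execution::par, tracks.begin(), tracks.end(),
                [zTarget](TrackState& t) {
                  const float dz   = zTarget - t.z;
                  const float step = (t.pz != 0.f) ? dz / t.pz : 0.f;
                  t.x += t.px * step;
                  t.y += t.py * step;
                  t.z  = zTarget;
                });
}
```

The same loop body can be wrapped in the parallel-for constructs of Kokkos, SYCL, or Alpaka instead, which is essentially the exercise the two mini-apps carry out across backends.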
Exploring code portability solutions for HEP with a particle tracking test code
Traditionally, high energy physics (HEP) experiments have relied on x86 CPUs for the majority of their significant computing needs. As the field looks ahead to the next generation of experiments such as DUNE and the High-Luminosity LHC, the computing demands are expected to increase dramatically. To cope with this increase, it will be necessary to take advantage of all available computing resources, including GPUs from different vendors. A broad landscape of code portability tools, including compiler pragma-based approaches, abstraction libraries, and other tools, allows the same source code to run efficiently on multiple architectures. In this paper, we use a test code taken from an HEP tracking algorithm to compare the performance and development experience of different portability solutions. While in several cases portable implementations perform close to the reference code version, we find that the performance varies significantly depending on the details of the implementation. Achieving optimal performance is not easy, even for relatively simple applications such as the test codes considered in this work. Several factors can affect the performance, such as the choice of memory layout, the memory pinning strategy, and the compiler used. The compilers and tools are being actively developed, so future developments may be critical for their deployment in HEP experiments.
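To make the memory-layout point concrete, the small sketch below contrasts an array-of-structures and a structure-of-arrays version of the same trivial update; the types and the update are purely illustrative.

```cpp
#include <vector>

struct HitAoS { float x, y, z; };                // array-of-structures element
struct HitsSoA { std::vector<float> x, y, z; };  // structure-of-arrays container

// Same update in both layouts: the SoA loop touches one contiguous array and
// is the easier of the two for compilers to vectorize.
void shiftAoS(std::vector<HitAoS>& hits, float dx) {
  for (HitAoS& h : hits) h.x += dx;              // strided access to x
}

void shiftSoA(HitsSoA& hits, float dx) {
  for (float& x : hits.x) x += dx;               // unit-stride access
}
```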