
    First Evaluation of the CPU, GPGPU and MIC Architectures for Real Time Particle Tracking based on Hough Transform at the LHC

    Recent innovations focused around parallel processing, either through systems containing multiple processors or processors containing multiple cores, hold great promise for enhancing the performance of the trigger at the LHC and extending its physics program. The flexibility of the CMS/ATLAS trigger system allows for easy integration of computational accelerators, such as NVIDIA's Tesla Graphics Processing Unit (GPU) or Intel's Xeon Phi, in the High Level Trigger. These accelerators have the potential to provide faster or more energy-efficient event selection, thus opening up possibilities for new complex triggers that were not previously feasible. At the same time, it is crucial to explore the performance limits achievable on the latest generation of multicore CPUs using the best software optimization methods. In this article, a new tracking algorithm based on the Hough transform will be evaluated for the first time on a multi-core Intel Xeon E5-2697v2 CPU, an NVIDIA Tesla K20c GPU, and an Intel Xeon Phi 7120 coprocessor. Preliminary timing performance will be presented. Comment: 13 pages, 4 figures, accepted to JINST
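
    For context, the core of a Hough-transform line finder is a per-hit voting loop over an accumulator, and it is this loop's data parallelism that makes the algorithm a natural candidate for GPU and MIC offload. Below is a minimal sketch assuming a generic (theta, rho) straight-line parametrization; the paper's actual track parametrization, bin counts, and thresholds are not reproduced here.

        // Minimal sketch of the Hough-transform voting step for straight
        // lines through 2D hits (illustrative parametrization and binning;
        // not the paper's actual track parametrization).
        #include <cmath>
        #include <vector>

        constexpr double kPi = 3.14159265358979323846;

        struct Hit { double x, y; };

        // Each hit votes for every (theta, rho) line it could lie on; bins
        // collecting many votes correspond to track candidates.
        std::vector<int> houghVote(const std::vector<Hit>& hits,
                                   int nTheta, int nRho, double rhoMax) {
            std::vector<int> acc(nTheta * nRho, 0);
            for (const Hit& h : hits) {
                for (int t = 0; t < nTheta; ++t) {      // data-parallel loop:
                    double theta = kPi * t / nTheta;    // maps well to GPU/MIC
                    double rho = h.x * std::cos(theta) + h.y * std::sin(theta);
                    int r = static_cast<int>((rho + rhoMax) / (2.0 * rhoMax) * nRho);
                    if (r >= 0 && r < nRho) ++acc[t * nRho + r];
                }
            }
            return acc;  // caller scans for local maxima above a vote threshold
        }

    Because every (hit, theta) vote is independent, the same kernel can be spread across CPU cores, GPU threads, or Xeon Phi vector lanes, which is precisely the comparison the article sets up.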

    Massively Parallel Computing at the Large Hadron Collider up to the HL-LHC

    As the Large Hadron Collider (LHC) continues its upward progression in energy and luminosity towards the planned High-Luminosity LHC (HL-LHC) in 2025, the challenges the experiments face in processing increasingly complex events will also continue to grow. Improvements in computing technologies and algorithms will be a key part of the advances necessary to meet this challenge. Parallel computing techniques, especially those using massively parallel computing (MPC), promise to be a significant part of this effort. In these proceedings, we discuss these algorithms in the specific context of a particularly important problem: the reconstruction of charged-particle tracks in an experiment's trigger algorithms, where high computing performance is critical for executing the track reconstruction in the available time. We discuss some areas where parallel computing has already shown benefits to the LHC experiments, and also demonstrate how an MPC-based trigger at the CMS experiment could not only improve performance but also extend the reach of the CMS trigger system to capture events which are currently not practical to reconstruct at the trigger level. Comment: 14 pages, 6 figures. Proceedings of the 2nd International Summer School on Intelligent Signal Processing for Frontier Research and Industry (INFIERI2014), to appear in JINST. Revised version in response to referee comments
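
    To illustrate why trigger-level track reconstruction suits MPC (a generic sketch of the pattern, not the experiments' actual trigger code): candidates grown from independent seeds have no mutual dependencies, so they can be dispatched across many cores. The Seed, Track, and buildTrack names below are hypothetical placeholders.

        // Illustrative sketch only: Seed, Track, and buildTrack() are
        // hypothetical stand-ins, not CMS/ATLAS trigger code. The point is
        // that candidates are independent, so the loop parallelizes.
        #include <omp.h>
        #include <vector>

        struct Seed  { double x, y, z; };        // e.g. an initial hit triplet
        struct Track { bool good = false; };     // fitted parameters omitted

        Track buildTrack(const Seed& /*seed*/) {
            // Stub: real code would extend the seed through detector layers,
            // attach compatible hits, and fit the trajectory.
            return Track{true};
        }

        std::vector<Track> reconstruct(const std::vector<Seed>& seeds) {
            std::vector<Track> tracks(seeds.size());
            #pragma omp parallel for schedule(dynamic)   // one candidate per task
            for (long i = 0; i < static_cast<long>(seeds.size()); ++i)
                tracks[i] = buildTrack(seeds[i]);
            return tracks;
        }

    The dynamic schedule is a natural choice here because track candidates take uneven time to build, and the same seed-level decomposition carries over to GPU thread blocks.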

    Massively Parallel Computing and the Search for Jets and Black Holes at the LHC

    Massively parallel computing at the LHC could be the next leap necessary to reach an era of new discoveries at the LHC after the Higgs discovery. Scientific computing is a critical component of the LHC experiments, including operation, trigger, the LHC Computing Grid, simulation, and analysis. One way to improve the physics reach of the LHC is to take advantage of the flexibility of the trigger system by integrating coprocessors based on Graphics Processing Units (GPUs) or the Many Integrated Core (MIC) architecture into its server farm. This cutting-edge technology provides not only the means to accelerate existing algorithms, but also the opportunity to develop new algorithms that select events in the trigger that previously would have evaded detection. In this article we describe new algorithms that would allow the trigger to select new topological signatures, including non-prompt jets and black-hole-like objects, in the silicon tracker. Comment: 15 pages, 11 figures, submitted to NIM
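
    As a rough illustration of what a tracker-based topological selection for non-prompt jets might look like (this is a toy, not the article's algorithm): a jet whose associated tracks mostly fail to point back to the primary vertex is a displaced-jet candidate. All struct fields and cut values below are hypothetical.

        // Toy illustration (not the article's algorithm): flag a jet as a
        // non-prompt ("displaced") candidate when few of its associated
        // tracks have a small transverse impact parameter d0. The fields
        // and cut values are hypothetical placeholders.
        #include <cstddef>
        #include <vector>

        struct TrackParams { double d0; };  // |transverse impact parameter|, cm

        bool isDisplacedJetCandidate(const std::vector<TrackParams>& jetTracks,
                                     double d0Cut = 0.05,       // hypothetical cut
                                     double promptFracMax = 0.2) {
            if (jetTracks.empty()) return false;
            std::size_t prompt = 0;
            for (const TrackParams& t : jetTracks)
                if (t.d0 < d0Cut) ++prompt;  // consistent with primary vertex
            // Non-prompt-like if few tracks point back to the primary vertex;
            // the per-jet test is independent, hence trivially parallel.
            return static_cast<double>(prompt) / jetTracks.size() < promptFracMax;
        }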

    Fast algorithm for real-time rings reconstruction

    The GAP project is dedicated to studying the application of GPUs in several contexts in which real-time response is important for taking decisions. The definition of real-time depends on the application under study, ranging from response times of a few μs up to several hours for very compute-intensive tasks. During this conference we presented our work on low-level triggers [1] [2] and high-level triggers [3] in high-energy physics experiments, and on specific applications for nuclear magnetic resonance (NMR) [4] [5] and cone-beam CT [6]. Apart from the study of dedicated solutions to decrease the latency due to data transport and preparation, the computing algorithms play an essential role in any GPU application. In this contribution, we show an original algorithm developed for trigger applications, to accelerate ring reconstruction in RICH detectors when it is not possible to obtain seeds for the reconstruction from external trackers.
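
    One common seedless approach, sketched here purely for illustration (the abstract does not specify the contribution's actual algorithm), is to compute the circle through every triplet of hits via its circumcenter: triplets drawn from the same ring produce clustered centers, so a peak in the center distribution identifies the ring without an external seed.

        // Illustrative seedless ring finder (not necessarily the GAP
        // algorithm): every hit triplet defines a circle; circumcenters
        // from triplets on the same ring cluster together.
        #include <cmath>
        #include <optional>
        #include <vector>

        struct Pt { double x, y; };

        // Circumcenter of three points, or nullopt if (near-)collinear.
        std::optional<Pt> circumcenter(Pt a, Pt b, Pt c) {
            double d = 2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
            if (std::fabs(d) < 1e-12) return std::nullopt;
            double a2 = a.x * a.x + a.y * a.y;
            double b2 = b.x * b.x + b.y * b.y;
            double c2 = c.x * c.x + c.y * c.y;
            return Pt{(a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d,
                      (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d};
        }

        // Collect candidate centers from all hit triplets; a subsequent 2D
        // histogramming/clustering step (omitted) extracts the ring.
        std::vector<Pt> candidateCenters(const std::vector<Pt>& hits) {
            std::vector<Pt> centers;
            for (std::size_t i = 0; i < hits.size(); ++i)
                for (std::size_t j = i + 1; j < hits.size(); ++j)
                    for (std::size_t k = j + 1; k < hits.size(); ++k)
                        if (auto c = circumcenter(hits[i], hits[j], hits[k]))
                            centers.push_back(*c);
            return centers;  // each triplet is independent: GPU-friendly
        }

    The triplet enumeration is cubic in the number of hits, but each triplet is independent, which is exactly the kind of workload that hides well behind GPU thread parallelism in a latency-bounded trigger.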

    Simulated Hough Transform Model Optimized for Straight-Line Recognition Using Frontier FPGA Devices

    The use of the Hough transform to identify shapes or images has been extensively studied in the past using software for artificial intelligence applications. In this article, we present a generalization of the goal of shape recognition using the Hough transform, applied to a broader range of real problems. A software simulator was developed to generate input patterns (straight lines) and test the ability of a generic low-latency system to identify these lines: first in a clean environment with no other inputs, and then looking for the same lines as ambient background noise increases. In particular, the paper presents a study to optimize the implementation of the Hough transform (HT) algorithm in programmable digital devices, such as FPGAs. We investigated the ability of the Hough transform to discriminate straight lines within a vast bundle of random lines, emulating a noisy environment. In more detail, the study follows an extensive investigation we recently conducted to recognize tracks of ionizing particles in high-energy physics. In this field, the lines can represent the trajectories of particles that must be recognized immediately as they are created in a particle detector. The main advantage of using FPGAs over other components is their speed and low latency when investigating pattern-recognition problems in a noisy environment. In fact, FPGAs guarantee a latency that increases linearly with the incoming data, while the latency of other solutions increases more quickly. Furthermore, the HT inherently adapts to incomplete input data sets, especially if resolutions are limited. Hence, an FPGA system that implements the HT is inefficient for small sets of input data but becomes more cost-effective as the size of the input data increases. The document first presents an example that uses a large accumulator consisting of 1100 × 600 bins and several sets of input data to validate the Hough transform algorithm as random noise increases to 80% of the input data. Then, a more specifically dedicated input set was chosen to emulate a real situation where a Xilinx UltraScale+ was to be used as the final target device. Thus, we reduced the accumulator to 280 × 280 bins using a clock signal at 250 MHz and a few tens of input points. Under these conditions, the behavior of the firmware matched the software simulations, confirming the feasibility of the HT implementation on an FPGA.
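
    To make the FPGA mapping concrete (a software sketch under assumed parameters, not the paper's firmware): in hardware the per-hit vote is typically reduced to integer multiply-accumulates against precomputed sin/cos lookup tables, one theta bin per clock per pipeline stage. The bin counts and fixed-point scale below are assumptions.

        // Illustrative software model of an FPGA-style HT vote: integer-only
        // arithmetic with precomputed cos/sin lookup tables, as one would
        // pipeline in firmware. Bin counts and fixed-point scale are
        // assumptions, not the paper's 1100x600 / 280x280 configurations.
        #include <array>
        #include <cmath>
        #include <cstdint>
        #include <vector>

        constexpr int     kThetaBins = 280;        // assumed accumulator size
        constexpr int     kRhoBins   = 280;
        constexpr int32_t kScale     = 1 << 12;    // Q12 fixed-point scale
        constexpr double  kPi        = 3.14159265358979323846;

        // In firmware these tables would live in block RAM.
        std::array<int32_t, kThetaBins> makeLut(bool useSin) {
            std::array<int32_t, kThetaBins> lut{};
            for (int t = 0; t < kThetaBins; ++t) {
                double th = kPi * t / kThetaBins;
                lut[t] = static_cast<int32_t>(
                    std::lround((useSin ? std::sin(th) : std::cos(th)) * kScale));
            }
            return lut;
        }

        // One hit's vote: a pure integer multiply-accumulate per theta bin,
        // the operation an FPGA pipeline repeats every clock cycle.
        void vote(int32_t x, int32_t y, std::vector<uint16_t>& acc,
                  const std::array<int32_t, kThetaBins>& cosLut,
                  const std::array<int32_t, kThetaBins>& sinLut,
                  int64_t rhoMax) {                // max |rho| in the Q12 scale
            for (int t = 0; t < kThetaBins; ++t) {
                int64_t rho = int64_t{x} * cosLut[t] + int64_t{y} * sinLut[t];
                int64_t bin = (rho + rhoMax) * kRhoBins / (2 * rhoMax);
                if (bin >= 0 && bin < kRhoBins)
                    ++acc[static_cast<std::size_t>(t) * kRhoBins + bin];
            }
        }

    Because each hit contributes a fixed number of votes, total latency grows linearly with the number of input points, which is the linear-latency property the abstract highlights for FPGAs.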

    Parallel Triplet Finding for Particle Track Reconstruction. [With a detailed German summary]


    Simulating Nonlinear Neutrino Oscillations on Next-Generation Many-Core Architectures

    In this work an astrophysical simulation code, XFLAT, is developed to study neutrino oscillations in supernovae. XFLAT is a hybrid modular code designed to utilize multiple levels of parallelism through MPI, OpenMP, and SIMD instructions (vectorization). It can run on both the CPU and the Xeon Phi coprocessor, the latter of which is based on the Intel Many Integrated Core (MIC) architecture. The performance of XFLAT on various system configurations and physics scenarios has been analyzed. In addition, the impact of I/O and of the multi-node configuration on Xeon Phi-equipped heterogeneous supercomputers, such as Stampede at the Texas Advanced Computing Center (TACC), was investigated.
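
    The three levels of parallelism named in the abstract compose in the standard way: MPI distributes the domain across nodes, OpenMP threads span the cores of each CPU or Xeon Phi, and SIMD vectorizes the innermost loop. Below is a minimal sketch of that pattern using a generic array update, not XFLAT's actual neutrino-oscillation kernel.

        // Minimal sketch of the hybrid MPI + OpenMP + SIMD pattern the
        // abstract describes (a generic array update stands in for the
        // physics kernel). Compile with e.g. mpicxx -fopenmp.
        #include <mpi.h>
        #include <omp.h>
        #include <vector>

        int main(int argc, char** argv) {
            MPI_Init(&argc, &argv);              // level 1: one rank per node/domain
            int rank = 0;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            const long long nLocal = 1 << 20;    // this rank's share of the data
            std::vector<double> field(nLocal, rank);

            // Level 2: OpenMP threads across the cores of the CPU or Xeon Phi;
            // level 3: the simd clause asks the compiler to vectorize the loop.
            #pragma omp parallel for simd
            for (long long i = 0; i < nLocal; ++i)
                field[i] = field[i] * 0.5 + 1.0; // stand-in for the physics update

            double localSum = 0.0, globalSum = 0.0;
            #pragma omp parallel for simd reduction(+:localSum)
            for (long long i = 0; i < nLocal; ++i)
                localSum += field[i];

            // Ranks combine partial results across the machine.
            MPI_Reduce(&localSum, &globalSum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            MPI_Finalize();
            return 0;
        }

    On a Xeon Phi-equipped system like Stampede, the same binary structure covers both host CPUs and coprocessors, which is the portability argument behind XFLAT's hybrid design.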