910 research outputs found

    Computational Particle Physics for Event Generators and Data Analysis

    Full text link
    High-energy physics data analysis relies heavily on the comparison between experimental and simulated data as stressed lately by the Higgs search at LHC and the recent identification of a Higgs-like new boson. The first link in the full simulation chain is the event generation both for background and for expected signals. Nowadays event generators are based on the automatic computation of matrix element or amplitude for each process of interest. Moreover, recent analysis techniques based on the matrix element likelihood method assign probabilities for every event to belong to any of a given set of possible processes. This method originally used for the top mass measurement, although computing intensive, has shown its power at LHC to extract the new boson signal from the background. Serving both needs, the automatic calculation of matrix element is therefore more than ever of prime importance for particle physics. Initiated in the eighties, the techniques have matured for the lowest order calculations (tree-level), but become complex and CPU time consuming when higher order calculations involving loop diagrams are necessary like for QCD processes at LHC. New calculation techniques for next-to-leading order (NLO) have surfaced making possible the generation of processes with many final state particles (up to 6). If NLO calculations are in many cases under control, although not yet fully automatic, even higher precision calculations involving processes at 2-loops or more remain a big challenge. After a short introduction to particle physics and to the related theoretical framework, we will review some of the computing techniques that have been developed to make these calculations automatic. The main available packages and some of the most important applications for simulation and data analysis, in particular at LHC will also be summarized.Comment: 19 pages, 11 figures, Proceedings of CCP (Conference on Computational Physics) Oct. 2012, Osaka (Japan) in IOP Journal of Physics: Conference Serie

    Computational Physics on Graphics Processing Units

    Full text link
    The use of graphics processing units for scientific computations is an emerging strategy that can significantly speed up various different algorithms. In this review, we discuss advances made in the field of computational physics, focusing on classical molecular dynamics, and on quantum simulations for electronic structure calculations using the density functional theory, wave function techniques, and quantum field theory.Comment: Proceedings of the 11th International Conference, PARA 2012, Helsinki, Finland, June 10-13, 201

    The Parallelism Motifs of Genomic Data Analysis

    Get PDF
    Genomic data sets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share this data with the research community, but some of these genomic data analysis problems require large scale computational platforms to meet both the memory and computational requirements. These applications differ from scientific simulations that dominate the workload on high end parallel systems today and place different requirements on programming support, software libraries, and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high performance genomics analysis, including alignment, profiling, clustering, and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or motifs that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing

    Architectures and GPU-Based Parallelization for Online Bayesian Computational Statistics and Dynamic Modeling

    Get PDF
    Recent work demonstrates that coupling Bayesian computational statistics methods with dynamic models can facilitate the analysis of complex systems associated with diverse time series, including those involving social and behavioural dynamics. Particle Markov Chain Monte Carlo (PMCMC) methods constitute a particularly powerful class of Bayesian methods combining aspects of batch Markov Chain Monte Carlo (MCMC) and the sequential Monte Carlo method of Particle Filtering (PF). PMCMC can flexibly combine theory-capturing dynamic models with diverse empirical data. Online machine learning is a subcategory of machine learning algorithms characterized by sequential, incremental execution as new data arrives, which can give updated results and predictions with growing sequences of available incoming data. While many machine learning and statistical methods are adapted to online algorithms, PMCMC is one example of the many methods whose compatibility with and adaption to online learning remains unclear. In this thesis, I proposed a data-streaming solution supporting PF and PMCMC methods with dynamic epidemiological models and demonstrated several successful applications. By constructing an automated, easy-to-use streaming system, analytic applications and simulation models gain access to arriving real-time data to shorten the time gap between data and resulting model-supported insight. The well-defined architecture design emerging from the thesis would substantially expand traditional simulation models' potential by allowing such models to be offered as continually updated services. Contingent on sufficiently fast execution time, simulation models within this framework can consume the incoming empirical data in real-time and generate informative predictions on an ongoing basis as new data points arrive. In a second line of work, I investigated the platform's flexibility and capability by extending this system to support the use of a powerful class of PMCMC algorithms with dynamic models while ameliorating such algorithms' traditionally stiff performance limitations. Specifically, this work designed and implemented a GPU-enabled parallel version of a PMCMC method with dynamic simulation models. The resulting codebase readily has enabled researchers to adapt their models to the state-of-art statistical inference methods, and ensure that the computation-heavy PMCMC method can perform significant sampling between the successive arrival of each new data point. Investigating this method's impact with several realistic PMCMC application examples showed that GPU-based acceleration allows for up to 160x speedup compared to a corresponding CPU-based version not exploiting parallelism. The GPU accelerated PMCMC and the streaming processing system can complement each other, jointly providing researchers with a powerful toolset to greatly accelerate learning and securing additional insight from the high-velocity data increasingly prevalent within social and behavioural spheres. The design philosophy applied supported a platform with broad generalizability and potential for ready future extensions. The thesis discusses common barriers and difficulties in designing and implementing such systems and offers solutions to solve or mitigate them

    Fast algorithm for real-time rings reconstruction

    Get PDF
    The GAP project is dedicated to study the application of GPU in several contexts in which real-time response is important to take decisions. The definition of real-time depends on the application under study, ranging from answer time of ÎĽs up to several hours in case of very computing intensive task. During this conference we presented our work in low level triggers [1] [2] and high level triggers [3] in high energy physics experiments, and specific application for nuclear magnetic resonance (NMR) [4] [5] and cone-beam CT [6]. Apart from the study of dedicated solution to decrease the latency due to data transport and preparation, the computing algorithms play an essential role in any GPU application. In this contribution, we show an original algorithm developed for triggers application, to accelerate the ring reconstruction in RICH detector when it is not possible to have seeds for reconstruction from external trackers

    GPU-accelerated algorithms for many-particle continuous-time quantum walks

    Get PDF
    Many-particle continuous-time quantum walks (CTQWs) represent a resource for several tasks in quantum technology, including quantum search algorithms and universal quantum computation. In order to design and implement CTQWs in a realistic scenario, one needs effective simulation tools for Hamiltonians that take into account static noise and fluctuations in the lattice, i.e. Hamiltonians containing stochastic terms. To this aim, we suggest a parallel algorithm based on the Taylor series expansion of the evolution operator, and compare its performances with those of algorithms based on the exact diagonalization of the Hamiltonian or a 4th order Runge–Kutta integration. We prove that both Taylor-series expansion and Runge–Kutta algorithms are reliable and have a low computational cost, the Taylor-series expansion showing the additional advantage of a memory allocation not depending on the precision of calculation. Both algorithms are also highly parallelizable within the SIMT paradigm, and are thus suitable for GPGPU computing. In turn, we have benchmarked 4 NVIDIA GPUs and 3 quad-core Intel CPUs for a 2-particle system over lattices of increasing dimension, showing that the speedup provided by GPU computing, with respect to the OPENMP parallelization, lies in the range between 8x and (more than) 20x, depending on the frequency of post-processing. GPU-accelerated codes thus allow one to overcome concerns about the execution time, and make it possible simulations with many interacting particles on large lattices, with the only limit of the memory available on the device. Program summary Program Title: cuQuWa Licensing provisions: GNU General Public License, version 3 Program Files doi: http://dx.doi.org/10.17632/vjpnjgycdj.1 Programming language: CUDA C Nature of problem: Evolution of many-particle continuous-time quantum-walks on a multidimensional grid in a noisy environment. The submitted code is specialized for the simulation of 2-particle quantum-walks with periodic boundary conditions. Solution method: Taylor-series expansion of the evolution operator. The density-matrix is calculated by averaging multiple independent realizations of the system. External routines: cuBLAS, cuRAND Unusual features: Simulations are run exclusively on the graphic processing unit within the CUDA environment. An undocumented misbehavior in the random-number generation routine (cuRAND package) can corrupt the simulation of large systems, though no problems are reported for small and medium-size systems. Compiling the code with the -arch=sm_30 flag for compute capability 3.5 and above fixes this issue

    GPU-accelerated algorithms for many-particle continuous-time quantum walks

    Get PDF
    Many-particle continuous-time quantum walks (CTQWs) represent a resource for several tasks in quantum technology, including quantum search algorithms and universal quantum computation. In order to design and implement CTQWs in a realistic scenario, one needs effective simulation tools for Hamiltonians that take into account static noise and fluctuations in the lattice, i.e.\ua0Hamiltonians containing stochastic terms. To this aim, we suggest a parallel algorithm based on the Taylor series expansion of the evolution operator, and compare its performances with those of algorithms based on the exact diagonalization of the Hamiltonian or a 4th order Runge\u2013Kutta integration. We prove that both Taylor-series expansion and Runge\u2013Kutta algorithms are reliable and have a low computational cost, the Taylor-series expansion showing the additional advantage of a memory allocation not depending on the precision of calculation. Both algorithms are also highly parallelizable within the SIMT paradigm, and are thus suitable for GPGPU computing. In turn, we have benchmarked 4 NVIDIA GPUs and 3 quad-core Intel CPUs for a 2-particle system over lattices of increasing dimension, showing that the speedup provided by GPU computing, with respect to the OPENMP parallelization, lies in the range between 8x and (more than) 20x, depending on the frequency of post-processing. GPU-accelerated codes thus allow one to overcome concerns about the execution time, and make it possible simulations with many interacting particles on large lattices, with the only limit of the memory available on the device. Program summary Program Title: cuQuWa Licensing provisions: GNU General Public License, version 3 Program Files doi: http://dx.doi.org/10.17632/vjpnjgycdj.1 Programming language: CUDA C Nature of problem: Evolution of many-particle continuous-time quantum-walks on a multidimensional grid in a noisy environment. The submitted code is specialized for the simulation of 2-particle quantum-walks with periodic boundary conditions. Solution method: Taylor-series expansion of the evolution operator. The density-matrix is calculated by averaging multiple independent realizations of the system. External routines: cuBLAS, cuRAND Unusual features: Simulations are run exclusively on the graphic processing unit within the CUDA environment. An undocumented misbehavior in the random-number generation routine (cuRAND package) can corrupt the simulation of large systems, though no problems are reported for small and medium-size systems. Compiling the code with the -arch=sm_30 flag for compute capability 3.5 and above fixes this issue
    • …
    corecore