QCD simulations with staggered fermions on GPUs
We report on our implementation of the RHMC algorithm for the simulation of
lattice QCD with two staggered flavors on Graphics Processing Units, using the
NVIDIA CUDA programming language. The main feature of our code is that the GPU
is not used just as an accelerator, but instead the whole Molecular Dynamics
trajectory is performed on it. After pointing out the main bottlenecks and how
to circumvent them, we discuss the obtained performances. We present some
preliminary results regarding OpenCL and multiGPU extensions of our code and
discuss future perspectives. Comment: 22 pages, 14 eps figures, final version to be published in Computer
Physics Communications
Staggered fermions simulations on GPUs
We present our implementation of the RHMC algorithm for staggered fermions on
Graphics Processing Units using the NVIDIA CUDA programming language. While
previous studies exclusively deal with the Dirac matrix inversion problem, our
code performs the complete MD trajectory on the GPU. After pointing out the
main bottlenecks and how to circumvent them, we discuss the performance of our
code. Comment: Poster presented at the XXVIII International Symposium on Lattice
Field Theory, June 14-19, 2010, Villasimius, Sardinia, Italy
Sparse approximate inverse preconditioners on high performance GPU platforms
Simulation with models based on partial differential equations often requires the solution of (sequences of) large and sparse algebraic linear systems. In multidimensional domains, preconditioned Krylov iterative solvers are often appropriate for this task. Therefore, the search for efficient preconditioners for Krylov subspace methods is a crucial theme. Recent developments, especially in computing hardware, have renewed the interest in approximate inverse preconditioners in factorized form, because their application during the solution process can be more efficient. We present here some experiences focused on the approximate inverse preconditioners proposed by Benzi and Tůma in 1996 and the sparsification and inversion proposed by van Duin in 1999. Computational costs, reorderings and implementation issues are considered both on conventional and innovative computing architectures such as Graphics Processing Units (GPUs).
Accelerating BST Methods for Model Reduction with Graphics Processors
Model order reduction of dynamical linear time-invariant systems appears in many scientific and engineering applications. Numerically reliable SVD-based methods for this task require O(n^3) floating-point arithmetic operations, with n in the range 10^3 to 10^5 for many practical applications. In this paper we investigate the use of graphics processors (GPUs) to accelerate model reduction of large-scale linear systems via Balanced Stochastic Truncation, by off-loading the computationally intensive tasks to this device. Experiments on a hybrid platform consisting of state-of-the-art general-purpose multi-core processors and a GPU illustrate the potential of this approach.
Acceleration of parasitic multistatic radar system using GPGPU
This dissertation details the implementation of a PMR [Parasitic Multistatic Radar] signal processing chain on the GPGPU [General Purpose Graphics Processing Unit] platform. The primary objective of the project is to accelerate the signal processing chain without compromising the efficiency of the algorithm, and to show that GPGPUs are a promising platform for parasitic radar signal processing.
Applications of GPU Computing to Control and Simulate Systems
[Abstract] This work deals with the new programming paradigm
that exploits the benefits of modern Graphics
Processing Units (GPUs), specifically their capacity
to carry out heavy calculations for simulating
systems or solving complex control strategies in real
time.