
    QCD simulations with staggered fermions on GPUs

    We report on our implementation of the RHMC algorithm for the simulation of lattice QCD with two staggered flavors on Graphics Processing Units, using the NVIDIA CUDA programming language. The main feature of our code is that the GPU is not used just as an accelerator; instead, the whole Molecular Dynamics trajectory is performed on it. After pointing out the main bottlenecks and how to circumvent them, we discuss the performance obtained. We present some preliminary results regarding OpenCL and multi-GPU extensions of our code and discuss future perspectives. Comment: 22 pages, 14 eps figures, final version to be published in Computer Physics Communication

    Staggered fermions simulations on GPUs

    We present our implementation of the RHMC algorithm for staggered fermions on Graphics Processing Units using the NVIDIA CUDA programming language. While previous studies deal exclusively with the Dirac matrix inversion problem, our code performs the complete MD trajectory on the GPU. After pointing out the main bottlenecks and how to circumvent them, we discuss the performance of our code. Comment: Poster presented at the XXVIII International Symposium on Lattice Field Theory, June 14-19, 2010, Villasimius, Sardinia, Italy

    Sparse approximate inverse preconditioners on high performance GPU platforms

    Simulation with models based on partial differential equations often requires the solution of (sequences of) large and sparse algebraic linear systems. In multidimensional domains, preconditioned Krylov iterative solvers are often appropriate for this task. The search for efficient preconditioners for Krylov subspace methods is therefore a crucial theme. Recent developments, especially in computing hardware, have renewed interest in approximate inverse preconditioners in factorized form, because their application during the solution process can be more efficient. We present some experiences with the approximate inverse preconditioners proposed by Benzi and Tůma in 1996 and the sparsification and inversion proposed by van Duin in 1999. Computational costs, reorderings and implementation issues are considered on both conventional and innovative computing architectures such as Graphics Processing Units (GPUs).

    Accelerating BST Methods for Model Reduction with Graphics Processors

    Model order reduction of dynamical linear time-invariant systems appears in many scientific and engineering applications. Numerically reliable SVD-based methods for this task require O(n^3) floating-point arithmetic operations, with n in the range 10^3 to 10^5 for many practical applications. In this paper we investigate the use of graphics processors (GPUs) to accelerate model reduction of large-scale linear systems via Balanced Stochastic Truncation, by off-loading the computationally intensive tasks to this device. Experiments on a hybrid platform consisting of state-of-the-art general-purpose multi-core processors and a GPU illustrate the potential of this approach.

    Acceleration of parasitic multistatic radar system using GPGPU

    This dissertation details the implementation of the PMR [Parasitic Multistatic Radar] signal processing chain on the GPGPU [General Purpose Graphic Processing Units] platform. The primary objective of the project is to accelerate the signal processing chain without compromising the efficiency of the algorithm, and to show that GPGPUs are a promising platform for parasitic radar signal processing.

    Applications of GPU Computing to Control and Simulate Systems

    This work deals with the new programming paradigm that exploits the benefits of modern Graphics Processing Units (GPUs), specifically their capacity to carry out heavy calculations for simulating systems or solving complex control strategies in real time.