
    Graphics Processing Units (GPUs) and CUDA

    Computers almost always contain one or more central processing units (CPUs), each of which processes information sequentially. While having multiple CPUs allows a computer to run several tasks in parallel, many computers also have a graphics processing unit (GPU), which contains hundreds to thousands of cores that allow it to execute many computations in parallel. To complete a larger task, a GPU runs many subtasks concurrently: each core performs the same instruction on different sets of data, which makes GPUs well suited to tasks such as calculating what each individual pixel displays on a screen. The purpose of this research was to learn how GPUs work, how to write CUDA programs that utilize GPUs, and to determine whether GPUs could be used to increase the speed of algorithms that determine the pebbling properties of graphs. In addition, we developed a class module on GPU computing with CUDA for the Advanced Algorithms class in Hope College’s Computer Science department.
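    As a minimal illustration of the data-parallel pattern this abstract describes (every core executing the same instruction on different data), the following CUDA sketch applies one scale-and-add operation to each element of an array, one thread per element. The kernel name, array sizes, and parameters are illustrative and are not taken from the paper.

        #include <cstdio>
        #include <cuda_runtime.h>

        // Each thread applies the same instruction (a scale-and-add) to a different
        // element of the input array -- the data-parallel pattern described above.
        __global__ void scaleAdd(const float *in, float *out, float a, float b, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) {
                out[i] = a * in[i] + b;
            }
        }

        int main() {
            const int n = 1 << 20;
            float *in, *out;
            cudaMallocManaged(&in, n * sizeof(float));
            cudaMallocManaged(&out, n * sizeof(float));
            for (int i = 0; i < n; ++i) in[i] = (float)i;

            int threads = 256;
            int blocks = (n + threads - 1) / threads;   // enough blocks to cover all elements
            scaleAdd<<<blocks, threads>>>(in, out, 2.0f, 1.0f, n);
            cudaDeviceSynchronize();

            printf("out[42] = %f\n", out[42]);
            cudaFree(in);
            cudaFree(out);
            return 0;
        }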

    Fast GPU audio identification

    Audio identification consists in the ability to pair audio signals of the same perceptual nature. In other words, the aim is to be able to compare an audio signal with modified versions that are perceptually equivalent. To accomplish that, an audio fingerprint (AFP) is extracted from each signal, and only the fingerprints are compared to assess similarity. Some guarantees have to be given about the equivalence between comparing audio fingerprints and perceptually comparing the signals. In designing AFPs, a dense representation is more robust than a sparse one, but a dense representation also implies more compute cycles and hence slower processing. To speed up the computation of a very dense audio fingerprint that remains stable under noise, re-recording, low-pass filtering, etc., we propose the use of a massively parallel architecture based on the Graphics Processing Unit (GPU) with the CUDA programming kit. We show experimentally that even with a relatively small GPU, and using a single core of the GPU, we obtain a notable per-core speedup in a GPU/CPU model. We also compared our FFT implementation against the state-of-the-art CUFFT, obtaining impressive results, so our FFT implementation may help in other areas of application. Presented at the X Workshop Procesamiento Distribuido y Paralelo (WPDP), Red de Universidades con Carreras en Informática (RedUNCI).
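    Since the abstract benchmarks a custom FFT against NVIDIA's CUFFT library, a minimal sketch of a batched 1D complex-to-complex transform with cuFFT is shown below. The FFT length, batch size, and buffer handling are assumptions for illustration, not the authors' code.

        #include <cuda_runtime.h>
        #include <cufft.h>

        // Minimal batched 1D complex-to-complex FFT with cuFFT, the library the
        // authors benchmark their own FFT against.  Sizes here are illustrative.
        int main() {
            const int nx = 1024;     // FFT length (e.g. one fingerprint frame)
            const int batch = 64;    // number of frames transformed at once

            cufftComplex *data;
            cudaMalloc(&data, sizeof(cufftComplex) * nx * batch);
            // ... copy windowed audio frames into `data` with cudaMemcpy ...

            cufftHandle plan;
            cufftPlan1d(&plan, nx, CUFFT_C2C, batch);        // plan a batched 1D transform
            cufftExecC2C(plan, data, data, CUFFT_FORWARD);   // in-place forward FFT
            cudaDeviceSynchronize();

            cufftDestroy(plan);
            cudaFree(data);
            return 0;
        }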

    GPU-Based One-Dimensional Convolution for Real-Time Spatial Sound Generation

    Incorporating spatialized (3D) sound cues in dynamic and interactive videogames and immersive virtual environment applications is beneficial for a number of reasons, ultimately leading to an increase in presence and immersion. Despite the benefits of spatial sound cues, they are often overlooked in videogames and virtual environments, where emphasis is typically placed on the visual cues. Fundamental to the generation of spatial sound is the one-dimensional convolution operation, which is computationally expensive and does not lend itself to such real-time, dynamic applications. Driven by the gaming industry and the great emphasis placed on the visual sense, consumer computer graphics hardware, and the graphics processing unit (GPU) in particular, has greatly advanced in recent years, even outperforming the computational capacity of CPUs. This has allowed for real-time, interactive, realistic graphics-based applications on typical consumer-level PCs. Given the widespread use and availability of computer graphics hardware and the similarities that exist between the fields of spatial audio and image synthesis, here we describe the development of a GPU-based, one-dimensional convolution algorithm whose efficiency is superior to the conventional CPU-based convolution method. The primary purpose of the developed GPU-based convolution method is the computationally efficient generation of real-time spatial audio for dynamic and interactive videogames and virtual environments.
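    A hedged sketch of the basic operation the paper accelerates, time-domain one-dimensional convolution, is given below: one CUDA thread computes one output sample y[i] = sum_k x[i-k] h[k]. The paper's actual GPU formulation may differ (for example, it may work in the frequency domain); all names here are illustrative.

        #include <cuda_runtime.h>

        // Naive time-domain convolution: each thread computes one output sample
        // y[i] = sum_k x[i - k] * h[k].  One thread per output sample.
        __global__ void convolve1d(const float *x, int nx,
                                   const float *h, int nh,
                                   float *y) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            int ny = nx + nh - 1;            // length of the full convolution
            if (i >= ny) return;

            float acc = 0.0f;
            for (int k = 0; k < nh; ++k) {
                int j = i - k;
                if (j >= 0 && j < nx) {
                    acc += x[j] * h[k];      // h could be an HRTF or room impulse response
                }
            }
            y[i] = acc;
        }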

    On the Design and Analysis of Parallel and Distributed Algorithms

    The arrival of multicore systems has imposed a new scenario in computing: parallel and distributed algorithms are fast replacing older sequential algorithms, bringing many challenges with these techniques. Distributed algorithms provide distributed processing using distributed file systems and processing units, with the network modeled as a minimum-cost spanning tree. Parallel processing, on the other hand, involves choices among language platforms, data-parallel versus parallel programming, and GPUs. Processing units, memory elements, and storage are connected through dynamic distributed networks in the form of spanning trees. The article presents foundational algorithms, analysis, and efficiency considerations. Comment: 9 pages

    Tapping the Supercomputer Under Your Desk: Solving Dynamic Equilibrium Models with Graphics Processors

    Get PDF
    This paper shows how to build algorithms that use graphics processing units (GPUs) installed in most modern computers to solve dynamic equilibrium models in economics. In particular, we rely on the compute unified device architecture (CUDA) of NVIDIA GPUs. We illustrate the power of the approach by solving a simple real business cycle model with value function iteration. We document improvements in speed of around 200 times and suggest that even further gains are likely.
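    To make the value-function-iteration idea concrete, here is a toy CUDA sketch of one Bellman update for a deterministic growth model, with one thread per capital grid point. The utility function, production function, parameters, and brute-force grid search are illustrative assumptions, not the paper's exact model.

        #include <cuda_runtime.h>
        #include <math.h>

        // One Bellman update for a simple deterministic growth model:
        //   V_new(k_i) = max_j { log(c) + beta * V_old(k_j) },
        //   c = k_i^alpha + (1 - delta) * k_i - k_j.
        // One thread per grid point k_i; grid, parameters, and utility are illustrative.
        __global__ void bellmanUpdate(const float *kGrid, const float *vOld,
                                      float *vNew, int n,
                                      float alpha, float beta, float delta) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n) return;

            float k = kGrid[i];
            float resources = powf(k, alpha) + (1.0f - delta) * k;
            float best = -1e30f;
            for (int j = 0; j < n; ++j) {        // brute-force search over next-period capital
                float c = resources - kGrid[j];
                if (c <= 0.0f) break;            // grid is increasing, so stop once infeasible
                float v = logf(c) + beta * vOld[j];
                if (v > best) best = v;
            }
            vNew[i] = best;
        }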

    Particle Systems

    This bachelor's thesis deals with the implementation of particle systems using the computational power of the GPU. The purpose of this work is to describe the important facts about how particle systems are constructed and to point out various possibilities for their use. It analyses the capabilities of modern shaders and their application to calculating particle movement. The basis of this work is the analysis of the implemented application, which is able to dynamically change all parameters of the system.
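    The thesis computes particle motion with shaders; the CUDA sketch below shows the same underlying per-particle parallelism in a hedged form, with one thread integrating one particle per time step, a gravity-only update, and illustrative names.

        #include <cuda_runtime.h>

        struct Particle {
            float3 pos;
            float3 vel;
            float  life;   // remaining lifetime in seconds
        };

        // One thread integrates one particle: the independence of particles is what
        // makes particle systems map so well onto GPU hardware.
        __global__ void updateParticles(Particle *p, int n, float dt, float3 gravity) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n) return;

            p[i].vel.x += gravity.x * dt;
            p[i].vel.y += gravity.y * dt;
            p[i].vel.z += gravity.z * dt;

            p[i].pos.x += p[i].vel.x * dt;
            p[i].pos.y += p[i].vel.y * dt;
            p[i].pos.z += p[i].vel.z * dt;

            p[i].life -= dt;   // expired particles would be respawned by the host or another kernel
        }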

    A Parallel Application for Tree Selection in the Steiner Minimal Tree Problem

    A classic optimization problem in mathematics is determining the shortest possible length for a network of points. One of these problems, which remains relevant even today, is the Steiner Minimal Tree problem: finding a connected graph over a cloud of points that minimizes the overall length of the tree. This problem has applications in fields such as telecommunications, where hubs must be placed geographically so that the total length of cabling run between them is minimized, and, for a special case of the problem, circuit design.
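    As a hedged illustration of selecting among trees in parallel, the sketch below scores many candidate trees at once, one CUDA thread per candidate, by summing the Euclidean lengths of that candidate's edges; the data layout and names are assumptions, not the paper's implementation.

        #include <cuda_runtime.h>
        #include <math.h>

        // Each thread scores one candidate tree: it sums the Euclidean lengths of the
        // tree's edges so the shortest candidate can be selected afterwards.
        // Data layout (illustrative): candidate t's m edges are int2 index pairs
        // stored at edges[t*m .. t*m + m - 1], indexing into the point array `pts`.
        __global__ void treeLengths(const float2 *pts, const int2 *edges,
                                    int m, int numTrees, float *length) {
            int t = blockIdx.x * blockDim.x + threadIdx.x;
            if (t >= numTrees) return;

            float total = 0.0f;
            for (int e = 0; e < m; ++e) {
                int2 edge = edges[t * m + e];
                float dx = pts[edge.x].x - pts[edge.y].x;
                float dy = pts[edge.x].y - pts[edge.y].y;
                total += sqrtf(dx * dx + dy * dy);
            }
            length[t] = total;
        }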

    CUDA implementation of integration rules within an hp-finite element code

    With the introduction in 2006 of the CUDA architecture for Nvidia GPUs, a new programming model was born. A large number of articles indicate that this new programming model on a new architecture achieves better performance than previous implementations in traditional languages for CPUs. In this work the author tries to show the capabilities of GPU computing. To perform such a task, an hp-finite element integration method is implemented both in CUDA and in the C language. After implementation, parallel executions on the CPU and GPU are compared to determine whether it is worthwhile to create new algorithms for this architecture.
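    A minimal sketch of the kind of element-wise parallelism such an integration kernel can exploit is shown below: one CUDA thread applies a Gauss quadrature rule to one element. The integrand, quadrature data, and names are illustrative; a real hp-finite element code would evaluate shape functions and Jacobians instead.

        #include <cuda_runtime.h>
        #include <math.h>

        // Illustrative device integrand; an hp-FEM code would evaluate shape
        // functions and element Jacobians here instead.
        __device__ float integrand(float x) {
            return sinf(x) * x;
        }

        // One thread integrates one element [a_e, b_e] with an nq-point Gauss rule.
        // `nodes` and `weights` hold the reference-interval [-1, 1] quadrature rule.
        __global__ void integrateElements(const float *a, const float *b,
                                          const float *nodes, const float *weights,
                                          int nq, int numElems, float *result) {
            int e = blockIdx.x * blockDim.x + threadIdx.x;
            if (e >= numElems) return;

            float half = 0.5f * (b[e] - a[e]);
            float mid  = 0.5f * (b[e] + a[e]);
            float sum  = 0.0f;
            for (int q = 0; q < nq; ++q) {
                float x = mid + half * nodes[q];   // map reference node to the element
                sum += weights[q] * integrand(x);
            }
            result[e] = half * sum;                // scale by the Jacobian of the mapping
        }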