
    Graphics Processing Units (GPUs) and CUDA

    Computers almost always contain one or more central processing units (CPUs), each of which processes information sequentially. While having multiple CPUs allows a computer to run several tasks in parallel, many computers also have a graphics processing unit (GPU), which contains hundreds to thousands of cores that allow it to execute many computations in parallel. To complete a larger task, a GPU runs many subtasks concurrently: each core performs the same instruction on different sets of data, which makes GPUs well suited to tasks such as calculating what each individual pixel displays on a screen. The purpose of this research was to learn how GPUs work, how to write CUDA programs that utilize GPUs, and to determine whether GPUs could be used to increase the speed of algorithms that determine the pebbling properties of graphs. In addition, we developed a class module on GPU computing with CUDA for the Advanced Algorithms class in Hope College’s Computer Science department.
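    As a minimal illustration of the data-parallel pattern this abstract describes (every core executing the same instruction on different data), the following CUDA sketch applies one scale-and-add operation to each element of an array, one thread per element. The kernel name, array sizes, and parameters are illustrative and are not taken from the paper.

        #include <cstdio>
        #include <cuda_runtime.h>

        // Each thread applies the same instruction (a scale-and-add) to a different
        // element of the input array -- the data-parallel pattern described above.
        __global__ void scaleAdd(const float *in, float *out, float a, float b, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) {
                out[i] = a * in[i] + b;
            }
        }

        int main() {
            const int n = 1 << 20;
            float *in, *out;
            cudaMallocManaged(&in, n * sizeof(float));
            cudaMallocManaged(&out, n * sizeof(float));
            for (int i = 0; i < n; ++i) in[i] = (float)i;

            int threads = 256;
            int blocks = (n + threads - 1) / threads;   // enough blocks to cover all elements
            scaleAdd<<<blocks, threads>>>(in, out, 2.0f, 1.0f, n);
            cudaDeviceSynchronize();

            printf("out[42] = %f\n", out[42]);
            cudaFree(in);
            cudaFree(out);
            return 0;
        }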

    Fast GPU audio identification

    Audio identification consists in the ability to pair audio signals of the same perceptual nature. In other words, the aim is to be able to compare an audio signal with modified versions that are perceptually equivalent. To accomplish that, an audio fingerprint (AFP) is extracted from each signal, and only the fingerprints are compared to assess similarity. Some guarantees have to be given about the equivalence between comparing audio fingerprints and perceptually comparing the signals. In designing AFPs, a dense representation is more robust than a sparse one, but a dense representation also implies more compute cycles and hence slower processing. To speed up the computation of a very dense audio fingerprint that remains stable under noise, re-recording, low-pass filtering, etc., we propose the use of a massively parallel architecture based on the Graphics Processing Unit (GPU) with the CUDA programming kit. We show experimentally that even with a relatively small GPU, and using a single core of the GPU, we obtain a notable per-core speedup in a GPU/CPU model. We also compared our FFT implementation against the state-of-the-art CUFFT, obtaining impressive results, so our FFT implementation may help in other areas of application. Presented at the X Workshop Procesamiento Distribuido y Paralelo (WPDP), Red de Universidades con Carreras en Informática (RedUNCI).
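    Since the abstract benchmarks a custom FFT against NVIDIA's CUFFT library, a minimal sketch of a batched 1D complex-to-complex transform with cuFFT is shown below. The FFT length, batch size, and buffer handling are assumptions for illustration, not the authors' code.

        #include <cuda_runtime.h>
        #include <cufft.h>

        // Minimal batched 1D complex-to-complex FFT with cuFFT, the library the
        // authors benchmark their own FFT against.  Sizes here are illustrative.
        int main() {
            const int nx = 1024;     // FFT length (e.g. one fingerprint frame)
            const int batch = 64;    // number of frames transformed at once

            cufftComplex *data;
            cudaMalloc(&data, sizeof(cufftComplex) * nx * batch);
            // ... copy windowed audio frames into `data` with cudaMemcpy ...

            cufftHandle plan;
            cufftPlan1d(&plan, nx, CUFFT_C2C, batch);        // plan a batched 1D transform
            cufftExecC2C(plan, data, data, CUFFT_FORWARD);   // in-place forward FFT
            cudaDeviceSynchronize();

            cufftDestroy(plan);
            cudaFree(data);
            return 0;
        }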

    GPU-Based One-Dimensional Convolution for Real-Time Spatial Sound Generation

    Incorporating spatialized (3D) sound cues in dynamic and interactive videogames and immersive virtual environment applications is beneficial for a number of reasons, ultimately leading to an increase in presence and immersion. Despite the benefits of spatial sound cues, they are often overlooked in videogames and virtual environments, where emphasis is typically placed on the visual cues. Fundamental to the generation of spatial sound is the one-dimensional convolution operation, which is computationally expensive and does not lend itself to such real-time, dynamic applications. Driven by the gaming industry and the great emphasis placed on the visual sense, consumer computer graphics hardware, and the graphics processing unit (GPU) in particular, has greatly advanced in recent years, even outperforming the computational capacity of CPUs. This has allowed for real-time, interactive, realistic graphics-based applications on typical consumer-level PCs. Given the widespread use and availability of computer graphics hardware and the similarities that exist between the fields of spatial audio and image synthesis, here we describe the development of a GPU-based, one-dimensional convolution algorithm whose efficiency is superior to the conventional CPU-based convolution method. The primary purpose of the developed GPU-based convolution method is the computationally efficient generation of real-time spatial audio for dynamic and interactive videogames and virtual environments.
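    A hedged sketch of the basic operation the paper accelerates, time-domain one-dimensional convolution, is given below: one CUDA thread computes one output sample y[i] = sum_k x[i-k] h[k]. The paper's actual GPU formulation may differ (for example, it may work in the frequency domain); all names here are illustrative.

        #include <cuda_runtime.h>

        // Naive time-domain convolution: each thread computes one output sample
        // y[i] = sum_k x[i - k] * h[k].  One thread per output sample.
        __global__ void convolve1d(const float *x, int nx,
                                   const float *h, int nh,
                                   float *y) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            int ny = nx + nh - 1;            // length of the full convolution
            if (i >= ny) return;

            float acc = 0.0f;
            for (int k = 0; k < nh; ++k) {
                int j = i - k;
                if (j >= 0 && j < nx) {
                    acc += x[j] * h[k];      // h could be an HRTF or room impulse response
                }
            }
            y[i] = acc;
        }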

    On the Design and Analysis of Parallel and Distributed Algorithms

    The arrival of multicore systems has imposed a new scenario in computing: parallel and distributed algorithms are fast replacing older sequential algorithms, bringing many challenges with these techniques. Distributed algorithms provide distributed processing using distributed file systems and processing units, with the network modeled as a minimum-cost spanning tree. Parallel processing, on the other hand, involves choices among language platforms, data-parallel versus parallel programming, and GPUs. Processing units, memory elements, and storage are connected through dynamic distributed networks in the form of spanning trees. The article presents foundational algorithms, analysis, and efficiency considerations. Comment: 9 pages

    Tapping the Supercomputer Under Your Desk: Solving Dynamic Equilibrium Models with Graphics Processors

    Get PDF
    This paper shows how to build algorithms that use graphics processing units (GPUs) installed in most modern computers to solve dynamic equilibrium models in economics. In particular, we rely on the compute unified device architecture (CUDA) of NVIDIA GPUs. We illustrate the power of the approach by solving a simple real business cycle model with value function iteration. We document improvements in speed of around 200 times and suggest that even further gains are likely.
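    To make the value-function-iteration idea concrete, here is a toy CUDA sketch of one Bellman update for a deterministic growth model, with one thread per capital grid point. The utility function, production function, parameters, and brute-force grid search are illustrative assumptions, not the paper's exact model.

        #include <cuda_runtime.h>
        #include <math.h>

        // One Bellman update for a simple deterministic growth model:
        //   V_new(k_i) = max_j { log(c) + beta * V_old(k_j) },
        //   c = k_i^alpha + (1 - delta) * k_i - k_j.
        // One thread per grid point k_i; grid, parameters, and utility are illustrative.
        __global__ void bellmanUpdate(const float *kGrid, const float *vOld,
                                      float *vNew, int n,
                                      float alpha, float beta, float delta) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n) return;

            float k = kGrid[i];
            float resources = powf(k, alpha) + (1.0f - delta) * k;
            float best = -1e30f;
            for (int j = 0; j < n; ++j) {        // brute-force search over next-period capital
                float c = resources - kGrid[j];
                if (c <= 0.0f) break;            // grid is increasing, so stop once infeasible
                float v = logf(c) + beta * vOld[j];
                if (v > best) best = v;
            }
            vNew[i] = best;
        }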

    Particle Systems

    This bachelor's thesis deals with the implementation of particle systems using the computational power of the GPU. The purpose of this work is to describe the important facts about how particle systems are constructed and to point out various possibilities for their use. It analyses the capabilities of modern shaders and their application to calculating particle movement. The basis of this work is the analysis of the implemented application, which is able to dynamically change all parameters of the system.
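    The thesis computes particle motion with shaders; the CUDA sketch below shows the same underlying per-particle parallelism in a hedged form, with one thread integrating one particle per time step, a gravity-only update, and illustrative names.

        #include <cuda_runtime.h>

        struct Particle {
            float3 pos;
            float3 vel;
            float  life;   // remaining lifetime in seconds
        };

        // One thread integrates one particle: the independence of particles is what
        // makes particle systems map so well onto GPU hardware.
        __global__ void updateParticles(Particle *p, int n, float dt, float3 gravity) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n) return;

            p[i].vel.x += gravity.x * dt;
            p[i].vel.y += gravity.y * dt;
            p[i].vel.z += gravity.z * dt;

            p[i].pos.x += p[i].vel.x * dt;
            p[i].pos.y += p[i].vel.y * dt;
            p[i].pos.z += p[i].vel.z * dt;

            p[i].life -= dt;   // expired particles would be respawned by the host or another kernel
        }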

    A Parallel Application for Tree Selection in the Steiner Minimal Tree Problem

    A classic optimization problem in mathematics is determining the shortest possible length for a network of points. One of these problems, which remains relevant even today, is the Steiner Minimal Tree problem: finding a connected graph over a cloud of points that minimizes the overall length of the tree. This problem has applications in fields such as telecommunications, where hubs must be placed geographically so that the total length of cabling run between them is minimized, and, for a special case of the problem, circuit design.
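    As a hedged illustration of selecting among trees in parallel, the sketch below scores many candidate trees at once, one CUDA thread per candidate, by summing the Euclidean lengths of that candidate's edges; the data layout and names are assumptions, not the paper's implementation.

        #include <cuda_runtime.h>
        #include <math.h>

        // Each thread scores one candidate tree: it sums the Euclidean lengths of the
        // tree's edges so the shortest candidate can be selected afterwards.
        // Data layout (illustrative): candidate t's m edges are int2 index pairs
        // stored at edges[t*m .. t*m + m - 1], indexing into the point array `pts`.
        __global__ void treeLengths(const float2 *pts, const int2 *edges,
                                    int m, int numTrees, float *length) {
            int t = blockIdx.x * blockDim.x + threadIdx.x;
            if (t >= numTrees) return;

            float total = 0.0f;
            for (int e = 0; e < m; ++e) {
                int2 edge = edges[t * m + e];
                float dx = pts[edge.x].x - pts[edge.y].x;
                float dy = pts[edge.x].y - pts[edge.y].y;
                total += sqrtf(dx * dx + dy * dy);
            }
            length[t] = total;
        }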

    CUDA implementation of integration rules within an hp-finite element code

    With the introduction in 2006 of the CUDA architecture for Nvidia GPUs, a new programming model was born. A large number of articles indicate that this new programming model on a new architecture achieves better performance than previous implementations in traditional languages for CPUs. In this work the author tries to show the capabilities of GPU computing. To perform such a task, an hp-finite element integration method is implemented both in CUDA and in the C language. After implementation, parallel executions on the CPU and GPU are compared to determine whether it is worthwhile to create new algorithms for this architecture.
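    A minimal sketch of the kind of element-wise parallelism such an integration kernel can exploit is shown below: one CUDA thread applies a Gauss quadrature rule to one element. The integrand, quadrature data, and names are illustrative; a real hp-finite element code would evaluate shape functions and Jacobians instead.

        #include <cuda_runtime.h>
        #include <math.h>

        // Illustrative device integrand; an hp-FEM code would evaluate shape
        // functions and element Jacobians here instead.
        __device__ float integrand(float x) {
            return sinf(x) * x;
        }

        // One thread integrates one element [a_e, b_e] with an nq-point Gauss rule.
        // `nodes` and `weights` hold the reference-interval [-1, 1] quadrature rule.
        __global__ void integrateElements(const float *a, const float *b,
                                          const float *nodes, const float *weights,
                                          int nq, int numElems, float *result) {
            int e = blockIdx.x * blockDim.x + threadIdx.x;
            if (e >= numElems) return;

            float half = 0.5f * (b[e] - a[e]);
            float mid  = 0.5f * (b[e] + a[e]);
            float sum  = 0.0f;
            for (int q = 0; q < nq; ++q) {
                float x = mid + half * nodes[q];   // map reference node to the element
                sum += weights[q] * integrand(x);
            }
            result[e] = half * sum;                // scale by the Jacobian of the mapping
        }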