781 research outputs found

Interactive visualization of a thin disc around a Schwarzschild black hole

Author: Frauendiener Jörg
Müller Thomas
Publication venue: 'IOP Publishing'
Publication date: 19/06/2012

In a first course on general relativity, the Schwarzschild spacetime is the most discussed analytic solution to Einstein's field equations. Unfortunately, there is rarely enough time to study the optical consequences of the bending of light for some advanced examples. In this paper, we present how the visual appearance of a thin disc around a Schwarzschild black hole can be determined interactively by means of an analytic solution to the geodesic equation processed on current high-performance graphics processing units. This approach can, in principle, be customized for any other thin disc in a spacetime with geodesics given in closed form. The interactive visualization discussed here can be used either in a first course on general relativity for demonstration purposes only or as a thesis project for an enthusiastic student in an advanced course with some basic knowledge of OpenGL and a programming language. Comment: 9 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Brook Auto: High-Level Certification-Friendly Programming for GPU-powered Automotive Systems

Author: Kosmidis Leonidas
Trompouki Matina M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

Modern automotive systems require increased performance to implement Advanced Driving Assistance Systems (ADAS). GPU-powered platforms are promising candidates for such computational tasks; however, current low-level programming models complicate the accelerator software certification process and limit the hardware selection to a fraction of the available platforms. In this paper we present Brook Auto, a high-level programming language for automotive GPU systems which removes these limitations. We describe the challenges we faced in its implementation and the solutions we adopted, as well as a complete evaluation in terms of performance and productivity, which shows the effectiveness of our method. This work has been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Peer Reviewed. Postprint (author's final draft)

Crossref

UPCommons. Portal del coneixement obert de la UPC

Implementation of digital pheromones in PSO accelerated by commodity Graphics Hardware

Author: Direct
Engelbrecht A.
Fernando R.
Hu X.
Kalivarapu V.
Lefohn A.
McCormick P.
Owens J.
Rost R.
Rost R.
Schutte
Schutte J.
Sutter H.
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2008

In this paper, a model for Graphics Processing Unit (GPU) implementation of Particle Swarm Optimization (PSO) using digital pheromones to coordinate swarms within n-dimensional design spaces is presented. Previous work by the authors demonstrated the capability of digital pheromones within PSO for searching n-dimensional design spaces with improved accuracy, efficiency and reliability in both serial and parallel computing environments using traditional CPUs. Modern GPUs have proven to outperform CPUs in the number of floating-point operations they can perform, owing to their inherent data-parallel architecture and higher bandwidth capabilities. The advent of programmable graphics hardware in recent times has further provided a suitable platform for scientific computing, particularly in the field of design optimization. However, the data-parallel architecture of GPUs requires a specialized formulation to leverage its computational capabilities. When the objective function computations are appropriately formulated for GPUs, it is theorized that the solution efficiency (speed) can be significantly increased while maintaining solution accuracy. The development of this method is presented in this paper, together with tests on a number of multi-modal unconstrained test problems.
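
For readers unfamiliar with PSO, the following is a minimal sketch of the basic velocity-and-position update that the abstract builds on. It is a plain CPU version in standard-library Python and does not include the digital pheromone mechanism or any GPU formulation described by the authors. The function names (`pso`, `sphere`), the parameter values (`w`, `c1`, `c2`), and the unimodal test objective are all assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Illustrative objective (an assumption, not from the paper): sum of squares."""
    return sum(v * v for v in x)


def pso(objective, dim, n_particles=30, iters=200, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # move and clamp to the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    best, value = pso(sphere, dim=2)
    print(best, value)
```

The objective evaluations inside the particle loop are independent of one another within an iteration, which is the part a data-parallel formulation would be expected to target.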

Digital Repository @ Iowa State University (ISU)

Crossref

High-Level GPU Programming: Domain-Specific Optimization and Inference

Author: Lejdfors Calle
Publication venue
Publication date: 01/01/2008

When writing computer software one is often forced to balance the need for high run-time performance with high programmer productivity. By using a high-level language it is often possible to cut development times, but this typically comes at the cost of reduced run-time performance. Using a lower-level language, programs can be made very efficient but at the cost of increased development time. Real-time computer graphics is an area where there are very high demands on both performance and visual quality. Typically, large portions of such applications are written in lower-level languages and also rely on dedicated hardware, in the form of programmable graphics processing units (GPUs), for handling computationally demanding rendering algorithms. These GPUs are parallel stream processors, specialized towards computer graphics, that have computational performance more than an order of magnitude higher than corresponding CPUs. This has revolutionized computer graphics and also led to GPUs being used to solve more general numerical problems, such as fluid and physics simulation, protein folding, image processing, and databases. Unfortunately, the highly specialized nature of GPUs has also made them difficult to program. In this dissertation we show that GPUs can be programmed at a higher level than current lower-level languages allow, while maintaining performance. By constructing a domain-specific language (DSL), which provides appropriate domain-specific abstractions and user annotations, it is possible to write programs in a more abstract and modular manner. Using knowledge of the domain, the DSL compiler can generate very efficient code. We show by experiment that the performance of our DSLs is equal to that of GPU programs written by hand using current low-level languages. Also, control over the trade-offs between visual quality and performance is retained.
In the papers included in this dissertation, we present domain-specific languages targeted at numerical processing and computer graphics, respectively. These DSLs have been implemented as embedded languages in Python, a dynamic programming language that provides a rich set of high-level features. In this dissertation we show how these features can be used to facilitate the construction of embedded languages.
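
As a rough illustration of what "embedded language in Python" can mean, the sketch below uses operator overloading to build an expression tree from ordinary Python syntax and then emits source text from it. This is a generic technique and not the dissertation's DSL; the class names (`Expr`, `Var`, `Const`, `BinOp`), the `lift` helper, and the emitted text format are all hypothetical.

```python
class Expr:
    """Base class: arithmetic on expressions builds a tree instead of computing."""

    def __add__(self, other):
        return BinOp("+", self, lift(other))

    def __radd__(self, other):
        return BinOp("+", lift(other), self)

    def __mul__(self, other):
        return BinOp("*", self, lift(other))

    def __rmul__(self, other):
        return BinOp("*", lift(other), self)


class Var(Expr):
    def __init__(self, name):
        self.name = name

    def emit(self):
        return self.name


class Const(Expr):
    def __init__(self, value):
        self.value = value

    def emit(self):
        # Emit every constant as a float literal.
        return repr(float(self.value))


class BinOp(Expr):
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs

    def emit(self):
        return f"({self.lhs.emit()} {self.op} {self.rhs.emit()})"


def lift(value):
    """Wrap plain Python numbers so they can appear in the tree."""
    return value if isinstance(value, Expr) else Const(value)


if __name__ == "__main__":
    x, y = Var("x"), Var("y")
    print((2 * x + y).emit())  # prints "((2.0 * x) + y)"
```

Because the host language parses and type-dispatches the expression, the embedded language needs no parser of its own; a backend only has to walk the resulting tree.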

Lund University Publications

High Performance Direct Gravitational N-body Simulations on Graphics Processing Units -- II: An implementation in CUDA

Author: Aarseth
Barnes
Buck
Fernando
Heggie
Jeroen Bédorf
Makino
Makino
Mark
McMillan
Moore
Nitadori
Owens
Owens
Pharr
Portegies Zwart
Portegies Zwart
Robert G. Belleman
Simon F. Portegies Zwart
Warren
Publication venue: 'Elsevier BV'
Publication date: 16/07/2007

We present the results of gravitational direct N-body simulations using the Graphics Processing Unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed for gaming computers. The force evaluation of the N-body problem is implemented in "Compute Unified Device Architecture" (CUDA) using the GPU to speed up the calculations. We tested the implementation on three different N-body codes: two direct N-body integration codes, using the 4th order predictor-corrector Hermite integrator with block time-steps, and one Barnes-Hut treecode, which uses a 2nd order leapfrog integration scheme. The integration of the equations of motion for all codes is performed on the host CPU. We find that for N > 512 particles the GPU outperforms the GRAPE-6Af, if some softening in the force calculation is accepted. Without softening and for very small integration time steps the GRAPE still outperforms the GPU. We conclude that modern GPUs offer an attractive alternative to GRAPE-6Af special purpose hardware. Using the same time-step criterion, the total energy of the N-body system was conserved better than to one in 10^6 on the GPU, only about an order of magnitude worse than obtained with GRAPE-6Af. For N ≳ 10^5 the 8800GTX outperforms the host CPU by a factor of about 100 and runs at about the same speed as the GRAPE-6Af. Comment: Accepted for publication in New Astronomy
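
To make "direct force evaluation with softening" concrete, here is a minimal CPU sketch in standard-library Python of the O(N^2) pairwise sum with a Plummer-style softening length. It is not the CUDA implementation the abstract describes; the function name `accelerations`, the parameter `eps`, and the choice G = 1 are assumptions for illustration.

```python
def accelerations(positions, masses, eps=0.0, G=1.0):
    """Direct-summation gravitational accelerations.

    positions: list of (x, y, z) tuples; masses: list of floats.
    eps is a softening length (assumed Plummer form): the squared
    separation r^2 is replaced by r^2 + eps^2, which removes the
    singularity when two bodies come very close.
    """
    n = len(positions)
    acc = []
    for i in range(n):
        xi, yi, zi = positions[i]
        ax = ay = az = 0.0
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = 1.0 / (r2 * r2 ** 0.5)  # 1 / (r^2 + eps^2)^(3/2)
            ax += G * masses[j] * dx * inv_r3
            ay += G * masses[j] * dy * inv_r3
            az += G * masses[j] * dz * inv_r3
        acc.append((ax, ay, az))
    return acc


if __name__ == "__main__":
    # Two unit masses two length units apart along the x axis.
    print(accelerations([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0]))
```

Each pass of the outer loop over `i` depends only on the input positions and masses, so the iterations are independent of one another; that independence is what makes the force evaluation a natural candidate for offloading to a data-parallel device.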

arXiv.org e-Print Archive

Crossref

Leiden University Scholary Publications

UvA-DARE

International Migration, Integration and Social Cohesion online publications