A pilgrimage to gravity on GPUs
In this short review we present the developments over the last 5 decades that
have led to the use of Graphics Processing Units (GPUs) for astrophysical
simulations. Since the introduction of NVIDIA's Compute Unified Device
Architecture (CUDA) in 2007 the GPU has become a valuable tool for N-body
simulations and is so popular these days that almost all papers about
high-precision N-body simulations use methods that are accelerated by GPUs. With
GPU hardware becoming more advanced and being used for more advanced algorithms
like gravitational tree-codes, we see a bright future for GPU-like hardware in
computational astrophysics.
Comment: To appear in: European Physical Journal "Special Topics": "Computer
Simulations on Graphics Processing Units". 18 pages, 8 figures
4.45 Pflops Astrophysical N-Body Simulation on K computer -- The Gravitational Trillion-Body Problem
As an entry for the 2012 Gordon-Bell performance prize, we report performance
results of astrophysical N-body simulations of one trillion particles performed
on the full system of K computer. This is the first gravitational trillion-body
simulation in the world. We describe the scientific motivation, the numerical
algorithm, the parallelization strategy, and the performance analysis. Unlike
many previous Gordon-Bell prize winners that used the tree algorithm for
astrophysical N-body simulations, we used the hybrid TreePM method, which, for a
similar level of accuracy, calculates the short-range force with the tree
algorithm and solves the long-range force with the particle-mesh algorithm.
We developed a highly-tuned gravity kernel for short-range forces, and a novel
communication algorithm for long-range forces. The average performance on 24576
and 82944 nodes of K computer is 1.53 and 4.45 Pflops, respectively,
corresponding to 49% and 42% of the peak speed.
Comment: 10 pages, 6 figures, Proceedings of Supercomputing 2012
(http://sc12.supercomputing.org/), Gordon Bell Prize Winner. Additional
information: http://www.ccs.tsukuba.ac.jp/CCS/eng/gbp201
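The short-range/long-range split that TreePM methods rely on can be illustrated with a minimal sketch. It assumes a Gaussian (erfc-based) splitting of the Newtonian force, a common choice used for example in GADGET-2; the abstract above does not say which cutoff function the authors used, so the functions and the splitting scale `r_s` here are illustrative only.

```python
import math


def short_range_factor(r: float, r_s: float) -> float:
    """Fraction of the Newtonian pair force handled by the tree part.

    Assumed Gaussian split with splitting scale r_s (hypothetical
    parameter name); not necessarily the cutoff used in the paper.
    """
    u = r / (2.0 * r_s)
    return math.erfc(u) + (2.0 * u / math.sqrt(math.pi)) * math.exp(-u * u)


def long_range_factor(r: float, r_s: float) -> float:
    """Remaining fraction, left to the particle-mesh part."""
    return 1.0 - short_range_factor(r, r_s)


def newtonian_force(m1: float, m2: float, r: float, G: float = 1.0) -> float:
    """Magnitude of the unsplit Newtonian pair force."""
    return G * m1 * m2 / (r * r)


if __name__ == "__main__":
    r_s = 1.0
    for r in (0.01, 0.5, 1.0, 3.0, 10.0):
        f = newtonian_force(1.0, 1.0, r)
        print(r, f * short_range_factor(r, r_s), f * long_range_factor(r, r_s))
```

At separations much smaller than `r_s` nearly all of the force comes from the short-range term, and at separations of several `r_s` the short-range term is negligible, which is what allows the tree walk to be truncated at a finite radius.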
PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations
We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable
multi-purpose computer for many-body simulations. The main difference between
PROGRAPE-1 and "traditional" GRAPE systems is that the former uses FPGA (Field
Programmable Gate Array) chips as the processing elements, while the latter
rely on the hardwired pipeline processor specialized to gravitational
interactions. Since the logic implemented in FPGA chips can be reconfigured, we
can use PROGRAPE-1 to calculate not only gravitational interactions but also
other forms of interactions such as van der Waals force, hydrodynamical
interactions in SPH calculation and so on. PROGRAPE-1 comprises two Altera
EPF10K100 FPGA chips, each of which contains nominally 100,000 gates. To
evaluate the programmability and performance of PROGRAPE-1, we implemented a
pipeline for gravitational interaction similar to that of GRAPE-3. One pipeline
fitted into a single FPGA chip, which operated at a 16 MHz clock. Thus, for
gravitational interaction, PROGRAPE-1 provided a speed of 0.96
Gflops-equivalent. PROGRAPE will prove to be useful for a wide range of
particle-based simulations in which the calculation cost of interactions other
than gravity is high, such as the evaluation of SPH interactions.
Comment: 20 pages with 9 figures; submitted to PAS
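The quoted 0.96 Gflops-equivalent can be related to the 16 MHz clock with a short back-of-the-envelope calculation. The sketch below assumes that both FPGA chips each hold one pipeline and that each pipeline completes one pairwise interaction per clock cycle; neither assumption is stated in the abstract, and the function name is illustrative.

```python
CLOCK_HZ = 16e6        # clock frequency quoted in the abstract
N_PIPELINES = 2        # assumption: one pipeline on each of the two FPGA chips
QUOTED_FLOPS = 0.96e9  # "0.96 Gflops-equivalent" quoted in the abstract


def implied_ops_per_interaction(flops: float, clock_hz: float, n_pipelines: int) -> float:
    """Operations credited to one pairwise interaction.

    Assumes every pipeline finishes one interaction per clock cycle.
    """
    interactions_per_second = clock_hz * n_pipelines
    return flops / interactions_per_second


if __name__ == "__main__":
    print(implied_ops_per_interaction(QUOTED_FLOPS, CLOCK_HZ, N_PIPELINES))
```

Under these assumptions the quoted speed works out to 30 operations per pairwise interaction, which is why the figure is described as "Gflops-equivalent" rather than as a measured floating-point rate.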
SPH Simulations with Reconfigurable Hardware Accelerator
We present a novel approach to accelerate astrophysical hydrodynamical
simulations. In astrophysical many-body simulations, the GRAPE (GRAvity piPE)
system has been widely used by many researchers. However, the function of GRAPE
systems is completely fixed because a specially developed LSI is used as the
computing engine. Instead of using such an LSI, we are developing a
special-purpose computing system using Field Programmable Gate Array (FPGA)
chips as the computing engine. Together with a programming system we developed,
we have implemented computing pipelines for the Smoothed Particle Hydrodynamics
(SPH) method on our PROGRAPE-3 system. The SPH pipelines running on the
PROGRAPE-3 system have a peak speed of 85 GFLOPS, and in a realistic setup the
SPH calculation using one PROGRAPE-3 board is 5-10 times faster than the
calculation on the host computer. Our results clearly show for the first time
that we can accelerate SPH simulations of simple astrophysical phenomena
using the considerable computing power offered by the hardware.
Comment: 27 pages, 13 figures, submitted to PAS
The Living Application: a Self-Organising System for Complex Grid Tasks
We present the living application, a method to autonomously manage
applications on the grid. During its execution on the grid, the living
application makes choices on the resources to use in order to complete its
tasks. These choices can be based on the internal state, or on autonomously
acquired knowledge from external sensors. By giving limited user capabilities
to a living application, the living application is able to port itself from one
resource topology to another. The application performs these actions at
run-time without depending on users or external workflow tools. We demonstrate
this new concept in a special case of a living application: the living
simulation. Today, many simulations require a wide range of numerical solvers
and run most efficiently if specialized nodes are matched to the solvers. The
idea of the living simulation is that it decides itself which grid machines to
use based on the numerical solver currently in use. In this paper we apply the
living simulation to modelling the collision between two galaxies in a test
setup with two specialized computers. This simulation switches at run-time
between a GPU-enabled computer in the Netherlands and a GRAPE-enabled machine
that resides in the United States, using an oct-tree N-body code whenever it
runs in the Netherlands and a direct N-body solver in the United States.
Comment: 26 pages, 3 figures, accepted by IJHPC
SAPPORO: A way to turn your graphics cards into a GRAPE-6
We present Sapporo, a library for performing high-precision gravitational
N-body simulations on NVIDIA Graphical Processing Units (GPUs). Our library
mimics the GRAPE-6 library, and N-body codes currently running on GRAPE-6 can
switch to Sapporo by a simple relinking of the library. The precision of our
library is comparable to that of GRAPE-6, even though internally the GPU
hardware is limited to single-precision arithmetic. This limitation is
effectively overcome by emulating double precision for calculating the distance
between particles. The performance loss of this operation is small (< 20%)
compared to the advantage of being able to run at high precision. We tested the
library using several GRAPE-6-enabled N-body codes, in particular with Starlab
and phiGRAPE. We measured peak performance of 800 Gflop/s for running with 10^6
particles on a PC with four commercial G92 architecture GPUs (two GeForce
9800GX2). As a production test, we simulated a 32k Plummer model with equal
mass stars well beyond core collapse. The simulation took 41 days, during which
the mean performance was 113 Gflop/s. The GPU did not show any problems from
running in a production environment for such an extended period of time.
Comment: 13 pages, 9 figures, accepted to New Astronom
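The idea of emulating double precision for the inter-particle distance can be sketched as follows. This is a minimal illustration, assuming each coordinate is stored as a pair of single-precision values (a high part and a low-order correction) and that the difference is formed part by part; it is not taken from the Sapporo source, and all function names are hypothetical. Single precision is simulated on the CPU by rounding through `struct`.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def split(x: float) -> tuple[float, float]:
    """Represent x as two singles: a high part and a low-order correction."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo


def diff_single(xi: float, xj: float) -> float:
    """Separation computed from plain single-precision coordinates."""
    return f32(f32(xj) - f32(xi))


def diff_two_float(xi: float, xj: float) -> float:
    """Separation computed from (hi, lo) pairs using only single-precision ops."""
    xi_hi, xi_lo = split(xi)
    xj_hi, xj_lo = split(xj)
    d_hi = f32(xj_hi - xi_hi)  # large parts mostly cancel for close pairs
    d_lo = f32(xj_lo - xi_lo)  # low-order parts keep the small separation
    return f32(d_hi + d_lo)


if __name__ == "__main__":
    xi, xj = 1.0, 1.0 + 1e-9  # two particles much closer than single-precision spacing
    print(diff_single(xi, xj), diff_two_float(xi, xj))
```

For two particles separated by far less than the single-precision spacing at their position, the plain single-precision difference collapses to zero, while the two-part difference recovers the separation. Close encounters of this kind are where high-precision N-body integration is most sensitive.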