A pilgrimage to gravity on GPUs
In this short review we present the developments over the last 5 decades that
have led to the use of Graphics Processing Units (GPUs) for astrophysical
simulations. Since the introduction of NVIDIA's Compute Unified Device
Architecture (CUDA) in 2007 the GPU has become a valuable tool for N-body
simulations and is now so popular that almost all papers on high-precision
N-body simulations use GPU-accelerated methods. With GPU hardware becoming more
capable and being applied to more sophisticated algorithms such as gravitational
tree-codes, we see a bright future for GPU-like hardware in computational
astrophysics.
Comment: To appear in: European Physical Journal "Special Topics": "Computer Simulations on Graphics Processing Units". 18 pages, 8 figures.
Dynamical Processes in Globular Clusters
Globular clusters are among the most congested stellar systems in the
Universe. Internal dynamical evolution drives them toward states of high
central density, while simultaneously concentrating the most massive stars and
binary systems in their cores. As a result, these clusters are expected to be
sites of frequent close encounters and physical collisions between stars and
binaries, making them efficient factories for the production of interesting and
observable astrophysical exotica. I describe some elements of the competition
among stellar dynamics, stellar evolution, and other processes that control
globular cluster dynamics, with particular emphasis on pathways that may lead
to the formation of blue stragglers.
Comment: Chapter 10 in Ecology of Blue Straggler Stars, H.M.J. Boffin, G. Carraro & G. Beccari (Eds), Astrophysics and Space Science Library, Springer.
A sparse octree gravitational N-body code that runs entirely on the GPU processor
We present parallel algorithms for constructing and traversing sparse octrees
on graphics processing units (GPUs). The algorithms are based on parallel-scan
and sort methods. To test the performance and feasibility, we implemented them
in CUDA in the form of a gravitational tree-code which completely runs on the
GPU. (The code is publicly available at:
http://castle.strw.leidenuniv.nl/software.html) The tree-construction and
traversal algorithms are portable to many-core devices that support CUDA or
OpenCL. The gravitational tree-code outperforms tuned CPU code during tree
construction and shows an overall performance improvement of more than a factor
of 20, resulting in a processing rate of more than 2.8 million particles per
second.
Comment: Accepted version. Published in Journal of Computational Physics. 35 pages, 12 figures, single column.
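One common way to realize such a sort-based octree construction is to assign
each particle a Morton (Z-order) key by interleaving the bits of its quantized
coordinates; after a parallel sort on the keys, the particles of any octree
cell occupy a contiguous range of the array, so cells can be identified from
shared key prefixes. The sketch below illustrates only that key-and-sort idea,
serially in NumPy rather than in CUDA; it is not the published implementation,
and the function names are ours.

    import numpy as np

    def expand_bits(v):
        # Spread the lowest 10 bits of each uint32 so that two zero bits
        # separate consecutive input bits (..b1 b0 -> ..b1 0 0 b0).
        v = (v | (v << 16)) & 0x030000FF
        v = (v | (v << 8)) & 0x0300F00F
        v = (v | (v << 4)) & 0x030C30C3
        v = (v | (v << 2)) & 0x09249249
        return v

    def morton_keys(pos, lo, hi, bits=10):
        # Quantize positions onto a 2^bits grid inside the box [lo, hi] and
        # interleave the x, y, z bits into one 30-bit Morton key per particle.
        q = ((pos - lo) / (hi - lo) * (2**bits - 1)).astype(np.uint32)
        return (expand_bits(q[:, 0]) << 2) | (expand_bits(q[:, 1]) << 1) | expand_bits(q[:, 2])

    # Example: bring a random particle set into Morton (Z-curve) order.
    rng = np.random.default_rng(1)
    pos = rng.random((1000, 3))
    keys = morton_keys(pos, np.zeros(3), np.ones(3))
    order = np.argsort(keys)   # on the GPU this would be a parallel radix sort
    pos = pos[order]           # particles of one octree cell are now contiguous

Because every cell then maps to a contiguous slice of the sorted array, both
tree construction and traversal can be expressed as data-parallel operations
over the key list.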
NBODY6++GPU: Ready for the gravitational million-body problem
Accurate direct N-body simulations help to obtain detailed information
about the dynamical evolution of star clusters. They also enable comparisons
with analytical models and Fokker-Planck or Monte-Carlo methods. NBODY6 is a
well-known direct N-body code for star clusters, and NBODY6++ is the extended
version designed for large-particle-number simulations on supercomputers. We
present NBODY6++GPU, an optimized version of NBODY6++ with hybrid
parallelization methods (MPI, GPU, OpenMP, and AVX/SSE) to accelerate large
direct N-body simulations, and in particular to solve the million-body
problem. We discuss the new features of the NBODY6++GPU code, benchmarks, as
well as the first results from a simulation of a realistic globular cluster
initially containing a million particles. For million-body simulations,
NBODY6++GPU, running on 320 CPU cores and 32 NVIDIA K20X GPUs, is substantially
faster than NBODY6. With this computing cluster specification, the simulations of
million-body globular clusters including primordial binaries require
about an hour per half-mass crossing time.
Comment: 13 pages, 9 figures, 3 tables.
Numerical Simulations of the Dark Universe: State of the Art and the Next Decade
We present a review of the current state of the art of cosmological dark
matter simulations, with particular emphasis on the implications for dark
matter detection efforts and studies of dark energy. This review is intended
both for particle physicists, who may find the cosmological simulation
literature opaque or confusing, and for astrophysicists, who may not be
familiar with the role of simulations for observational and experimental probes
of dark matter and dark energy. Our work is complementary to the contribution
by M. Baldi in this issue, which focuses on the treatment of dark energy and
cosmic acceleration in dedicated N-body simulations. Truly massive dark
matter-only simulations are being conducted at national supercomputing centers,
employing from several billion to over half a trillion particles to simulate
the formation and evolution of cosmologically representative volumes (cosmic
scale) or to zoom in on individual halos (cluster and galactic scale). These
simulations cost millions of core-hours, require tens to hundreds of terabytes
of memory, and use up to petabytes of disk storage. The field is quite
internationally diverse, with top simulations having been run in China, France,
Germany, Korea, Spain, and the USA. Predictions from such simulations touch on
almost every aspect of dark matter and dark energy studies, and we give a
comprehensive overview of this connection. We also discuss the limitations of
the cold and collisionless DM-only approach, and describe in some detail
efforts to include different particle physics as well as baryonic physics in
cosmological galaxy formation simulations, including a discussion of recent
results highlighting how the distribution of dark matter in halos may be
altered. We end with an outlook for the next decade, presenting our view of how
the field can be expected to progress. (abridged)
Comment: 54 pages, 4 figures, 3 tables; invited contribution to the special issue "The next decade in Dark Matter and Dark Energy" of the new Open Access journal "Physics of the Dark Universe". Replaced with accepted version.
ASCR/HEP Exascale Requirements Review Report
This draft report summarizes and details the findings, results, and
recommendations derived from the ASCR/HEP Exascale Requirements Review meeting
held in June 2015. The main conclusions are as follows. 1) Larger, more
capable computing and data facilities are needed to support HEP science goals
in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of
the demand on the 2025 timescale is at least two orders of magnitude greater
than what is currently available, and in some cases more. 2) The growth rate of data
produced by simulations is overwhelming the current ability, of both facilities
and researchers, to store and analyze it. Additional resources and new
techniques for data analysis are urgently needed. 3) Data rates and volumes
from HEP experimental facilities are also straining the ability to store and
analyze large and complex data volumes. Appropriately configured
leadership-class facilities can play a transformational role in enabling
scientific discovery from these datasets. 4) A close integration of HPC
simulation and data analysis will aid greatly in interpreting results from HEP
experiments. Such an integration will minimize data movement and facilitate
interdependent workflows. 5) Long-range planning between HEP and ASCR will be
required to meet HEP's research needs. To best use ASCR HPC resources, the
experimental HEP program needs a) an established long-term plan for access to
ASCR computational and data resources, b) an ability to map workflows onto HPC
resources, c) the ability for ASCR facilities to accommodate workflows run by
collaborations that can have thousands of individual members, d) to transition
codes to the next-generation HPC platforms that will be available at ASCR
facilities, e) to build up and train a workforce capable of developing and
using simulations and analysis to support HEP scientific research on
next-generation systems.
Comment: 77 pages, 13 figures; draft report, subject to further revision.
A fully parallel, high precision, N-body code running on hybrid computing platforms
We present a new implementation of the numerical integration of the classical
gravitational N-body problem, based on a high-order Hermite integration scheme
with block time steps and a direct evaluation of the particle-particle forces.
The main innovation of this code (called HiGPUs) is its full parallelization,
exploiting both OpenMP and MPI for the multicore Central Processing Units as
well as either Compute Unified Device Architecture (CUDA) or OpenCL for the
hosted Graphics Processing Units. We tested both the performance and accuracy
of the code using up to 256 GPUs of the IBM iDataPlex DX360M3 Linux InfiniBand
cluster provided by the Italian supercomputing consortium CINECA, for values of
N up to 8 million. We were able to follow the evolution of a system of 8 million
bodies for a few crossing times, a task previously unreached by direct-summation
codes. The code is freely available to the scientific community.
Comment: Paper submitted to Journal of Computational Physics, consisting of 28 pages and 9 figures. The previously submitted version lacked the bibliography due to a TeX problem.
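As a rough illustration of the scheme named above, here is a minimal serial
sketch of one fourth-order Hermite predictor-corrector step with a direct,
softened evaluation of the pairwise accelerations and jerks. It uses a single
shared time step and NumPy on the CPU; the block time-step machinery, the GPU
offload, and all names here are simplifications for illustration, not the
HiGPUs code.

    import numpy as np

    def acc_jerk(pos, vel, mass, eps2=1e-6):
        # Direct O(N^2) sums (G = 1, softening eps):
        #   a_i    = sum_j m_j r_ij / (r_ij^2 + eps^2)^(3/2)
        #   adot_i = sum_j m_j [ v_ij / r^3 - 3 (r_ij . v_ij) r_ij / r^5 ]
        dr = pos[None, :, :] - pos[:, None, :]      # r_ij = r_j - r_i
        dv = vel[None, :, :] - vel[:, None, :]
        r2 = (dr * dr).sum(-1) + eps2
        inv_r3 = r2 ** -1.5
        np.fill_diagonal(inv_r3, 0.0)               # remove self-interaction
        rv = (dr * dv).sum(-1)
        m = mass[None, :, None]
        acc = (m * dr * inv_r3[:, :, None]).sum(1)
        jerk = (m * (dv - 3.0 * (rv / r2)[:, :, None] * dr) * inv_r3[:, :, None]).sum(1)
        return acc, jerk

    def hermite_step(pos, vel, mass, dt):
        # One shared-timestep, fourth-order Hermite predictor-corrector step.
        a0, j0 = acc_jerk(pos, vel, mass)
        # Predictor: Taylor expansion using the current acceleration and jerk.
        xp = pos + vel * dt + a0 * dt**2 / 2 + j0 * dt**3 / 6
        vp = vel + a0 * dt + j0 * dt**2 / 2
        # Re-evaluate forces at the predicted state.
        a1, j1 = acc_jerk(xp, vp, mass)
        # Corrector: estimate the 2nd and 3rd derivatives of the acceleration.
        a2 = (-6 * (a0 - a1) - dt * (4 * j0 + 2 * j1)) / dt**2
        a3 = (12 * (a0 - a1) + 6 * dt * (j0 + j1)) / dt**3
        vel_new = vp + a2 * dt**3 / 6 + a3 * dt**4 / 24
        pos_new = xp + a2 * dt**4 / 24 + a3 * dt**5 / 120
        return pos_new, vel_new

For large N the two force evaluations per step dominate the cost, and it is
these particle-particle sums that, per the abstract, HiGPUs distributes across
nodes with MPI/OpenMP and offloads to the GPUs.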