Direct N-body Simulations
Special high-accuracy direct force summation N-body algorithms and their
relevance for the simulation of the dynamical evolution of star clusters and
other gravitating N-body systems in astrophysics are presented, explained, and
compared with other methods. "Other methods" here means approximate physical
models based on the Fokker-Planck equation, as well as other approximate
algorithms to compute the gravitational potential in N-body systems. Questions
regarding the parallel implementation of direct ``brute force'' N-body codes
are discussed. The astrophysical application of the models to the theory of
relaxing rotating and non-rotating collisional star clusters is presented,
briefly mentioning the questions of the validity of the Fokker-Planck
approximation, the existence of gravothermal oscillations and of rotation and
primordial binaries.
Comment: 32 pages, 13 figures, in press in Riffert, H., Werner K. (eds),
Computational Astrophysics, The Journal of Computational and Applied
Mathematics (JCAM), Elsevier Press, Amsterdam, 199
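The high-accuracy direct summation this abstract refers to evaluates every pairwise gravitational force exactly. A minimal sketch of the core O(N^2) loop, in illustrative Python (all names are placeholders; a production code such as those discussed here would add a Hermite integrator, block time steps, and hardware-specific tuning):

```python
import numpy as np

def direct_accelerations(pos, mass, eps=1e-3):
    """Exact O(N^2) pairwise gravitational accelerations (units with G = 1).

    pos  : (N, 3) array of positions
    mass : (N,) array of masses
    eps  : Plummer softening length (set to 0 for truly exact forces)
    """
    n = len(mass)
    acc = np.zeros_like(pos)
    for i in range(n):
        d = pos - pos[i]                   # vectors from particle i to all j
        r2 = (d * d).sum(axis=1) + eps**2  # softened squared distances
        r2[i] = 1.0                        # placeholder to avoid divide-by-zero
        inv_r3 = r2 ** -1.5
        inv_r3[i] = 0.0                    # remove the self-interaction term
        acc[i] = (mass[:, None] * d * inv_r3[:, None]).sum(axis=0)
    return acc
```

Because every pair is summed explicitly, the cost grows quadratically with N, which is exactly why the parallel "brute force" implementations discussed in the text matter.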
Parallel Deterministic and Stochastic Global Minimization of Functions with Very Many Minima
The optimization of three problems with high dimensionality and many local minima is investigated
under five different optimization algorithms: DIRECT, simulated annealing, Spall's SPSA algorithm, the KNITRO
package, and QNSTOP, a new algorithm developed at Indiana University.
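Of the algorithms named, SPSA is particularly compact: it estimates the gradient from only two function evaluations per iteration, regardless of dimension, by perturbing all coordinates simultaneously. A minimal sketch under assumed default gain sequences (the function name and parameters are illustrative, not taken from the paper):

```python
import numpy as np

def spsa_minimize(f, x0, a=0.1, c=0.1, alpha=0.602, gamma=0.101,
                  iters=2000, seed=0):
    """Minimal SPSA: two evaluations of f per step, any dimension.

    Gain sequences a_k = a/k^alpha and c_k = c/k^gamma follow the
    commonly recommended exponents; tuning is problem-dependent.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(1, iters + 1):
        ak = a / k**alpha
        ck = c / k**gamma
        delta = rng.choice([-1.0, 1.0], size=x.shape)  # Rademacher perturbation
        # Simultaneous-perturbation gradient estimate (1/delta == delta here)
        ghat = (f(x + ck * delta) - f(x - ck * delta)) / (2 * ck) * delta
        x = x - ak * ghat
    return x
```

On a smooth test function the iterates contract toward a local minimum; for the multimodal problems studied in the paper, restarts or an annealing-style schedule would be needed to escape poor basins.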
A sparse octree gravitational N-body code that runs entirely on the GPU processor
We present parallel algorithms for constructing and traversing sparse octrees
on graphics processing units (GPUs). The algorithms are based on parallel-scan
and sort methods. To test the performance and feasibility, we implemented them
in CUDA in the form of a gravitational tree-code which completely runs on the
GPU. (The code is publicly available at:
http://castle.strw.leidenuniv.nl/software.html) The tree construction and
traverse algorithms are portable to many-core devices which have support for
CUDA or OpenCL programming languages. The gravitational tree-code outperforms
tuned CPU code during the tree-construction and shows a performance improvement
of more than a factor 20 overall, resulting in a processing rate of more than
2.8 million particles per second.
Comment: Accepted version. Published in Journal of Computational Physics. 35
pages, 12 figures, single column.
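The sort-based construction the abstract mentions typically starts by assigning each particle a space-filling-curve key, so that a single parallel sort makes all particles belonging to the same octree cell contiguous in memory. A sketch of the Morton (Z-order) key step in illustrative Python (the GPU code would do this with bit tricks in CUDA; names here are assumptions):

```python
import numpy as np

def morton_keys(pos, bits=10):
    """30-bit Morton keys for positions normalized to the unit cube [0, 1).

    Interleaves the top `bits` bits of the quantized x, y, z coordinates;
    sorting by these keys groups particles by octree cell at every level.
    """
    scale = 1 << bits
    q = np.clip((pos * scale).astype(np.uint64), 0, scale - 1)
    keys = np.zeros(len(pos), dtype=np.uint64)
    for b in range(bits):
        for axis in range(3):
            bit = (q[:, axis] >> np.uint64(b)) & np.uint64(1)
            keys |= bit << np.uint64(3 * b + axis)
    return keys
```

After sorting by key, cell boundaries at each tree level fall where the leading key digits change, which is what makes scan/sort primitives sufficient to build the whole tree on the GPU.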
Performance analysis of direct N-body algorithms for astrophysical simulations on distributed systems
We discuss the performance of direct summation codes used in the simulation
of astrophysical stellar systems on highly distributed architectures. These
codes compute the gravitational interaction among stars in an exact way and
have an O(N^2) scaling with the number of particles. They can be applied to a
variety of astrophysical problems, like the evolution of star clusters, the
dynamics of black holes, the formation of planetary systems, and cosmological
simulations. The simulation of realistic star clusters with sufficiently high
accuracy cannot be performed on a single workstation but may be possible on
parallel computers or grids. We have implemented two parallel schemes for a
direct N-body code and we study their performance on general purpose parallel
computers and large computational grids. We present the results of timing
analyses conducted on the different architectures and compare them with the
predictions from theoretical models. We conclude that the simulation of star
clusters with up to a million particles will be possible on large distributed
computers in the next decade. Simulating entire galaxies, however, will in
addition require new hybrid methods to speed up the calculation.
Comment: 22 pages, 8 figures, accepted for publication in Parallel Computing.
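The kind of theoretical timing model the abstract compares against can be quite simple: computation scales as N^2/p while communication grows with the number of processors. A toy model in this spirit, with placeholder constants (not the paper's measured values), for a ring-style parallel scheme:

```python
def predicted_time(n, p, t_force=1e-8, t_comm=1e-6, t_lat=1e-4):
    """Illustrative per-step timing model for a ring-scheme direct N-body code.

    n       : number of particles
    p       : number of processors
    t_force : time per pairwise force evaluation (placeholder)
    t_comm  : time per particle shipped between neighbors (placeholder)
    t_lat   : per-message latency (placeholder)
    """
    compute = t_force * n * n / p               # each CPU does n^2/p forces
    comm = (p - 1) * (t_lat + t_comm * n / p)   # n/p particles sent p-1 times
    return compute + comm
```

The model reproduces the qualitative behavior discussed in the text: speedup is near-linear while the N^2/p term dominates, but on latency-bound grids adding processors eventually makes a step slower, so small-N runs gain little from large machines.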
A modified parallel tree code for N-body simulation of the Large Scale Structure of the Universe
N-body codes to perform simulations of the origin and evolution of the Large
Scale Structure of the Universe have improved significantly over the past
decade both in terms of the resolution achieved and of reduction of the CPU
time. However, state-of-the-art N-body codes hardly allow one to deal with
particle numbers larger than a few 10^7, even on the largest parallel systems.
In order to allow simulations with larger resolution, we have first
re-considered the grouping strategy as described in Barnes (1990) (hereafter
B90) and applied it with some modifications to our WDSH-PT (Work and Data
SHaring - Parallel Tree) code. In the first part of this paper we will give a
short description of the code adopting the Barnes and Hut algorithm
\cite{barh86} (hereafter BH), and in particular of the memory and work
distribution strategy applied to describe the {\it data distribution} on a
CC-NUMA machine like the CRAY-T3E system. In the second part of the paper we
describe the modification to the Barnes grouping strategy we have devised to
improve the performance of the WDSH-PT code. We exploit the property that
nearby particles have similar interaction lists. This idea was verified in
B90, where an interaction list is built that applies everywhere within a
cell C_{group} containing a small number of particles N_{crit}; B90 then
reuses this interaction list for each particle in the cell in turn.
We instead assume that every particle p in the cell has the same interaction list.
This has made it possible to reduce the CPU time and improve performance,
allowing us to run simulations with a large number of particles (N ~
10^7-10^9) in non-prohibitive times.
Comment: 13 pages and 7 Figures.
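The grouping idea can be made concrete: a tree node is accepted into the shared list only if it is well separated from every point of the group cell, i.e. the opening criterion is tested against the group boundary rather than a single particle. A minimal sketch with a toy node type (all names are illustrative, not the WDSH-PT data structures):

```python
from dataclasses import dataclass, field
import math

@dataclass
class Node:
    center: tuple
    size: float
    children: list = field(default_factory=list)  # empty list => leaf

def shared_interaction_list(root, g_center, g_radius, theta=0.5):
    """B90-style grouping: build one interaction list valid for every
    particle inside a group cell of radius g_radius around g_center.

    A node is accepted if size/d < theta, where d is the distance from the
    node to the NEAREST point of the group cell; otherwise it is opened.
    """
    out = []

    def visit(node):
        d = math.dist(node.center, g_center) - g_radius
        if d > 0 and node.size / d < theta:
            out.append(node)      # one entry serves all particles in the group
        elif node.children:
            for child in node.children:
                visit(child)      # too close: open the node and recurse
        else:
            out.append(node)      # leaf: handled by direct summation
    visit(root)
    return out
```

Building the list once per group cell and reusing it for each of the up to N_{crit} particles amortizes the tree walk, which is the source of the CPU-time reduction described above.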