
    Direct N-body Simulations

    Special high-accuracy direct force summation N-body algorithms are presented, explained, and compared with other methods, with emphasis on their relevance for simulating the dynamical evolution of star clusters and other gravitating N-body systems in astrophysics. The other methods considered are approximate physical models based on the Fokker-Planck equation and other approximate algorithms for computing the gravitational potential in N-body systems. Questions regarding the parallel implementation of direct "brute force" N-body codes are discussed. The astrophysical application of the models to the theory of relaxing rotating and non-rotating collisional star clusters is presented, briefly mentioning the validity of the Fokker-Planck approximation, the existence of gravothermal oscillations, and questions of rotation and primordial binaries. Comment: 32 pages, 13 figures, in press in Riffert, H., Werner K. (eds), Computational Astrophysics, The Journal of Computational and Applied Mathematics (JCAM), Elsevier Press, Amsterdam, 199

    Parallel Deterministic and Stochastic Global Minimization of Functions with Very Many Minima

    The optimization of three problems with high dimensionality and many local minima is investigated under five different optimization algorithms: DIRECT, simulated annealing, Spall’s SPSA algorithm, the KNITRO package, and QNSTOP, a new algorithm developed at Indiana University

    A sparse octree gravitational N-body code that runs entirely on the GPU processor

    We present parallel algorithms for constructing and traversing sparse octrees on graphics processing units (GPUs). The algorithms are based on parallel-scan and sort methods. To test their performance and feasibility, we implemented them in CUDA in the form of a gravitational tree-code that runs entirely on the GPU. (The code is publicly available at: http://castle.strw.leidenuniv.nl/software.html) The tree construction and traversal algorithms are portable to many-core devices that support the CUDA or OpenCL programming languages. The gravitational tree-code outperforms tuned CPU code during tree construction and shows an overall performance improvement of more than a factor of 20, resulting in a processing rate of more than 2.8 million particles per second. Comment: Accepted version. Published in Journal of Computational Physics. 35 pages, 12 figures, single colum
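
    The sort-based construction mentioned in this abstract can be illustrated with a small sequential sketch. The abstract does not say which sort key the authors use, so the Morton (Z-order) key below is an assumption, as are the helper names `spread_bits`, `morton_key`, `quantize`, `sort_by_morton`, `cell_of`, and the 10-bit grid resolution. The idea shown is that sorting particles by an interleaved-bit key places particles of the same octree cell next to each other, so cells can be read off from shared key prefixes.

```python
BITS = 10  # assumed grid resolution: 2**BITS cells per axis


def spread_bits(v, bits=BITS):
    """Move bit i of v to bit 3*i, leaving two zero bits between neighbours."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (3 * i)
    return out


def morton_key(ix, iy, iz, bits=BITS):
    """Interleave the bits of three integer grid coordinates (x lowest)."""
    return (spread_bits(ix, bits)
            | (spread_bits(iy, bits) << 1)
            | (spread_bits(iz, bits) << 2))


def quantize(x, bits=BITS):
    """Map a coordinate in [0, 1) to an integer grid index, clamped to the grid."""
    n = 1 << bits
    return min(max(int(x * n), 0), n - 1)


def sort_by_morton(points, bits=BITS):
    """Return (particle indices in key order, key of each particle)."""
    keys = []
    for x, y, z in points:
        keys.append(morton_key(quantize(x, bits), quantize(y, bits),
                               quantize(z, bits), bits))
    # A GPU code would use a parallel sort here; sorted() is a stand-in.
    order = sorted(range(len(points)), key=keys.__getitem__)
    return order, keys


def cell_of(key, level, bits=BITS):
    """Octree cell id at a given level: the top 3*level bits of the key."""
    return key >> (3 * (bits - level))
```

    For example, `sort_by_morton([(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.12, 0.1, 0.1)])` orders the two nearby particles first, and both report the same `cell_of(key, 1)`.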

    Performance analysis of direct N-body algorithms for astrophysical simulations on distributed systems

    We discuss the performance of direct summation codes used in the simulation of astrophysical stellar systems on highly distributed architectures. These codes compute the gravitational interaction among stars exactly and scale as O(N^2) with the number of particles. They can be applied to a variety of astrophysical problems, such as the evolution of star clusters, the dynamics of black holes, the formation of planetary systems, and cosmological simulations. The simulation of realistic star clusters with sufficiently high accuracy cannot be performed on a single workstation but may be possible on parallel computers or grids. We have implemented two parallel schemes for a direct N-body code and study their performance on general-purpose parallel computers and large computational grids. We present the results of timing analyses conducted on the different architectures and compare them with the predictions of theoretical models. We conclude that the simulation of star clusters with up to a million particles will be possible on large distributed computers in the next decade. Simulating entire galaxies, however, will in addition require new hybrid methods to speed up the calculation. Comment: 22 pages, 8 figures, accepted for publication in Parallel Computin
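
    As a minimal illustration of the O(N^2) scaling described in this abstract, the sketch below sums pairwise Newtonian accelerations directly. It is not the authors' code: the function name `direct_accelerations`, the unit gravitational constant, and the optional softening length `eps` are assumptions made for the example, and the time integration that a real direct N-body code needs is left out.

```python
import math


def direct_accelerations(pos, mass, G=1.0, eps=0.0):
    """Accelerations from direct pairwise summation.

    pos  : list of (x, y, z) positions
    mass : list of masses, same length as pos
    eps  : softening length (assumed parameter; 0 gives pure Newtonian gravity)
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    # Visit each of the N(N-1)/2 pairs once and apply equal and opposite
    # forces; the cost still grows as O(N^2).
    for i in range(n):
        for j in range(i + 1, n):
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * dx[k] * inv_r3
                acc[j][k] -= G * mass[i] * dx[k] * inv_r3
    return acc
```

    Because every pair is visited, doubling N roughly quadruples the work, which is the cost that motivates the parallel schemes discussed in the abstract.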

    A modified parallel tree code for N-body simulation of the Large Scale Structure of the Universe

    N-body codes for simulating the origin and evolution of the Large Scale Structure of the Universe have improved significantly over the past decade, both in the resolution achieved and in the reduction of CPU time. However, state-of-the-art N-body codes can hardly handle particle numbers larger than a few times 10^7, even on the largest parallel systems. To allow simulations with higher resolution, we have first reconsidered the grouping strategy described in Barnes (1990) (hereafter B90) and applied it with some modifications to our WDSH-PT (Work and Data SHaring - Parallel Tree) code. In the first part of this paper we give a short description of the code, which adopts the Barnes and Hut algorithm (hereafter BH), and in particular of the memory and work distribution strategy applied to describe the data distribution on a CC-NUMA machine like the CRAY-T3E system. In the second part of the paper we describe the modification to the Barnes grouping strategy that we have devised to improve the performance of the WDSH-PT code. We use the property that nearby particles have similar interaction lists. This idea was checked in B90, where an interaction list is built that applies everywhere within a cell C_{group} containing a small number of particles N_{crit}. B90 reuses this interaction list for each particle p ∈ C_{group} in the cell in turn. We assume each particle p to have the same interaction list. This has reduced the CPU time and improved performance, allowing us to run simulations with a large number of particles (N ~ 10^7/10^9) in non-prohibitive times. Comment: 13 pages and 7 Figure
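
    The grouping idea summarized in this abstract can be sketched as a single acceptance test applied once per group instead of once per particle. The abstract does not give the exact criterion, so the version below is an assumption: a cell is accepted when its size divided by its distance to the nearest point of a sphere enclosing the group is below an opening angle `theta`. The names `accept_for_group` and `shared_interaction_list`, the bounding sphere, and the flat list of candidate cells are hypothetical simplifications; a real tree code would descend into the rejected cells.

```python
import math


def accept_for_group(cell_center, cell_size, group_center, group_radius, theta=0.5):
    """Opening test that is valid for every point inside the group sphere.

    The distance is measured from the cell centre to the nearest point of
    the group sphere, so the test is conservative for all group members.
    """
    d = math.dist(cell_center, group_center) - group_radius
    return d > 0.0 and cell_size / d < theta


def shared_interaction_list(cells, group_center, group_radius, theta=0.5):
    """Split candidate cells into those usable by the whole group and the rest.

    cells is a list of (center, size) pairs. Accepted cells form the
    interaction list shared by all particles in the group; the others
    would have to be opened (replaced by their children) in a tree code.
    """
    accepted, to_open = [], []
    for center, size in cells:
        if accept_for_group(center, size, group_center, group_radius, theta):
            accepted.append((center, size))
        else:
            to_open.append((center, size))
    return accepted, to_open
```

    With one list per group, the cost of building interaction lists is shared among the group's particles, which is the source of the CPU-time saving the abstract describes.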