249 research outputs found

    High Performance Direct Gravitational N-body Simulations on Graphics Processing Units -- II: An implementation in CUDA

    We present the results of gravitational direct $N$-body simulations using the Graphics Processing Unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed for gaming computers. The force evaluation of the $N$-body problem is implemented in "Compute Unified Device Architecture" (CUDA) using the GPU to speed up the calculations. We tested the implementation on three different $N$-body codes: two direct $N$-body integration codes, using the 4th order predictor-corrector Hermite integrator with block time-steps, and one Barnes-Hut treecode, which uses a 2nd order leapfrog integration scheme. The integration of the equations of motion for all codes is performed on the host CPU. We find that for $N > 512$ particles the GPU outperforms the GRAPE-6Af, if some softening in the force calculation is accepted. Without softening and for very small integration time steps the GRAPE still outperforms the GPU. We conclude that modern GPUs offer an attractive alternative to GRAPE-6Af special purpose hardware. Using the same time-step criterion, the total energy of the $N$-body system was conserved to better than one part in $10^6$ on the GPU, only about an order of magnitude worse than obtained with GRAPE-6Af. For $N \gtrsim 10^5$ the 8800GTX outperforms the host CPU by a factor of about 100 and runs at about the same speed as the GRAPE-6Af. Comment: Accepted for publication in New Astronomy
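
The softened force evaluation that this abstract offloads to the GPU can be illustrated with a minimal CPU sketch. It assumes standard Plummer softening and a plain O(N^2) double loop; the function name `softened_accelerations`, the parameter `eps`, and the choice of units with G = 1 are illustrative assumptions, not the authors' CUDA kernel.

```python
import math


def softened_accelerations(positions, masses, eps, G=1.0):
    """Direct-sum gravitational accelerations with Plummer softening.

    Hypothetical helper for illustration only: positions is a list of
    (x, y, z) tuples, masses a list of floats, eps the softening length.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            # Plummer softening: r^2 is replaced by r^2 + eps^2
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            acc[i][0] += G * masses[j] * dx * inv_r3
            acc[i][1] += G * masses[j] * dy * inv_r3
            acc[i][2] += G * masses[j] * dz * inv_r3
    return acc
```

With `eps = 0` and two distinct particles this reduces to the unsoftened Newtonian acceleration; a nonzero `eps` keeps the pairwise term finite as two particles approach each other.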

    Optimal softening for force calculations in collisionless N-body simulations

    In N-body simulations the force calculated between particles representing a given mass distribution is usually softened, to diminish the effect of graininess. In this paper we study the effect of such a smoothing, with the aim of finding an optimal value of the softening parameter. As already shown by Merritt (1996), for too small a softening the estimates of the forces will be too noisy, while for too large a softening the force estimates are systematically misrepresented. In between there is an optimal softening, for which the forces in the configuration approach best the true forces. The value of this optimal softening depends both on the mass distribution and on the number of particles used to represent it. For higher number of particles the optimal softening is smaller. More concentrated mass distributions necessitate smaller softening, but the softened forces are never as good an approximation of the true forces as for not centrally concentrated configurations. We give good estimates of the optimal softening for homogeneous spheres, Plummer spheres, and Dehnen spheres. We also give a rough estimate of this quantity for other mass distributions, based on the harmonic mean distance to the $k$th neighbour ($k = 1, \ldots, 12$), the mean being taken over all particles in the configuration. Comparing homogeneous Ferrers ellipsoids of different shapes we show that the axial ratios do not influence the value of the optimal softening. Finally we compare two different types of softening, a spline softening (Hernquist & Katz 1989) and a generalisation of the standard Plummer softening to higher values of the exponent. We find that the spline softening fares roughly as well as the higher powers of the power-law softening and both give a better representation of the forces than the standard Plummer softening. Comment: 16 pages Latex, 19 figures, accepted for publication in MNRAS, corrected typos, minor changes mainly in sec.
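
The quantity this abstract bases its rough estimate on, the harmonic mean over all particles of the distance to the $k$th neighbour, can be sketched as follows. This is a brute-force illustration under stated assumptions: the function name is hypothetical, and the abstract does not say how the result is converted into a softening length, so the sketch stops at the mean distance itself.

```python
import math


def harmonic_mean_kth_neighbour_distance(positions, k):
    """Harmonic mean, over all particles, of the distance to the k-th
    nearest neighbour. Hypothetical helper; O(N^2 log N) brute force.
    Assumes no two particles coincide (a zero distance would divide by zero).
    """
    n = len(positions)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of particles")
    inverse_sum = 0.0
    for i, p in enumerate(positions):
        # Sorted distances from particle i to every other particle
        dists = sorted(math.dist(p, q) for j, q in enumerate(positions) if j != i)
        inverse_sum += 1.0 / dists[k - 1]
    return n / inverse_sum
```

For three collinear particles at x = 0, 1 and 3, the nearest-neighbour distances are 1, 1 and 2, so `k = 1` gives 3 / (1 + 1 + 0.5) = 1.2.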

    Improvements to the APBS biomolecular solvation software suite

    The Adaptive Poisson-Boltzmann Solver (APBS) software was developed to solve the equations of continuum electrostatics for large biomolecular assemblages and has had an impact on the study of a broad range of chemical, biological, and biomedical applications. APBS addresses three key technology challenges for understanding solvation and electrostatics in biomedical applications: accurate and efficient models for biomolecular solvation and electrostatics, robust and scalable software for applying those theories to biomolecular systems, and mechanisms for sharing and analyzing biomolecular electrostatics data in the scientific community. To address new research applications and advancing computational capabilities, we have continually updated APBS and its suite of accompanying software since its release in 2001. In this manuscript, we discuss the models and capabilities that have recently been implemented within the APBS software package including: a Poisson-Boltzmann analytical and a semi-analytical solver, an optimized boundary element solver, a geometry-based geometric flow solvation model, a graph theory based algorithm for determining p$K_a$ values, and an improved web-based visualization tool for viewing electrostatics

    2HOT: An Improved Parallel Hashed Oct-Tree N-Body Algorithm for Cosmological Simulation

    We report on improvements made over the past two decades to our adaptive treecode N-body method (HOT). A mathematical and computational approach to the cosmological N-body problem is described, with performance and scalability measured up to 256k ($2^{18}$) processors. We present error analysis and scientific application results from a series of more than ten 69 billion ($4096^3$) particle cosmological simulations, accounting for $4 \times 10^{20}$ floating point operations. These results include the first simulations using the new constraints on the standard model of cosmology from the Planck satellite. Our simulations set a new standard for accuracy and scientific throughput, while meeting or exceeding the computational efficiency of the latest generation of hybrid TreePM N-body methods. Comment: 12 pages, 8 figures, 77 references; To appear in Proceedings of SC '1

    A GPU-accelerated Direct-sum Boundary Integral Poisson-Boltzmann Solver

    In this paper, we present a GPU-accelerated direct-sum boundary integral method to solve the linear Poisson-Boltzmann (PB) equation. In our method, a well-posed boundary integral formulation is used to ensure the fast convergence of Krylov-subspace-based linear algebraic solvers such as GMRES. The molecular surfaces are discretized with flat triangles and centroid collocation. To speed up our method, we take advantage of the parallel nature of the boundary integral formulation and parallelize the schemes within the CUDA shared memory architecture on the GPU. The schemes use only $11N + 6N_c$ size-of-double device memory for a biomolecule with $N$ triangular surface elements and $N_c$ partial charges. Numerical tests of these schemes show well-maintained accuracy and fast convergence. The GPU implementation using one GPU card (Nvidia Tesla M2070) achieves 120-150X speed-up over the implementation using one CPU (Intel L5640 2.27GHz). With our approach, solving PB equations on well-discretized molecular surfaces with up to 300,000 boundary elements will take less than about 10 minutes, hence our approach is particularly suitable for fast electrostatics computations on small to medium biomolecules
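
The device memory figure quoted in this abstract can be turned into bytes with a short calculation. The sketch below assumes an 8-byte double; the function name and the example charge count are hypothetical and are not taken from the paper.

```python
BYTES_PER_DOUBLE = 8  # assumption: IEEE 754 double precision


def device_memory_bytes(n_elements, n_charges):
    """Bytes of device memory for 11*N + 6*N_c doubles.

    n_elements is N (triangular surface elements) and n_charges is N_c
    (partial charges). Hypothetical helper for illustration only.
    """
    return (11 * n_elements + 6 * n_charges) * BYTES_PER_DOUBLE


# Example with the abstract's upper bound of 300,000 boundary elements and
# a hypothetical 10,000 partial charges.
footprint = device_memory_bytes(300_000, 10_000)
print(f"{footprint / 2**20:.1f} MiB")
```

Under these assumptions the footprint is a few tens of mebibytes, small compared with the memory of a card such as the Tesla M2070 named in the abstract.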