High Performance Direct Gravitational N-body Simulations on Graphics Processing Units -- II: An implementation in CUDA
We present the results of gravitational direct N-body simulations using the
Graphics Processing Unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed
for gaming computers. The force evaluation of the N-body problem is
implemented in "Compute Unified Device Architecture" (CUDA) using the GPU to
speed up the calculations. We tested the implementation on three different
N-body codes: two direct N-body integration codes, using the 4th order
predictor-corrector Hermite integrator with block time-steps, and one
Barnes-Hut treecode, which uses a 2nd order leapfrog integration scheme. The
integration of the equations of motion for all codes is performed on the host
CPU.
We find that for particles the GPU outperforms the GRAPE-6Af, if
some softening in the force calculation is accepted. Without softening and for
very small integration time steps the GRAPE still outperforms the GPU. We
conclude that modern GPUs offer an attractive alternative to GRAPE-6Af special
purpose hardware. Using the same time-step criterion, the total energy of the
N-body system was conserved better than to one in on the GPU, only
about an order of magnitude worse than obtained with GRAPE-6Af. For N ≳
10^5 the 8800GTX outperforms the host CPU by a factor of about 100 and runs at
about the same speed as the GRAPE-6Af.
Comment: Accepted for publication in New Astronom
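As a rough illustration of what the abstract above describes, the following sketch evaluates softened direct-sum forces and advances the system with a 2nd order leapfrog step. It is a minimal CPU-only sketch, not the authors' CUDA code: the function names, the choice of units (G = 1), the Plummer form of the softening and the kick-drift-kick ordering are all assumptions made for illustration.

```python
import math

def accelerations(pos, mass, eps):
    """Direct-sum O(N^2) accelerations with Plummer softening eps (G = 1, assumed)."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += mass[j] * dx[k] * inv_r3
    return acc

def total_energy(pos, vel, mass, eps):
    """Kinetic plus softened potential energy, used to monitor conservation."""
    kinetic = 0.5 * sum(m * sum(v * v for v in vi) for m, vi in zip(mass, vel))
    potential = 0.0
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            r2 = sum((pos[j][k] - pos[i][k]) ** 2 for k in range(3)) + eps * eps
            potential -= mass[i] * mass[j] / math.sqrt(r2)
    return kinetic + potential

def leapfrog_step(pos, vel, mass, eps, dt):
    """One kick-drift-kick leapfrog step; pos and vel are updated in place."""
    acc = accelerations(pos, mass, eps)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
            pos[i][k] += dt * vel[i][k]        # full drift
    acc = accelerations(pos, mass, eps)        # forces at the new positions
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick
```

In a GPU version, the double loop in `accelerations` is the part that would be offloaded, while the integration step stays on the host, which matches the division of work stated in the abstract. Comparing `total_energy` before and after a run gives a simple conservation check.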
Optimal softening for force calculations in collisionless N-body simulations
In N-body simulations the force calculated between particles representing a
given mass distribution is usually softened, to diminish the effect of
graininess. In this paper we study the effect of such a smoothing, with the aim
of finding an optimal value of the softening parameter. As already shown by
Merritt (1996), for too small a softening the estimates of the forces will be
too noisy, while for too large a softening the force estimates are
systematically misrepresented. In between there is an optimal softening, for
which the forces in the configuration best approach the true forces. The value
of this optimal softening depends both on the mass distribution and on the
number of particles used to represent it. For a higher number of particles the
optimal softening is smaller. More concentrated mass distributions necessitate
smaller softening, but the softened forces are never as good an approximation
of the true forces as for not centrally concentrated configurations. We give
good estimates of the optimal softening for homogeneous spheres, Plummer
spheres, and Dehnen spheres. We also give a rough estimate of this quantity for
other mass distributions, based on the harmonic mean distance to the th
neighbour ( = 1, .., 12), the mean being taken over all particles in the
configuration. Comparing homogeneous Ferrers ellipsoids of different shapes we
show that the axial ratios do not influence the value of the optimal softening.
Finally we compare two different types of softening, a spline softening
(Hernquist & Katz 1989) and a generalisation of the standard Plummer softening
to higher values of the exponent. We find that the spline softening fares
roughly as well as the higher powers of the power-law softening and both give a
better representation of the forces than the standard Plummer softening.
Comment: 16 pages LaTeX, 19 figures, accepted for publication in MNRAS,
corrected typos, minor changes mainly in sec.
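To make the comparison of softening kernels more concrete, the sketch below contrasts the standard Plummer force law with a power-law generalisation. The abstract does not give the functional form, so the potential assumed here, phi(r) = -(r^p + eps^p)^(-1/p) with p = 2 recovering Plummer, is an illustrative assumption, as are the function names and the unit choice G = m = 1.

```python
def newtonian_force(r):
    """Unsoftened force magnitude between two unit masses (G = 1)."""
    return 1.0 / (r * r)

def plummer_force(r, eps):
    """Standard Plummer-softened force magnitude."""
    return r / (r * r + eps * eps) ** 1.5

def power_law_force(r, eps, p):
    """Force from the assumed potential phi = -(r**p + eps**p)**(-1/p).

    p = 2 reproduces plummer_force; larger p converges to the Newtonian
    force faster once r exceeds eps.
    """
    return r ** (p - 1) / (r ** p + eps ** p) ** (1.0 + 1.0 / p)
```

For example, at r = 2 eps the p = 4 kernel is closer to the Newtonian value than the Plummer kernel, which illustrates the bias reduction a higher exponent could provide at separations of a few softening lengths.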
Improvements to the APBS biomolecular solvation software suite
The Adaptive Poisson-Boltzmann Solver (APBS) software was developed to solve
the equations of continuum electrostatics for large biomolecular assemblages
and has had an impact on the study of a broad range of chemical, biological,
and biomedical applications. APBS addresses three key technology challenges for
understanding solvation and electrostatics in biomedical applications: accurate
and efficient models for biomolecular solvation and electrostatics, robust and
scalable software for applying those theories to biomolecular systems, and
mechanisms for sharing and analyzing biomolecular electrostatics data in the
scientific community. To address new research applications and advancing
computational capabilities, we have continually updated APBS and its suite of
accompanying software since its release in 2001. In this manuscript, we discuss
the models and capabilities that have recently been implemented within the APBS
software package including: a Poisson-Boltzmann analytical and a
semi-analytical solver, an optimized boundary element solver, a geometry-based
geometric flow solvation model, a graph theory based algorithm for determining
pKa values, and an improved web-based visualization tool for viewing
electrostatics.
2HOT: An Improved Parallel Hashed Oct-Tree N-Body Algorithm for Cosmological Simulation
We report on improvements made over the past two decades to our adaptive
treecode N-body method (HOT). A mathematical and computational approach to the
cosmological N-body problem is described, with performance and scalability
measured up to 256k processors. We present error analysis and
scientific application results from a series of more than ten 69 billion
particle cosmological simulations, accounting for
floating point operations. These results include the first simulations using
the new constraints on the standard model of cosmology from the Planck
satellite. Our simulations set a new standard for accuracy and scientific
throughput, while meeting or exceeding the computational efficiency of the
latest generation of hybrid TreePM N-body methods.
Comment: 12 pages, 8 figures, 77 references; To appear in Proceedings of SC
'1
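The "hashed oct-tree" in the title refers to addressing tree cells by integer keys that can be stored in a hash table. The abstract does not describe the key layout, so the sketch below is only one common construction, assumed here for illustration: the bits of the three integer coordinates are interleaved (a Morton key) behind a leading placeholder bit that marks the key length. The function names and the unit-cube coordinate convention are hypothetical.

```python
def morton_key(x, y, z, bits=10):
    """Key for a point in the unit cube [0, 1)^3 at tree depth `bits`.

    A leading 1 bit is kept as a placeholder so that the depth can be
    recovered from the key length (an assumed convention).
    """
    scale = 1 << bits
    ix, iy, iz = (min(int(c * scale), scale - 1) for c in (x, y, z))
    key = 1
    for b in range(bits - 1, -1, -1):
        octant = (((ix >> b) & 1) << 2) | (((iy >> b) & 1) << 1) | ((iz >> b) & 1)
        key = (key << 3) | octant
    return key

def parent_key(key):
    """Key of the enclosing cell one level up; the root has key 1."""
    return key >> 3

def key_level(key):
    """Depth of the cell in the tree (root is level 0)."""
    return (key.bit_length() - 1) // 3
```

With keys of this kind, a cell's parent and children follow from bit shifts alone, so cells can be looked up in a dictionary or hash table instead of following pointers.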
A GPU-accelerated Direct-sum Boundary Integral Poisson-Boltzmann Solver
In this paper, we present a GPU-accelerated direct-sum boundary integral
method to solve the linear Poisson-Boltzmann (PB) equation. In our method, a
well-posed boundary integral formulation is used to ensure the fast convergence
of Krylov-subspace-based linear algebraic solvers such as GMRES. The
molecular surfaces are discretized with flat triangles and centroid
collocation. To speed up our method, we take advantage of the parallel nature
of the boundary integral formulation and parallelize the schemes within the CUDA
shared-memory architecture on the GPU. The schemes use only
size-of-double device memory for a biomolecule with triangular surface
elements and partial charges. Numerical tests of these schemes show
well-maintained accuracy and fast convergence. The GPU implementation using one
GPU card (Nvidia Tesla M2070) achieves a 120-150X speed-up over the implementation
using one CPU (Intel L5640 2.27GHz). With our approach, solving PB equations on
well-discretized molecular surfaces with up to 300,000 boundary elements will
take less than about 10 minutes; hence our approach is particularly suitable
for fast electrostatics computations on small to medium biomolecules.
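The following sketch illustrates the two ingredients named in the abstract above: flat triangles with centroid collocation, and a direct-sum (matrix-free) product of the kind a Krylov solver such as GMRES calls at each iteration. It is a simplified CPU illustration, not the authors' formulation: the screened Coulomb kernel exp(-kappa r) / (4 pi r), the omission of the singular self-term, and all function names are assumptions.

```python
import math

def centroid_and_area(tri):
    """Centroid and area of a flat triangle given as three (x, y, z) vertices."""
    a, b, c = tri
    centroid = tuple((a[k] + b[k] + c[k]) / 3.0 for k in range(3))
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    area = 0.5 * math.sqrt(sum(comp * comp for comp in cross))
    return centroid, area

def screened_coulomb(r, kappa):
    """Assumed kernel for the linear PB equation; kappa = 0 gives 1 / (4 pi r)."""
    return math.exp(-kappa * r) / (4.0 * math.pi * r)

def direct_sum_matvec(triangles, x, kappa):
    """y_i = sum over j != i of area_j * G(|c_i - c_j|) * x_j, without storing a matrix."""
    geom = [centroid_and_area(t) for t in triangles]
    y = [0.0] * len(geom)
    for i, (ci, _) in enumerate(geom):
        for j, (cj, area_j) in enumerate(geom):
            if i == j:
                continue  # singular self-interaction skipped in this sketch
            y[i] += area_j * screened_coulomb(math.dist(ci, cj), kappa) * x[j]
    return y
```

Each output entry depends only on its own row of the sum, so the rows are independent of one another; that independence is the kind of parallelism a GPU implementation can exploit, for instance by assigning one row per thread. Because no matrix is stored, memory grows linearly with the number of elements.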