Blooming Trees: Substructures and Surrounding Groups of Galaxy Clusters
We develop the Blooming Tree Algorithm, a new technique that uses
spectroscopic redshift data alone to identify the substructures and the
surrounding groups of galaxy clusters, along with their member galaxies. Based
on the estimated binding energy of galaxy pairs, the algorithm builds a binary
tree that hierarchically arranges all the galaxies in the field of view. The
algorithm searches for buds, corresponding to gravitational potential minima on
the binary tree branches; for each bud, the algorithm combines the number of
galaxies, their velocity dispersion and their average pairwise distance into a
parameter that discriminates between the buds that do not correspond to any
substructure or group, and thus eventually die, and the buds that correspond to
substructures and groups, and thus bloom into the identified structures. We
test our new algorithm with a sample of 300 mock redshift surveys of clusters
in different dynamical states; the clusters are extracted from a large
cosmological N-body simulation of a ΛCDM model. We limit our analysis
to substructures and surrounding groups identified in the simulation with mass
larger than . With mock redshift surveys with 200
galaxies within 6 Mpc of the cluster center, the technique recovers
% of the real substructures and % of the surrounding groups; in
% of the identified structures, at least 60% of the member galaxies of
the substructures and groups belong to the same real structure. These results
improve by roughly a factor of two the performance of the best substructure
identification algorithm currently available, the σ plateau algorithm,
and suggest that our Blooming Tree Algorithm can be an invaluable tool for
detecting substructures of galaxy clusters and investigating their complex
dynamics. Comment: 17 pages, 20 figures, accepted by ApJ
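The hierarchical construction described above can be sketched as a toy agglomerative build: repeatedly merge the pair with the lowest estimated binding energy until one binary tree remains. The energy estimator, the constants `G` and `M`, and the separation proxy below are placeholders for illustration, not the authors' actual estimator.

```python
# Toy sketch of building the binary tree used by a Blooming-Tree-style method.
# NOT the authors' implementation: the binding-energy estimator is simplified.
import itertools

G = 1.0  # gravitational constant in code units (assumption)
M = 1.0  # fiducial galaxy mass in code units (assumption)

def binding_energy(a, b):
    """Toy pairwise binding energy from a separation proxy and the
    line-of-sight velocity difference. a, b = (r_proj, v_los)."""
    r = abs(a[0] - b[0]) + 1e-9      # crude 1-D separation proxy
    dv = a[1] - b[1]
    return -G * M * M / r + 0.25 * M * dv * dv

def build_binary_tree(galaxies):
    """Agglomerative build: leaves are galaxy indices, each merge becomes an
    internal node. Returns nested tuples, e.g. ((0, 1), 2)."""
    nodes = {i: (i, g) for i, g in enumerate(galaxies)}  # id -> (subtree, rep)
    next_id = len(galaxies)
    while len(nodes) > 1:
        i, j = min(itertools.combinations(nodes, 2),
                   key=lambda p: binding_energy(nodes[p[0]][1], nodes[p[1]][1]))
        ti, ri = nodes.pop(i)
        tj, rj = nodes.pop(j)
        # represent the merged node by the mean of its members (simplification)
        rep = ((ri[0] + rj[0]) / 2, (ri[1] + rj[1]) / 2)
        nodes[next_id] = ((ti, tj), rep)
        next_id += 1
    return next(iter(nodes.values()))[0]

# two nearby, kinematically cold galaxies pair up before the distant outlier
tree = build_binary_tree([(0.0, 0.0), (0.1, 10.0), (5.0, 300.0)])
```

The "buds" of the algorithm would then be located by walking this tree and scoring branches; that scoring step is omitted here.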
4.45 Pflops Astrophysical N-Body Simulation on K computer -- The Gravitational Trillion-Body Problem
As an entry for the 2012 Gordon-Bell performance prize, we report performance
results of astrophysical N-body simulations of one trillion particles performed
on the full system of K computer. This is the first gravitational trillion-body
simulation in the world. We describe the scientific motivation, the numerical
algorithm, the parallelization strategy, and the performance analysis. Unlike
many previous Gordon-Bell prize winners that used the tree algorithm for
astrophysical N-body simulations, we used the hybrid TreePM method, which
achieves a similar level of accuracy: the short-range force is calculated by
the tree algorithm, while the long-range force is solved by the particle-mesh algorithm.
We developed a highly-tuned gravity kernel for short-range forces, and a novel
communication algorithm for long-range forces. The average performances on 24576
and 82944 nodes of K computer are 1.53 and 4.45 Pflops, respectively,
corresponding to 49% and 42% of the peak speed. Comment: 10 pages, 6 figures,
Proceedings of Supercomputing 2012
(http://sc12.supercomputing.org/), Gordon Bell Prize Winner. Additional
information is available at http://www.ccs.tsukuba.ac.jp/CCS/eng/gbp201
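The short-range/long-range decomposition behind TreePM can be illustrated with the standard Ewald-style force split; the splitting scale `r_s` and the Gaussian cutoff form below are assumptions of this sketch, not necessarily the kernel used on K computer.

```python
# Sketch of the TreePM force split: the 1/r^2 force is divided into a
# short-range piece (handled by the tree, cut off beyond a few r_s) and a
# smooth long-range piece (handled by the particle mesh).
import math

def short_range(r, r_s):
    """Tree part: derivative of erfc(r/2r_s)/r, which decays rapidly."""
    x = r / (2.0 * r_s)
    return (math.erfc(x) + (2.0 * x / math.sqrt(math.pi)) * math.exp(-x * x)) / r**2

def long_range(r, r_s):
    """PM part: the complement, smooth at small r by construction."""
    return 1.0 / r**2 - short_range(r, r_s)

# at small r the tree piece carries essentially the full Newtonian force;
# at large r it is exponentially suppressed and the mesh piece dominates
```

The practical benefit is that the tree walk can stop at a few `r_s` while the mesh solver handles the rest in Fourier space.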
A sparse octree gravitational N-body code that runs entirely on the GPU processor
We present parallel algorithms for constructing and traversing sparse octrees
on graphics processing units (GPUs). The algorithms are based on parallel-scan
and sort methods. To test the performance and feasibility, we implemented them
in CUDA in the form of a gravitational tree-code that runs entirely on the
GPU. (The code is publicly available at
http://castle.strw.leidenuniv.nl/software.html.) The tree construction and
traversal algorithms are portable to many-core devices that support the
CUDA or OpenCL programming languages. The gravitational tree-code outperforms
tuned CPU code during tree construction and shows an overall performance
improvement of more than a factor of 20, resulting in a processing rate of
more than 2.8 million particles per second. Comment: Accepted version.
Published in Journal of Computational Physics. 35 pages, 12 figures, single column
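The sort-based construction can be sketched serially with Morton keys: interleave the bits of the quantized coordinates, sort, and cells at a given octree level fall out as runs of keys sharing a common prefix. On a GPU the sort and the run detection would be the parallel sort/scan primitives the abstract mentions; the function names here are illustrative.

```python
# Serial sketch of sort-based sparse octree construction via Morton keys.
def morton_key(x, y, z, bits=10):
    """Interleave the bits of quantized coordinates (assumes 0 <= x,y,z < 1)."""
    key = 0
    xi, yi, zi = (int(c * (1 << bits)) for c in (x, y, z))
    for b in range(bits):
        key |= ((xi >> b) & 1) << (3 * b + 2)
        key |= ((yi >> b) & 1) << (3 * b + 1)
        key |= ((zi >> b) & 1) << (3 * b)
    return key

def cells_at_level(positions, level, bits=10):
    """Cells at `level` are runs of sorted keys sharing the top 3*level bits;
    only octants that actually contain particles appear (sparse octree)."""
    shift = 3 * (bits - level)
    cells = {}
    for key in sorted(morton_key(*p, bits=bits) for p in positions):
        cells.setdefault(key >> shift, []).append(key)
    return cells

# two particles in the low octant, one in the high octant
cells = cells_at_level([(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.9, 0.9, 0.9)], 1)
```

Because the keys are sorted, particles in the same cell are contiguous in memory, which is what makes the subsequent GPU tree traversal coalesce well.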
Applying an abstract data structure description approach to parallelizing scientific pointer programs
Even though impressive progress has been made in the area of parallelizing scientific programs with arrays, the application of similar techniques to programs with pointer data structures has remained difficult. Unlike arrays, which have a small number of well-defined properties that can be utilized by a parallelizing compiler, pointer data structures are used to implement a wide variety of structures that exhibit a much more diverse set of properties. The complexity and diversity of such properties means that, in general, scientific programs with pointer data structures cannot be effectively analyzed by an optimizing and parallelizing compiler.

In order to provide a system in which the compiler can fully utilize the properties of different types of pointer data structures, we have developed a mechanism for the Abstract Description of Data Structures (ADDS). With our approach, the programmer can explicitly describe important properties such as dimensionality of the pointer data structure, independence of dimensions, and direction of traversal. These abstract descriptions of pointer data structures are then used by the compiler to guide analysis, optimization, and parallelization.

In this paper we summarize the ADDS approach through the use of numerous examples of data structures used in scientific computations, we illustrate how such declarations are natural and non-tedious to specify, and we show how the ADDS declarations can be used to improve compile-time analysis. In order to demonstrate the viability of our approach, we show how such techniques can be used to parallelize an important class of scientific codes which naturally use recursive pointer data structures. In particular, we use our approach to develop the parallelization of an N-body simulation that is based on a relatively complicated pointer data structure, and we report the speedup results for a Sequent multiprocessor.
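A hypothetical re-imagining of an ADDS-style declaration, transposed to Python for illustration (the real system annotates C pointer programs): metadata records the dimensionality, independence of dimensions, and traversal direction, and a toy analysis uses it to decide which dimensions a compiler could safely distribute. The names and the parallelizability criterion below are inventions for this sketch.

```python
# Toy stand-in for an ADDS declaration: structural properties of a pointer
# data structure, declared by the programmer, consumed by an "analysis".
from dataclasses import dataclass

@dataclass(frozen=True)
class ADDSDecl:
    dimensions: tuple      # names of pointer "dimensions", e.g. ("next", "down")
    independent: frozenset # unordered pairs of dimensions that never alias
    traversal: dict        # dimension -> "forward" / "backward" / "both"

def parallelizable_dims(decl):
    """Toy criterion: a dimension may be distributed across threads if it is
    declared independent of every other dimension."""
    return [d for d in decl.dimensions
            if all(frozenset((d, o)) in decl.independent
                   for o in decl.dimensions if o != d)]

# a list-of-lists: the "next" spine and the "down" sublists never alias
list_of_lists = ADDSDecl(
    dimensions=("next", "down"),
    independent=frozenset({frozenset(("next", "down"))}),
    traversal={"next": "forward", "down": "forward"},
)
```

The point of the declaration is exactly this division of labor: the programmer states properties the compiler cannot infer, and the analysis stays simple.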
The Coarse Geometry of Merger Trees in \Lambda CDM
We introduce the contour process to describe the geometrical properties of
merger trees. The contour process produces a one-dimensional object, the
contour walk, which is a translation of the merger tree. We portray the contour
walk through its length and action. The length is proportional to the number
of progenitors in the tree, and the action can be interpreted as a proxy of the
mean length of a branch in a merger tree.
We obtain the contour walk for merger trees extracted from the public
database of the Millennium Run and also for merger trees constructed with a
public Monte-Carlo code which implements a Markovian algorithm. The trees
correspond to halos of final masses between 10^{11} h^{-1} M_sol and 10^{14}
h^{-1} M_sol. We study how the length and action of the walks evolve with the
mass of the final halo. In all the cases, except for the action measured from
Markovian trees, we find a transitional scale around 3 \times 10^{12} h^{-1}
M_sol. As a general trend the length and action measured from the Markovian
trees show a large scatter in comparison with the case of the Millennium Run
trees. Comment: 7 pages, 5 figures, submitted to MNRAS
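The contour walk can be sketched as a depth-first traversal that records the current distance from the root: the walk steps up by one when descending to a progenitor and back down when returning. The nested-tuple tree format and the discrete "action" (area under the walk) below are conventions of this sketch, not necessarily those of the paper.

```python
# Sketch of the contour process: translate a merger tree into a 1-D walk.
def contour_walk(tree):
    """tree = (label, [children...]); returns the sequence of walk heights."""
    heights = [0]
    def visit(node, h):
        for child in node[1]:
            heights.append(h + 1)   # step away from the root
            visit(child, h + 1)
            heights.append(h)       # step back toward the root
    visit(tree, 0)
    return heights

def walk_action(heights):
    """Discrete area under the contour walk (proxy for mean branch length)."""
    return sum(heights)

# a root halo with two progenitors, one of which itself has two progenitors
tree = ("root", [("p1", [("p3", []), ("p4", [])]), ("p2", [])])
walk = contour_walk(tree)
```

Each edge contributes exactly two steps, so the walk length grows linearly with the number of progenitors, matching the proportionality stated above.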
Building Merger Trees from Cosmological N-body Simulations
Although a fair amount of work has been devoted to growing Monte-Carlo merger
trees which resemble those built from an N-body simulation, comparatively
little effort has been invested in quantifying the caveats one necessarily
encounters when one extracts trees directly from such a simulation. To somewhat
revert the tide, this paper seeks to provide its reader with a comprehensive
study of the problems one faces when following this route. The first step to
building merger histories of dark matter haloes and their subhaloes is to
identify these structures in each of the time outputs (snapshots) produced by
the simulation. Even though we discuss a particular implementation of such an
algorithm (called AdaptaHOP) in this paper, we believe that our results do not
depend on the exact details of the implementation but extend to most if not all
(sub)structure finders. We then highlight different ways to build merger
histories from AdaptaHOP haloes and subhaloes, contrasting their various
advantages and drawbacks. We find that the best approach to (sub)halo merging
histories is through an analysis that goes back and forth between
identification and tree building rather than one which conducts a
straightforward sequential treatment of these two steps. This is rooted in the
complexity of the merging trees which have to depict an inherently dynamical
process from the partial temporal information contained in the collection of
instantaneous snapshots available from the N-body simulation. Comment: 19 pages, 28 figures
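The snapshot-linking step common to most tree builders (not AdaptaHOP specifically) can be sketched as matching halos between consecutive outputs by shared particle IDs: a halo's descendant is taken to be the halo in the next snapshot that inherits the most of its particles. The data layout below is an assumption of this sketch.

```python
# Sketch of descendant identification between two snapshots.
def find_descendants(halos_now, halos_next):
    """halos_*: dict halo_id -> set of particle IDs.
    Returns dict halo_id_now -> halo_id_next (None if no particles match)."""
    links = {}
    for hid, parts in halos_now.items():
        best, shared = None, 0
        for nid, nparts in halos_next.items():
            n = len(parts & nparts)   # particles inherited by the candidate
            if n > shared:
                best, shared = nid, n
        links[hid] = best
    return links

# halo A mostly survives into C; halo B is mostly absorbed by D
snap0 = {"A": {1, 2, 3, 4}, "B": {5, 6, 7}}
snap1 = {"C": {1, 2, 3, 5}, "D": {6, 7, 8}}
links = find_descendants(snap0, snap1)
```

The paper's point is that this sequential treatment is fragile for subhaloes, which is why it advocates iterating between identification and tree building rather than running the two steps once in order.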
A Modified TreePM Code
We discuss the performance characteristics of using the modification of the
tree code suggested by Barnes \citep{1990JCoPh..87..161B} in the context of the
TreePM code. The optimisation involves identifying groups of particles and
using only one tree walk to compute force for all the particles in the group.
This modification has been in use in our implementation of the TreePM code for
some time, and has also been used by others in codes that make use of tree
structures. In this paper, we present the first detailed study of the
performance characteristics of this optimisation. We show that the
modification, if tuned properly, can speed up the TreePM code by a significant
amount. We also combine this modification with the use of individual time steps
and indicate how to combine these two schemes in an optimal fashion. We find
that the combination is at least a factor of two faster than the modified
TreePM without individual time steps. Overall performance is often faster by a
larger factor, as the scheme of groups optimises use of cache for large
simulations. Comment: 16 pages, 5 figures; accepted for publication in Research
in Astronomy and Astrophysics (RAA)
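The grouping optimisation can be sketched in one dimension: bucket particles into cells, build one interaction list per group from the group centre (padded by the cell size so no member misses a neighbour), and reuse it for every member. The direct neighbour search standing in for the tree walk, and all the cutoffs, are simplifications for illustration.

```python
# Toy 1-D sketch of the Barnes-style grouping optimisation: one "walk"
# (here a direct neighbour search) serves all particles in a group.
def group_particles(positions, cell):
    """Bucket 1-D positions into cells of size `cell`."""
    groups = {}
    for i, x in enumerate(positions):
        groups.setdefault(int(x // cell), []).append(i)
    return groups

def forces_with_grouping(positions, cell, r_cut):
    groups = group_particles(positions, cell)
    forces = [0.0] * len(positions)
    for members in groups.values():
        centre = sum(positions[i] for i in members) / len(members)
        # one interaction list per group, padded by the cell size so the
        # list is valid for every member of the group
        ilist = [j for j, x in enumerate(positions)
                 if abs(x - centre) < r_cut + cell]
        for i in members:
            for j in ilist:
                if j != i:
                    dx = positions[j] - positions[i]
                    if 0 < abs(dx) < r_cut:
                        forces[i] += dx / abs(dx)**3  # softening omitted
    return forces

forces = forces_with_grouping([0.0, 1.0, 3.0], 2.0, 2.5)
```

The trade-off tuned in the paper is visible even here: larger groups amortise the walk over more particles but inflate the padded interaction list, so the cell size must be chosen carefully.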