Idempotent permutations
Together with a characteristic function, idempotent permutations uniquely
determine both idempotent maps and their linearly ordered arrangement at the
same time. Furthermore, in-place linear-time transformations between them are
possible. Hence, they may be important for succinct data structures,
information storage, sorting, and searching.
In this study, their combinatorial interpretation is given and their
application to sorting is examined. Given an array of n integer keys, each in
[1,n], if the keys may be modified within the range [-n,n], idempotent
permutations make it possible to obtain a linearly ordered arrangement of the
keys in O(n) time using only 4log(n) bits, setting the theoretical lower bound
of time and space complexity of sorting. If the keys may not be modified
outside the range [1,n], then n+4log(n) bits are required, n of which
are used to tag some of the keys.
Comment: 32 pages
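For readers unfamiliar with the term, the following is a minimal sketch of the standard notion of an idempotent map on a finite set, i.e. a map f with f(f(i)) = f(i) for every i. It does not reproduce the paper's idempotent permutations, its characteristic function, or its sorting procedure. The function names `is_idempotent` and `fixed_points` are hypothetical, and the sketch uses 0-based indices, whereas the abstract uses keys in [1,n].

```python
def is_idempotent(f):
    """Return True if f(f(i)) == f(i) for all i.

    f is a list in which f[i] is the image of i (0-based; an assumption of
    this sketch, not the paper's notation).
    """
    return all(f[f[i]] == f[i] for i in range(len(f)))


def fixed_points(f):
    """Return the indices i with f[i] == i, in increasing order."""
    return [i for i in range(len(f)) if f[i] == i]


# Every element is sent to 0 or 2, and both 0 and 2 map to themselves.
f = [0, 0, 2, 2, 2]

# A swap of 0 and 1 is a permutation but not idempotent: g(g(0)) = 0 != g(0).
g = [1, 0, 2]
```

For an idempotent map, the image coincides with the set of fixed points: here `sorted(set(f))` and `fixed_points(f)` both give `[0, 2]`.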
Improved neighbor list algorithm in molecular simulations using cell decomposition and data sorting method
An improved neighbor list algorithm is proposed to reduce unnecessary
interatomic distance calculations in molecular simulations. It combines the
advantages of the Verlet table and cell linked list algorithms by using a cell
decomposition approach to accelerate neighbor list construction, a data
sorting method to lower the CPU data cache miss rate, and a partial
updating method to minimize unnecessary reconstruction of the neighbor
list. Both the serial and parallel performance of molecular dynamics simulation are
evaluated using the proposed algorithm and compared with those using
conventional Verlet table and cell linked list algorithms. Results show that
the new algorithm outperforms the conventional algorithms by a factor of 2 to 3
for both small and large numbers of atoms.
Comment: 14 pages, 7 figures. Submitted to Computer Physics Communications
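As an illustration of the cell decomposition idea mentioned in this abstract, the following is a minimal sketch of building a neighbor list with cells at least as wide as the cutoff, so that each particle is compared only against particles in its own and adjacent cells. It is not the authors' implementation and omits their data sorting and partial updating steps. The function names, the cubic periodic box, and the assumption that the cutoff is at most half the box length are choices made for this sketch.

```python
import itertools
import math


def min_image_dist(a, b, box):
    """Distance between a and b under the minimum-image convention."""
    total = 0.0
    for x, y in zip(a, b):
        d = x - y
        d -= box * round(d / box)  # wrap the separation into [-box/2, box/2]
        total += d * d
    return math.sqrt(total)


def build_neighbor_list(positions, box, cutoff):
    """Return sorted pairs (i, j), i < j, closer than cutoff (cubic periodic box)."""
    ncell = max(1, int(box // cutoff))  # cells per side, each at least cutoff wide
    cell_size = box / ncell

    # Bin every particle into its cell.
    cells = {}
    for idx, p in enumerate(positions):
        key = tuple(math.floor(c / cell_size) % ncell for c in p)
        cells.setdefault(key, []).append(idx)

    # Compare each cell only with itself and its 26 periodic neighbors.
    pairs = set()  # a set absorbs duplicates when few cells wrap onto each other
    for key, members in cells.items():
        for off in itertools.product((-1, 0, 1), repeat=3):
            other = tuple((k + o) % ncell for k, o in zip(key, off))
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and min_image_dist(positions[i], positions[j], box) < cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


def brute_force_pairs(positions, box, cutoff):
    """O(n^2) reference that checks every pair; used only for validation."""
    n = len(positions)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if min_image_dist(positions[i], positions[j], box) < cutoff]
```

Comparing `build_neighbor_list` against `brute_force_pairs` on random coordinates is a simple way to check that the cell search misses no pair. At roughly uniform density, the cell version examines a number of candidates per particle that does not grow with the system size.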
Efficient Implementations of Molecular Dynamics Simulations for Lennard-Jones Systems
Efficient implementations of the classical molecular dynamics (MD) method for
Lennard-Jones particle systems are considered. Both general algorithms and
techniques that are efficient for specific CPU architectures are
explained. A simple spatial-decomposition-based strategy is adopted for
parallelization. Using the developed code, benchmark simulations are
performed on a HITACHI SR16000/J2 system consisting of 4.7 GHz IBM POWER6
processors at the National Institute for Fusion Science (NIFS) and an
SGI Altix ICE 8400EX system consisting of 2.93 GHz Intel Xeon processors
at the Institute for Solid State Physics (ISSP), the University of Tokyo.
The parallelization efficiency of the largest run, consisting of 4.1 billion
particles with 8192 MPI processes, is about 73% relative to that of the
smallest run with 128 MPI processes at NIFS, and it is about 66% relative to
that of the smallest run with 4 MPI processes at ISSP. The factors causing the
parallel overhead are investigated. It is found that fluctuations of the
execution time of each process degrade the parallel efficiency. These
fluctuations may be due to interference from the operating system, which is
known as OS jitter.
Comment: 33 pages, 19 figures; references added and figures revised