Parallel transfer evolution algorithm
Parallelization of an evolutionary algorithm takes advantage of modular population division and information exchange among multiple processors. However, existing parallel evolutionary algorithms (PEAs) are rather ad hoc and lack the ability to adapt to diverse problems. To accommodate a wider range of problems and to reduce algorithm design costs, this paper develops a parallel transfer evolution algorithm. It is based on the island model of parallel evolutionary algorithms and, to improve performance, adaptively transfers both the connections and the evolutionary operators from one sub-population pair to another. Needing no extra upper-level selection strategy, each sub-population autonomously selects evolutionary operators and local search operators as subroutines according to both its own ranking board and that of its connected neighbor. The parallel transfer evolution is tested on two typical combinatorial optimization problems in comparison with six existing ad hoc evolutionary algorithms, and is also applied to a real-world case study in comparison with five typical parallel evolutionary algorithms. The tests show that the proposed scheme and the resultant PEA offer high flexibility in dealing with a wider range of combinatorial optimization problems without algorithmic modification or redesign. Both the topological transfer and the algorithmic transfer appear applicable not only to combinatorial optimization problems, but also to non-permutated complex problems.
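To make the island-model idea concrete, the following is a minimal sketch of a generic island model with a fixed ring topology, not the paper's transfer algorithm. The toy OneMax fitness, the function names, and all parameter values are illustrative assumptions, and the islands are evolved one after another here, whereas a real parallel run would place each island on its own processor.

```python
import random

def onemax(bits):
    """Toy fitness (assumption for illustration): number of 1-bits."""
    return sum(bits)

def evolve_island(pop, rng, mutation_rate=0.05):
    """One generation: keep the elite, then fill up with mutated tournament winners."""
    elite = max(pop, key=onemax)
    children = [elite[:]]
    while len(children) < len(pop):
        a, b = rng.sample(pop, 2)                       # binary tournament
        parent = a if onemax(a) >= onemax(b) else b
        child = [bit ^ 1 if rng.random() < mutation_rate else bit
                 for bit in parent]                     # bit-flip mutation
        children.append(child)
    return children

def migrate_ring(islands):
    """Copy each island's best individual over the worst one of the next island."""
    bests = [max(pop, key=onemax)[:] for pop in islands]
    for i, best in enumerate(bests):
        target = islands[(i + 1) % len(islands)]
        worst_idx = min(range(len(target)), key=lambda k: onemax(target[k]))
        target[worst_idx] = best

def island_model(n_islands=4, pop_size=10, n_bits=20,
                 generations=40, interval=5, seed=0):
    """Run the sketch and return the best fitness seen after each generation."""
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)]
               for _ in range(n_islands)]
    history = []
    for gen in range(1, generations + 1):
        # Sequential stand-in for what would run in parallel, one island per processor.
        islands = [evolve_island(pop, rng) for pop in islands]
        if gen % interval == 0:
            migrate_ring(islands)                       # periodic information exchange
        history.append(max(onemax(ind) for pop in islands for ind in pop))
    return history

if __name__ == "__main__":
    print(island_model()[-1])
```

In this sketch the topology and the operators are fixed by hand; the abstract's contribution is to adapt both during the run, which is not attempted here.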
Genetic Transfer or Population Diversification? Deciphering the Secret Ingredients of Evolutionary Multitask Optimization
Evolutionary multitasking has recently emerged as a novel paradigm that
enables the similarities and/or latent complementarities (if present) between
distinct optimization tasks to be exploited in an autonomous manner simply by
solving them together with a unified solution representation scheme. An
important matter underpinning future algorithmic advancements is to develop a
better understanding of the driving force behind successful multitask
problem-solving. In this regard, two (seemingly disparate) ideas have been put
forward, namely, (a) implicit genetic transfer as the key ingredient
facilitating the exchange of high-quality genetic material across tasks, and
(b) population diversification resulting in effective global search of the
unified search space encompassing all tasks. In this paper, we present some
empirical results that provide a clearer picture of the relationship between
the two aforementioned propositions. For the numerical experiments we make use
of Sudoku puzzles as case studies, mainly because of their feature that
outwardly unlike puzzle statements can often have nearly identical final
solutions. The experiments reveal that while on many occasions genetic transfer
and population diversity may be viewed as two sides of the same coin, the wider
implication of genetic transfer, as shall be shown herein, captures the true
essence of evolutionary multitasking to the fullest. Comment: 7 pages, 6 figures
Performance analysis of direct N-body algorithms for astrophysical simulations on distributed systems
We discuss the performance of direct summation codes used in the simulation
of astrophysical stellar systems on highly distributed architectures. These
codes compute the gravitational interaction among stars in an exact way and
have an O(N^2) scaling with the number of particles. They can be applied to a
variety of astrophysical problems, like the evolution of star clusters, the
dynamics of black holes, the formation of planetary systems, and cosmological
simulations. The simulation of realistic star clusters with sufficiently high
accuracy cannot be performed on a single workstation but may be possible on
parallel computers or grids. We have implemented two parallel schemes for a
direct N-body code and we study their performance on general purpose parallel
computers and large computational grids. We present the results of timing
analyses conducted on the different architectures and compare them with the
predictions from theoretical models. We conclude that the simulation of star
clusters with up to a million particles will be possible on large distributed
computers in the next decade. Simulating entire galaxies, however, will
additionally require new hybrid methods to speed up the calculation. Comment: 22 pages, 8 figures, accepted for publication in Parallel Computing
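The O(N^2) scaling of direct summation follows from evaluating every pair of particles. A minimal serial sketch, assuming Newtonian point masses in three dimensions, is given below; the function name, the default G=1 units, and the optional softening parameter are assumptions for illustration and are not taken from the codes discussed in the abstract.

```python
import math

def direct_accelerations(positions, masses, G=1.0, softening=0.0):
    """Pairwise gravitational accelerations by direct summation.

    positions: list of [x, y, z]; masses: list of floats.
    The double loop visits each of the N*(N-1)/2 pairs once,
    which is where the O(N^2) cost comes from.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # Equal and opposite forces: update both particles of the pair.
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
                acc[j][k] -= G * masses[i] * dx[k] * inv_r3
    return acc

if __name__ == "__main__":
    # Two unit masses separated by a distance of 2 along the x-axis.
    print(direct_accelerations([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [1.0, 1.0]))
```

A parallel scheme would distribute the outer loop or the particle data across processors; how that is done is exactly what the two schemes in the abstract differ on, so it is left out of this sketch.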
A Parallel Monte Carlo Code for Simulating Collisional N-body Systems
We present a new parallel code for computing the dynamical evolution of
collisional N-body systems with up to N~10^7 particles. Our code is based on
the Hénon Monte Carlo method for solving the Fokker-Planck equation, and
makes assumptions of spherical symmetry and dynamical equilibrium. The
principal algorithmic developments involve optimizing data structures, and the
introduction of a parallel random number generation scheme, as well as a
parallel sorting algorithm, required to find nearest neighbors for interactions
and to compute the gravitational potential. The new algorithms we introduce
along with our choice of decomposition scheme minimize communication costs and
ensure optimal distribution of data and workload among the processing units.
The implementation uses the Message Passing Interface (MPI) library for
communication, which makes it portable to many different supercomputing
architectures. We validate the code by calculating the evolution of clusters
with initial Plummer distribution functions up to core collapse with the number
of stars, N, spanning three orders of magnitude, from 10^5 to 10^7. We find
that our results are in good agreement with self-similar core-collapse
solutions, and the core collapse times generally agree with expectations from
the literature. Also, we observe good total energy conservation, within less
than 0.04% throughout all simulations. We analyze the performance of the code,
and demonstrate near-linear scaling of the runtime with the number of
processors up to 64 processors for N=10^5, 128 for N=10^6 and 256 for N=10^7.
The runtime saturates when more processors are added beyond these limits,
which is characteristic of the parallel sorting algorithm. The resulting
maximum speedups we achieve are approximately 60x, 100x, and 220x,
respectively. Comment: 53 pages, 13 figures, accepted for publication in ApJ Supplement
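The reported speedups can be turned into parallel efficiencies (speedup divided by processor count). The short sketch below does that arithmetic, under the assumption that each maximum speedup was reached at the processor count where near-linear scaling ends (64, 128, and 256); the abstract does not state this pairing explicitly, and the function and variable names are illustrative.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors

# (approximate speedup, processor count) per problem size, as quoted in the abstract.
# Pairing each speedup with these processor counts is an assumption.
reported = {
    "N=10^5": (60, 64),
    "N=10^6": (100, 128),
    "N=10^7": (220, 256),
}

efficiencies = {label: parallel_efficiency(s, p) for label, (s, p) in reported.items()}

if __name__ == "__main__":
    for label, eff in efficiencies.items():
        print(f"{label}: efficiency {eff:.0%}")
```

Under that assumption the efficiencies come out at roughly 94%, 78%, and 86%, consistent with the "near-linear" description.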