Introducing Molly: Distributed Memory Parallelization with LLVM
Programming for distributed memory machines has always been a tedious task,
but necessary because compilers have not been sufficiently able to optimize for
such machines themselves. Molly is an extension to the LLVM compiler toolchain
that is able to distribute and reorganize workload and data if the program is
organized in statically determined loop control-flows. These are represented as
polyhedral integer-point sets, to which program transformations can be applied.
Memory distribution and layout can be declared by the programmer as
needed and the necessary asynchronous MPI communication is generated
automatically. The primary motivation is to run Lattice QCD simulations on IBM
Blue Gene/Q supercomputers, but since the implementation is not yet completed,
this paper shows the capabilities on Conway's Game of Life.
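As a rough illustration of the distribution the abstract describes, the following sketch splits a Game of Life grid into row blocks and gives each block one halo row from each neighbor before stepping it. It is not Molly's output and uses no MPI: the "ranks" are simulated in a single Python process, and the names `life_step` and `distributed_life_step`, the row-block layout, and the dead (non-periodic) border are all assumptions made for this example.

```python
def life_step(grid):
    """One Game of Life generation; cells outside the grid count as dead."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                        n += grid[r + dr][c + dc]
            new[r][c] = 1 if n == 3 or (grid[r][c] and n == 2) else 0
    return new


def distributed_life_step(grid, num_ranks):
    """Step the grid as if it were split into row blocks, one per rank.

    Each simulated rank receives one halo row from the block above and one
    from the block below (all-dead rows at the global border), steps its
    local block, and discards the halo rows again.
    Assumes 1 <= num_ranks <= number of rows.
    """
    rows, cols = len(grid), len(grid[0])
    bounds = [rows * k // num_ranks for k in range(num_ranks + 1)]
    result = []
    for k in range(num_ranks):
        lo, hi = bounds[k], bounds[k + 1]
        top = grid[lo - 1] if lo > 0 else [0] * cols    # halo from upper neighbor
        bottom = grid[hi] if hi < rows else [0] * cols  # halo from lower neighbor
        local = [top] + grid[lo:hi] + [bottom]
        result.extend(life_step(local)[1:-1])           # keep owned rows only
    return result
```

In a real distributed run, the two halo rows would be exchanged as messages between neighboring processes, which is the communication the abstract says is generated automatically.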
Explicit memory schemes for evolutionary algorithms in dynamic environments
Copyright © 2007 Springer-Verlag. Problem optimization in dynamic environments has attracted growing interest from the evolutionary computation community in recent years due to its importance in real-world optimization problems. Several approaches have been developed to enhance the performance of evolutionary algorithms for dynamic optimization problems, of which the memory scheme is a major one. This chapter investigates the application of explicit memory schemes for evolutionary algorithms in dynamic environments. Two kinds of explicit memory schemes, direct memory and associative memory, are studied within two classes of evolutionary algorithms, genetic algorithms and univariate marginal distribution algorithms, for dynamic optimization problems. Based on a series of systematically constructed dynamic test environments, experiments are carried out to investigate these explicit memory schemes, and the performance of the direct and associative memory schemes is compared and analysed. The experimental results show the efficiency of the memory schemes for evolutionary algorithms in dynamic environments, especially when the environment changes cyclically. The experimental results also indicate that the effect of the memory schemes depends not only on the dynamic problems and dynamic environments but also on the evolutionary algorithm used.
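To make the idea of a direct memory scheme concrete, here is a minimal sketch in which good solutions are stored as-is and re-evaluated when the environment changes. It shows one common variant (replace the most similar stored entry if the candidate is at least as fit), not necessarily the scheme studied in the chapter. The helper names `hamming`, `update_memory`, and `retrieve`, the tuple-of-bits encoding, and the replacement rule are assumptions made for this example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def update_memory(memory, candidate, capacity, fitness):
    """Store a good individual directly in memory.

    While memory has room, the candidate is appended. Once it is full, the
    stored entry most similar to the candidate is replaced, but only if the
    candidate is at least as fit under the current environment.
    """
    if len(memory) < capacity:
        memory.append(candidate)
        return
    idx = min(range(len(memory)), key=lambda i: hamming(memory[i], candidate))
    if fitness(candidate) >= fitness(memory[idx]):
        memory[idx] = candidate


def retrieve(population, memory, fitness):
    """After an environment change, re-evaluate memory and population together
    under the new fitness function and keep the best individuals, preserving
    the population size."""
    merged = sorted(population + memory, key=fitness, reverse=True)
    return merged[:len(population)]
```

When the environment cycles back to an earlier state, a solution stored during that state can re-enter the population immediately, which matches the abstract's observation that memory helps most when changes are cyclic.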
Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory
New algorithms and optimization techniques are needed to balance the
accelerating trend towards bandwidth-starved multicore chips. It is well known
that the performance of stencil codes can be improved by temporal blocking,
lessening the pressure on the memory interface. We introduce a new pipelined
approach that makes explicit use of shared caches in multicore environments and
minimizes synchronization and boundary overhead. For clusters of shared-memory
nodes we demonstrate how temporal blocking can be employed successfully in a
hybrid shared/distributed-memory environment.
Comment: 9 pages, 6 figures
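The basic idea of temporal blocking can be sketched on a one-dimensional three-point stencil: instead of sweeping the whole array once per time step, each spatial block is advanced several time steps while it is still in cache. The sketch below uses simple overlapped tiling with redundant halo computation, which is a different and simpler technique than the pipelined multicore approach the abstract introduces. The function names, the averaging stencil, and the fixed boundary values are assumptions made for this example.

```python
def jacobi_step(u):
    """One 3-point averaging sweep; the two end values are held fixed."""
    interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def naive_sweeps(u, steps):
    """Reference version: one full pass over the array per time step."""
    for _ in range(steps):
        u = jacobi_step(u)
    return u


def temporally_blocked(u, steps, block):
    """Advance each block of `block` interior cells by `steps` time steps at once.

    Every tile is extended by `steps` cells on each side (clipped at the
    array ends). Values near an artificial tile edge become stale by one
    more cell per step, so after `steps` steps the stale region has just
    not reached the cells the tile owns.
    """
    n = len(u)
    out = list(u)
    for start in range(1, n - 1, block):
        end = min(start + block, n - 1)  # owned interior cells are [start, end)
        lo = max(start - steps, 0)       # halo of width `steps` on the left
        hi = min(end + steps, n)         # and on the right
        tile = u[lo:hi]
        for _ in range(steps):
            tile = jacobi_step(tile)
        out[start:end] = tile[start - lo:end - lo]
    return out
```

The blocked version loads each tile from main memory once per `steps` updates rather than once per update, at the cost of recomputing the halo cells. That trade of extra arithmetic for reduced memory traffic is what makes temporal blocking attractive on bandwidth-starved chips.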