    Introducing Molly: Distributed Memory Parallelization with LLVM

    Programming for distributed memory machines has always been a tedious task, but a necessary one because compilers have not been able to optimize sufficiently for such machines on their own. Molly is an extension to the LLVM compiler toolchain that can distribute and reorganize workload and data if the program is organized in statically determined loop control flows. These are represented as polyhedral integer-point sets, which allow program transformations to be applied to them. Memory distribution and layout can be declared by the programmer as needed, and the necessary asynchronous MPI communication is generated automatically. The primary motivation is to run Lattice QCD simulations on IBM Blue Gene/Q supercomputers, but since the implementation is not yet complete, this paper demonstrates the capabilities on Conway's Game of Life.

    Explicit memory schemes for evolutionary algorithms in dynamic environments

    Copyright © 2007 Springer-Verlag. Problem optimization in dynamic environments has attracted growing interest from the evolutionary computation community in recent years due to its importance in real-world optimization problems. Several approaches have been developed to enhance the performance of evolutionary algorithms for dynamic optimization problems, of which the memory scheme is a major one. This chapter investigates the application of explicit memory schemes for evolutionary algorithms in dynamic environments. Two kinds of explicit memory schemes, direct memory and associative memory, are studied within two classes of evolutionary algorithms for dynamic optimization problems: genetic algorithms and univariate marginal distribution algorithms. Based on a series of systematically constructed dynamic test environments, experiments are carried out to investigate these explicit memory schemes, and the performance of the direct and associative memory schemes is compared and analysed. The experimental results show the efficiency of the memory schemes for evolutionary algorithms in dynamic environments, especially when the environment changes cyclically. The results also indicate that the effect of the memory schemes depends not only on the dynamic problems and dynamic environments but also on the evolutionary algorithm used.

    Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory

    New algorithms and optimization techniques are needed to balance the accelerating trend towards bandwidth-starved multicore chips. It is well known that the performance of stencil codes can be improved by temporal blocking, which lessens the pressure on the memory interface. We introduce a new pipelined approach that makes explicit use of shared caches in multicore environments and minimizes synchronization and boundary overhead. For clusters of shared-memory nodes, we demonstrate how temporal blocking can be employed successfully in a hybrid shared/distributed-memory environment. Comment: 9 pages, 6 figures