36 research outputs found

    Multilayered Heterogeneous Parallelism Applied to Atmospheric Constituent Transport Simulation

    Heterogeneous multicore chipsets with many levels of parallelism are becoming increasingly common in high-performance computing systems. Using the parallelism in these new chipsets effectively is the challenge facing a new generation of large-scale scientific computing applications. This study examines methods for improving the performance of two-dimensional and three-dimensional atmospheric constituent transport simulation on the Cell Broadband Engine Architecture (CBEA). A function offloading approach is used in a 2D transport module, and a vector stream processing approach is used in a 3D transport module. Two methods for transferring non-contiguous data between main memory and accelerator local storage are compared. By leveraging the heterogeneous parallelism of the CBEA, the 3D transport module achieves performance comparable to two nodes of an IBM BlueGene/P, or eight Intel Xeon cores, on a single PowerXCell 8i chip. Module performance is given for two CBEA systems, an IBM BlueGene/P, and an eight-core shared-memory Intel Xeon workstation.

    Investigation of hadron matter using lattice QCD and implementation of lattice QCD applications on heterogeneous multicore acceleration processors

    Observables relevant to understanding the structure of baryons were determined by means of Monte Carlo simulations of Lattice Quantum Chromodynamics (QCD) using 2+1 dynamical quark flavours. Particular emphasis was placed on how these observables change when flavour symmetry is broken, compared with choosing equal masses for the two light quarks and the strange quark. The first two moments of unpolarised, longitudinally polarised, and transversely polarised parton distribution functions were calculated for the nucleon and hyperons. The latter are baryons that contain a strange quark. Lattice QCD simulations tend to be extremely expensive, requiring petaflop-scale computing and beyond, a regime of computing power that is only now being reached. Heterogeneous multicore computing is becoming increasingly important in high-performance scientific computing. The strategy of deploying multiple types of processing elements within a single workflow, and allowing each to perform the tasks to which it is best suited, is likely to be part of the roadmap to exascale. In this work, new design concepts were developed for an active library (QDP++) that harnesses the compute power of a heterogeneous multicore processor (IBM PowerXCell 8i). Beyond a proof of concept, it was possible to run a QDP++-based physics application (Chroma) with reasonable performance on the IBM BladeCenter QS22.

    The QPACE Supercomputer : Applications of Random Matrix Theory in Two-Colour Quantum Chromodynamics

    QPACE is a massively parallel and scalable supercomputer designed to meet the requirements of applications in Lattice Quantum Chromodynamics. The project was carried out by several academic institutions in collaboration with IBM Germany and other industrial partners. In November 2009 and June 2010, QPACE was the leading architecture on the Green 500 list of the most energy-efficient supercomputers in the world.

    Solving Hyperbolic PDEs using Accelerator Architectures

    Accelerator architectures are used to accelerate the simulation of nonlinear hyperbolic PDEs. Three different architectures are investigated: a multicore CPU using threading, IBM’s Cell processor, and Nvidia’s Tesla GPUs. Speed-ups of 40× to 75× relative to a single CPU core in single precision are obtained using the Cell processor and the GPU. The three implementations are extended to parallel computing clusters by making use of the Message Passing Interface (MPI). The resulting hybrid-parallel code is investigated for performance and scalability on both a GPU cluster and a Cell cluster.