
    Performance benchmarks for a next generation numerical dynamo model

    Numerical simulations of the geodynamo have successfully represented many observable characteristics of the geomagnetic field, yielding insight into the fundamental processes that generate magnetic fields in the Earth's core. Because of limited spatial resolution, however, the diffusivities in numerical dynamo models are much larger than those in the Earth's core, and consequently, questions remain about how realistic these models are. The typical strategy used to address this issue has been to continue to increase the resolution of these quasi-laminar models with increasing computational resources, thus pushing them toward more realistic parameter regimes. We assess which methods are most promising for the next generation of supercomputers, which will offer access to O(10^6) processor cores for large problems. Here we report performance and accuracy benchmarks from 15 dynamo codes that employ a range of numerical and parallelization methods. Computational performance is assessed on the basis of weak and strong scaling behavior up to 16,384 processor cores. Extrapolations of our weak-scaling results indicate that dynamo codes that employ two-dimensional or three-dimensional domain decompositions can perform efficiently on up to ~10^6 processor cores, paving the way for more realistic simulations in the next model generation.
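
    The weak- and strong-scaling efficiencies such a benchmark reports can be computed directly from timing data; below is a minimal Python sketch of both definitions. The core counts and wall-clock times are illustrative placeholders, not the study's measurements.

        # Strong scaling: fixed total problem size; ideal time halves as cores double.
        # Weak scaling: fixed work per core; ideal time stays constant as cores grow.

        def strong_scaling_efficiency(cores, times):
            c0, t0 = cores[0], times[0]
            return [(t0 * c0) / (c * t) for c, t in zip(cores, times)]

        def weak_scaling_efficiency(cores, times):
            t0 = times[0]
            return [t0 / t for t in times]

        cores = [1024, 2048, 4096, 8192, 16384]
        strong_times = [100.0, 52.0, 27.5, 15.0, 8.9]     # seconds, fixed problem size (made up)
        weak_times = [100.0, 101.5, 103.0, 106.0, 110.0]  # seconds, fixed work per core (made up)

        print(strong_scaling_efficiency(cores, strong_times))
        print(weak_scaling_efficiency(cores, weak_times))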

    Exploring Application Performance on Emerging Hybrid-Memory Supercomputers

    Next-generation supercomputers will feature more hierarchical and heterogeneous memory systems, with different memory technologies working side by side. A critical question is whether, at large scale, existing HPC applications and emerging data-analytics workloads will see performance improvements or degradation on these systems. We propose a systematic and fair methodology to identify the trend of application performance on emerging hybrid-memory systems. We model the memory system of next-generation supercomputers as a combination of "fast" and "slow" memories. We then analyze the performance and dynamic execution characteristics of a variety of workloads, from traditional scientific applications to emerging data analytics, to compare traditional and hybrid-memory systems. Our results show that data-analytics applications can clearly benefit from the new system design, especially at large scale. Moreover, hybrid-memory systems do not penalize traditional scientific applications, which may also show performance improvements.
    Comment: 18th International Conference on High Performance Computing and Communications, IEEE, 201
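
    One way to reason about the "fast" plus "slow" memory abstraction is a toy bandwidth model: if a workload serves some fraction of its traffic from fast memory, its effective bandwidth is a harmonic blend of the two tiers. The sketch below is an assumption-laden illustration, not the paper's methodology, and the bandwidth figures are invented.

        # Toy model of a hybrid-memory system: effective bandwidth as a function of
        # the fraction of memory traffic served by the fast tier. All numbers are
        # assumptions for illustration.

        def effective_bandwidth(fast_fraction, bw_fast, bw_slow):
            # Time per byte is the traffic-weighted average over the two tiers.
            time_per_byte = fast_fraction / bw_fast + (1.0 - fast_fraction) / bw_slow
            return 1.0 / time_per_byte

        BW_FAST = 400.0  # GB/s, e.g. on-package memory (assumed)
        BW_SLOW = 90.0   # GB/s, e.g. off-package DRAM (assumed)

        for frac in (0.0, 0.5, 0.9, 1.0):
            bw = effective_bandwidth(frac, BW_FAST, BW_SLOW)
            print(f"fast-tier traffic fraction {frac:.1f}: {bw:6.1f} GB/s effective")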

    MADmap: A Massively Parallel Maximum-Likelihood Cosmic Microwave Background Map-Maker

    MADmap is a software application used to produce maximum-likelihood images of the sky from time-ordered data that include correlated noise, such as those gathered by Cosmic Microwave Background (CMB) experiments. It works efficiently on platforms ranging from small workstations to the most massively parallel supercomputers. Map-making is a critical step in the analysis of all CMB data sets, and the maximum-likelihood approach is the most accurate and widely applicable algorithm; however, it is a computationally challenging task. This challenge will only increase with the next generation of ground-based, balloon-borne, and satellite CMB polarization experiments. The faintness of the B-mode signal that these experiments seek to measure requires them to gather enormous data sets. MADmap is already being run on up to O(10^11) time samples, O(10^8) pixels, and O(10^4) cores, with ongoing work to scale to the next generation of data sets and supercomputers. We describe MADmap's algorithm, based around a preconditioned conjugate gradient solver, fast Fourier transforms, and sparse matrix operations. We highlight MADmap's ability to address problems typically encountered in the analysis of realistic CMB data sets and describe its application to simulations of the Planck and EBEX experiments. The massively parallel and distributed implementation is detailed, and scaling complexities are given for the resources required. MADmap is capable of analysing the largest data sets now being collected on computing resources currently available, and we argue that, given Moore's Law, MADmap will be capable of reducing the most massive projected data sets.
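
    The maximum-likelihood map-making step amounts to solving the normal equations (A^T N^-1 A) m = A^T N^-1 d for the map m, where A is the pointing matrix, N the noise covariance, and d the time-ordered data. The Python sketch below solves a toy version with white (diagonal) noise and a Jacobi-preconditioned conjugate gradient; MADmap itself handles correlated noise with FFT-based filters and runs distributed at massive scale.

        # Toy maximum-likelihood map-maker: PCG solve of (A^T N^-1 A) m = A^T N^-1 d.
        # White noise only; correlated noise would require Toeplitz/FFT machinery.
        import numpy as np
        from scipy.sparse import csr_matrix
        from scipy.sparse.linalg import cg, LinearOperator

        rng = np.random.default_rng(0)
        n_samples, n_pixels = 10_000, 100

        # Pointing matrix A: each time sample observes exactly one pixel.
        hit_pix = rng.integers(0, n_pixels, size=n_samples)
        A = csr_matrix((np.ones(n_samples), (np.arange(n_samples), hit_pix)),
                       shape=(n_samples, n_pixels))

        true_map = rng.normal(size=n_pixels)
        sigma2 = 0.25  # white-noise variance
        d = A @ true_map + rng.normal(scale=np.sqrt(sigma2), size=n_samples)

        AtNiA = (A.T @ A) / sigma2   # A^T N^-1 A for diagonal N
        rhs = (A.T @ d) / sigma2     # A^T N^-1 d

        # Jacobi preconditioner: invert the diagonal (noise-weighted hit counts).
        diag = AtNiA.diagonal()
        M = LinearOperator((n_pixels, n_pixels), matvec=lambda x: x / diag)

        m, info = cg(AtNiA, rhs, M=M)
        print("converged:", info == 0, "| max map error:", float(np.abs(m - true_map).max()))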

    Simulating the weak death of the neutron in a femtoscale universe with near-Exascale computing

    The fundamental particle theory called Quantum Chromodynamics (QCD) dictates everything about protons and neutrons, from their intrinsic properties to the interactions that bind them into atomic nuclei. Quantities that cannot be fully resolved through experiment, such as the neutron lifetime (whose precise value is important for the existence of the light atomic elements that make the sun shine and life possible), may be understood through numerical solutions to QCD. We directly solve QCD using Lattice Gauge Theory and calculate nuclear observables such as the neutron lifetime. We have developed an improved algorithm that exponentially decreases the time-to-solution and applied it on the new CORAL supercomputers, Sierra and Summit. We use run-time autotuning to distribute GPU resources, achieving 20% of peak performance at low node count. We also developed optimal application mapping through a job manager, which allows CPU and GPU jobs to be interleaved, yielding 15% of peak performance when deployed across large fractions of CORAL.
    Comment: 2018 Gordon Bell Finalist: 9 pages, 9 figures; v2: fixed 2 typos and appended acknowledgement
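
    The run-time autotuning described here follows a generic pattern: time each candidate kernel configuration once at startup and cache the fastest. The sketch below shows that pattern with a stand-in kernel and a hypothetical block-size parameter; it is not the CORAL/lattice-QCD tuning machinery itself.

        # Generic run-time autotuner: benchmark candidate configurations, keep the
        # fastest. The kernel and its block_size knob are placeholders.
        import time

        def kernel(data, block_size):
            # Stand-in compute kernel whose speed varies with a tunable parameter.
            total = 0.0
            for i in range(0, len(data), block_size):
                total += sum(data[i:i + block_size])
            return total

        def autotune(candidates, data, repeats=3):
            best_cfg, best_time = None, float("inf")
            for cfg in candidates:
                start = time.perf_counter()
                for _ in range(repeats):
                    kernel(data, cfg)
                elapsed = (time.perf_counter() - start) / repeats
                if elapsed < best_time:
                    best_cfg, best_time = cfg, elapsed
            return best_cfg, best_time

        data = list(range(200_000))
        cfg, t = autotune([64, 256, 1024, 4096], data)
        print(f"selected block_size={cfg} ({t * 1e3:.2f} ms per call)")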