25 research outputs found

    Periscope: Advanced Techniques for Performance Analysis

    Get PDF

    John von Neumann Institute for Computing published in Periscope: Advanced Techniques for Performance Analysis

    No full text
    Permission to make digital or hard copies of portions of this work for personal or classroom use is granted provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise requires prior specific permission by the publisher mentioned above

    Early Performance Assessment of the ThunderX2 Processor for Lattice Based Simulations

    No full text
    This paper presents an early performance assessment of the ThunderX2, the most recent Arm-based multi-core processor designed for HPC applications. We use as benchmarks well known stencil-based LBM and LQCD algorithms, widely used to study respectively fluid flows, and interaction properties of elementary particles. We run benchmark kernels derived from OpenMP production codes, we measure performance as a function of the number of threads, and evaluate the impact of different choices for data layout. We then analyze our results in the framework of the roofline model, and compare with the performances measured on mainstream Intel Skylake processors. We find that these Arm based processors reach levels of performance competitive with those of other state-of-the-art options
    corecore