5 research outputs found

    Characteristics of Workloads Used in High Performance and Technical Computing

    This paper provides a systematic comparison of various characteristics of computationally-intensive workloads. Our analysis focuses on standard HPC benchmarks and representative applications. For the selected workloads we provide a wide range of characterizations based on instruction tracing and hardware counter measurements. Each workload is analyzed at the instruction level by comparing the dynamic distribution of executed instructions. We also analyze memory access patterns, including various aspects of cache utilization and locality properties of address distributions. Since prefetching plays an important role in the performance of computational workloads, we explore the prefetching potential; for parallel workloads, we also study the sharing properties of memory accesses. For completeness, HPC workloads are compare

    A Case Study in Top-Down Performance Estimation for a Large-Scale Parallel Application

    This work presents a general methodology for estimating the performance of an HPC workload when running on a future hardware architecture. Further, it demonstrates the methodology by estimating the performance of a significant scientific application, the Gyrokinetic Toroidal Code (GTC), when executing on Sun’s proposed next-generation petascale computer architecture. For GTC, we identify the important phases of the iteration and perform low-level analysis that includes instruction tracing and component simulations of processor and memory systems. Low-level analysis is complemented with scalability estimates based on modeling MPI, OpenMP and I/O activity in the code. The work’s approach permits accurate end-to-end performance projections from the microarchitecture level to the petascale.