
    Scalable performance analysis of exascale MPI programs through signature-based clustering algorithms

    Extreme-scale computing poses a number of challenges to application performance. Developers need to study application behavior by collecting detailed information with the help of tracing toolsets to determine shortcomings. But applications are not the only ones that are "scalability challenged": current tracing toolsets also fall short of exascale requirements for low background overhead, since trace collection for each execution entity is becoming infeasible. One effective solution is to cluster processes with the same behavior into groups. Instead of collecting performance information from each individual node, this information can be collected from just a set of representative nodes. This work contributes a fast, scalable, signature-based clustering algorithm that clusters processes exhibiting similar execution behavior. Unlike prior work based on statistical clustering, our approach produces precise results nearly without loss of events or accuracy. The proposed algorithm combines low overhead at the clustering level with log(P) time complexity, and it splits the merge process to make tracing suitable for extreme-scale computing. Overall, this multi-level precise clustering based on signatures further generalizes to a novel multi-metric clustering technique with unprecedented low overhead.
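
    The idea of grouping processes by signature and tracing only one representative per group can be illustrated with a minimal sketch. This is not the paper's algorithm: the hash-based `signature`, the pairwise `merge`, and the input format (a dict from rank to a list of event names) are all assumptions made for illustration, and exact hash equality only captures identical behavior, not the "similar" behavior the abstract mentions. The pairwise merge simply shows why a tree-shaped reduction over P processes needs about log2(P) rounds.

```python
import hashlib


def signature(events):
    """Hash a rank's ordered event names into a fixed-size signature (hypothetical scheme)."""
    h = hashlib.sha256()
    for name in events:
        h.update(name.encode("utf-8"))
        h.update(b"\x00")  # separator so ["ab", "c"] and ["a", "bc"] differ
    return h.hexdigest()


def merge(a, b):
    """Merge two cluster maps of the form signature -> list of ranks."""
    out = {sig: list(ranks) for sig, ranks in a.items()}
    for sig, ranks in b.items():
        out.setdefault(sig, []).extend(ranks)
    return out


def cluster(traces):
    """Cluster ranks by signature using a pairwise (tree-shaped) merge.

    traces: dict mapping rank -> list of event names (assumed input format).
    Returns (clusters, rounds), where rounds is the number of merge levels.
    """
    maps = [{signature(traces[r]): [r]} for r in sorted(traces)]
    rounds = 0
    while len(maps) > 1:
        # Each round halves the number of partial maps, as in a reduction tree.
        maps = [
            merge(maps[i], maps[i + 1]) if i + 1 < len(maps) else maps[i]
            for i in range(0, len(maps), 2)
        ]
        rounds += 1
    return (maps[0] if maps else {}), rounds


def representatives(clusters):
    """Pick the lowest rank of each cluster as the one to keep tracing."""
    return sorted(min(ranks) for ranks in clusters.values())


if __name__ == "__main__":
    # Toy traces for 8 ranks: rank 0 differs, the rest alternate by parity.
    traces = {0: ["MPI_Bcast", "MPI_Send"]}
    for r in range(1, 8):
        traces[r] = ["MPI_Send", "MPI_Recv"] if r % 2 == 0 else ["MPI_Recv", "MPI_Send"]
    clusters, rounds = cluster(traces)
    print(representatives(clusters), rounds)
```

    In a real MPI setting the merge would run across processes (for example as a reduction) rather than over a local list; here it runs sequentially only to keep the sketch self-contained.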