5 research outputs found
Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft
Analytic performance models are essential for understanding the performance
characteristics of loop kernels, which consume a major part of CPU cycles in
computational science. Starting from a validated performance model one can
infer the relevant hardware bottlenecks and promising optimization
opportunities. Unfortunately, analytic performance modeling is often tedious
even for experienced developers since it requires in-depth knowledge about the
hardware and how it interacts with the software. We present the "Kerncraft"
tool, which eases the construction of analytic performance models for streaming
kernels and stencil loop nests. Starting from the loop source code, the problem
size, and a description of the underlying hardware, Kerncraft can ideally
predict the single-core performance and scaling behavior of loops on multicore
processors using the Roofline or the Execution-Cache-Memory (ECM) model. We
describe the operating principles of Kerncraft with its capabilities and
limitations, and we show how it may be used to quickly gain insights by
accelerated analytic modeling.Comment: 11 pages, 4 figures, 8 listing