An Application-Specific Microbenchmark for Memory Access by Lakshminarasimhan, Mahesh
Boise State University
ScholarWorks
2018 Graduate Student Showcase Conferences
4-30-2018
An Application-Specific Microbenchmark for Memory Access
Mahesh Lakshminarasimhan, Dr. Catherine Olschanowsky
Department of Computer Science
VARIATION IN CUMULATIVE MEMORY BANDWIDTH DUE TO MEMORY ACCESS PATTERNS
MOTIVATION
Benchmarking high-performance computing systems
is essential to maximizing scientific application
performance. We present a configurable
microbenchmark that informs memory optimization
strategies for domain scientists.
APPROACH
We explore how memory bandwidth varies with
working-set size across the levels of the memory
hierarchy, for synthetic and realistic memory
access patterns under different thread configurations.
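The measurement described above can be sketched as follows. This is a minimal single-threaded illustration, not the poster's benchmark: the helper names `triad_bytes` and `measure_triad_bandwidth` are hypothetical, timing uses the standard C `clock()` function (CPU time, which is assumed to track elapsed time for one thread), and the byte count assumes two loads and one store per element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Bytes moved by one triad sweep over n doubles: two loads and one store
 * per element (write-allocate traffic is ignored in this sketch). */
static double triad_bytes(size_t n) {
    return 3.0 * (double)n * (double)sizeof(double);
}

/* Hypothetical helper: run `reps` triad sweeps over arrays of n doubles and
 * return the achieved bandwidth in GB/s, or -1.0 if allocation fails. */
static double measure_triad_bandwidth(size_t n, int reps) {
    assert(n > 0 && reps > 0);
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1.0;
    }
    for (size_t i = 0; i < n; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        for (size_t i = 0; i < n; i++) {
            a[i] = b[i] + 3.0 * c[i];
        }
    }
    clock_t stop = clock();

    /* Read one result so the timed loop is not trivially dead code. */
    volatile double sink = a[n / 2];
    (void)sink;
    free(a); free(b); free(c);

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0) {
        seconds = 1.0 / CLOCKS_PER_SEC;  /* run was below timer resolution */
    }
    return triad_bytes(n) * (double)reps / seconds / 1e9;
}
```

Calling `measure_triad_bandwidth` for a sequence of `n` values, from a few KiB of data up to sizes well beyond the last-level cache, would trace a bandwidth curve across the memory hierarchy. Small working sets need a large `reps` value for the timer to resolve the run.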
OBSERVATIONS
• Access patterns matter: the synthetic
triad outperforms the more realistic
multidimensional stencil patterns.
• Stencil operations, especially the
two-dimensional stencil, struggle to scale
across all levels of the memory hierarchy.
• Some working-set sizes are problematic
and consistently yield low performance
across different access patterns.
• Variation in performance increases
once the working set no longer fits in cache.
MEMORY HIERARCHY IN MODERN COMPUTERS
PERFORMANCE VARIATION AMONG THREADS – TRIAD
We examine how memory bandwidth varies from thread to thread when the
synthetic triad operation is executed on 14 and 28 cores.
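One way to quantify the per-thread variation described above is to summarize the bandwidth each thread reports. The sketch below assumes the per-thread bandwidths have already been collected into an array; the type `bw_spread` and the function `summarize_thread_bandwidth` are hypothetical names, and the relative range (max minus min, divided by the mean) is one illustrative spread metric, not necessarily the one used on the poster.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of per-thread bandwidth values (all in the same unit, e.g. GB/s). */
typedef struct {
    double min;
    double max;
    double mean;
    double rel_range;  /* (max - min) / mean; 0 means all threads agree */
} bw_spread;

/* Hypothetical helper: summarize bandwidths measured by nthreads threads. */
static bw_spread summarize_thread_bandwidth(const double *bw, size_t nthreads) {
    assert(bw != NULL && nthreads > 0);
    bw_spread s = { bw[0], bw[0], 0.0, 0.0 };
    double sum = 0.0;
    for (size_t t = 0; t < nthreads; t++) {
        if (bw[t] < s.min) s.min = bw[t];
        if (bw[t] > s.max) s.max = bw[t];
        sum += bw[t];
    }
    s.mean = sum / (double)nthreads;
    if (s.mean > 0.0) {
        s.rel_range = (s.max - s.min) / s.mean;
    }
    return s;
}
```

Comparing `rel_range` between the 14-core and 28-core runs, and between cache-resident and memory-resident working sets, would give a single number per configuration for how unevenly the threads perform.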
CUSTOM BENCHMARK
1. THE SYNTHETIC TRIAD KERNEL
2. ONE-DIMENSIONAL STENCIL OPERATION
3. TWO-DIMENSIONAL STENCIL OPERATION
CONFIGURATION <triad_run.c>
BENCHMARK TEMPLATE
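The three access patterns named above can be sketched in C as follows. The poster's own kernel code did not survive extraction, so the shapes here are assumptions: the triad follows the familiar STREAM form, the one-dimensional stencil is a three-point average, and the two-dimensional stencil is a five-point average on a row-major grid. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Triad in the STREAM style: a[i] = b[i] + scalar * c[i]. */
static void triad(double *a, const double *b, const double *c,
                  double scalar, size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

/* Assumed 1D three-point stencil: each interior point becomes the average
 * of itself and its two neighbours. Boundary points are left untouched. */
static void stencil_1d(double *out, const double *in, size_t n) {
    assert(n >= 3);
    for (size_t i = 1; i + 1 < n; i++) {
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
    }
}

/* Assumed 2D five-point stencil on a row-major grid with ny rows and nx
 * columns. The north and south neighbours are nx elements away in memory,
 * so this pattern touches three rows at once. */
static void stencil_2d(double *out, const double *in, size_t nx, size_t ny) {
    assert(nx >= 3 && ny >= 3);
    for (size_t j = 1; j + 1 < ny; j++) {
        for (size_t i = 1; i + 1 < nx; i++) {
            size_t c = j * nx + i;
            out[c] = (in[c] + in[c - 1] + in[c + 1]
                      + in[c - nx] + in[c + nx]) / 5.0;
        }
    }
}
```

The triad streams through three arrays with unit stride, while the stencils reuse neighbouring elements and, in the two-dimensional case, access rows that are `nx` elements apart. That difference in locality is the kind of effect the benchmark is designed to expose.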
(a) In L1/L2 Cache
(b) In Main Memory
