Quantifying the Effect of Matrix Structure on Multithreaded Performance of the SpMV Kernel
Sparse matrix-vector multiplication (SpMV) is the core operation in many
common network and graph analytics, but poor performance of the SpMV kernel
handicaps these applications. This work quantifies the effect of matrix
structure on SpMV performance, using Intel's VTune tool for the Sandy Bridge
architecture. Two types of sparse matrices are considered: finite difference
(FD) matrices, which are structured, and R-MAT matrices, which are
unstructured. Analysis of cache behavior and prefetcher activity reveals that
the SpMV kernel performs far worse with R-MAT matrices than with FD matrices,
due to the difference in matrix structure. To address the problems caused by
unstructured matrices, novel architecture improvements are proposed.
Comment: 6 pages, 7 figures. IEEE HPEC 201
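For readers unfamiliar with the kernel, the following is a minimal sketch of SpMV, not the code measured in the paper. It assumes the compressed sparse row (CSR) storage format, which the abstract does not name, and the helper names spmv_csr and fd_1d_csr are hypothetical. The inner loop reads x through an index array; for a finite difference matrix those indices stay close to the diagonal, whereas for an unstructured matrix they can point anywhere in x, which is the kind of access pattern that cache and prefetcher analysis would expose.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format)."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect read of x: its locality depends on the matrix structure.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def fd_1d_csr(n):
    """Build the n-by-n 1-D finite difference (tridiagonal) matrix in CSR form."""
    row_ptr, col_idx, values = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:  # skip stencil entries that fall outside the matrix
                col_idx.append(j)
                values.append(v)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, values


row_ptr, col_idx, values = fd_1d_csr(4)
y = spmv_csr(row_ptr, col_idx, values, [1.0, 1.0, 1.0, 1.0])
```

In this structured example every column index in row i lies within one position of i, so consecutive rows reuse neighboring entries of x.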
embedded DRAM, cache hierarchy, pageable memory
© Copyright Hewlett-Packard Company 2000
Recent architectures in academia and industry have explored placing multiple
processors on a single chip, but a consensus has not emerged on the memory
architecture. The recent availability of embedded DRAM (EDRAM) has further
complicated the formula. In this investigation, we present a new and
comprehensive comparison of four very different memory technologies in the
same framework: SRAM cache, SRAM configured as pageable memory, EDRAM
configured as cache, and EDRAM configured as pageable memory. In addition,
these experiments investigate tradeoffs between two levels of on-chip memory,
given constant silicon area: as the level one capacity increases, the level
two capacity decreases. Having four processors on a single die, each with its
own set of level one caches, helps exaggerate the effective memory tradeoffs