
    Quantifying the Effect of Matrix Structure on Multithreaded Performance of the SpMV Kernel

    Sparse matrix-vector multiplication (SpMV) is the core operation in many common network and graph analytics, but poor performance of the SpMV kernel handicaps these applications. This work quantifies the effect of matrix structure on SpMV performance, using Intel's VTune tool for the Sandy Bridge architecture. Two types of sparse matrices are considered: finite difference (FD) matrices, which are structured, and R-MAT matrices, which are unstructured. Analysis of cache behavior and prefetcher activity reveals that the SpMV kernel performs far worse with R-MAT matrices than with FD matrices, due to the difference in matrix structure. To address the problems caused by unstructured matrices, novel architecture improvements are proposed.
    Comment: 6 pages, 7 figures. IEEE HPEC 201

    embedded DRAM, cache hierarchy, pageable memory

    © Copyright Hewlett-Packard Company 2000

    Recent architectures in academia and industry have explored placing multiple processors on a single chip, but a consensus has not emerged on the memory architecture. The recent availability of embedded DRAM (EDRAM) has further complicated the formula. In this investigation, we present a new and comprehensive comparison of four very different memory technologies in the same framework: SRAM cache, SRAM configured as pageable memory, EDRAM configured as cache, and EDRAM configured as pageable memory. In addition, these experiments investigate tradeoffs between two levels of on-chip memory, given constant silicon area: as the level one capacity increases, the level two capacity decreases. Having four processors on a single die, each with its own set of level one caches, helps exaggerate the effective memory tradeoffs