2 research outputs found

    Parallel Sparse Matrix-Matrix Multiplication

    Get PDF
    The thesis investigates the BLAS-3 routine of sparse matrix-matrix multiplication (SpGEMM) based on the outer product method. Sev- eral algorithmic approaches have been implemented and empirically an- alyzed. The experiments have shown that an algorithm presented by Gustavson [22] outperforms other alternatives. In this work we propose optimization techniques that improve the scalability and the cache efficiency of the Gustavson’s algorithm for large matrices. Our approach succeeded to reduce the cache misses by more than a factor of five and to improve the net running time by 30% with some instances. The thesis also presents an algorithm for flops estima- tion, which can be used to determine an upper bound for the density of the result matrix. Furthermore, the work analyzes and empirically evaluates techniques for parallelization of the multiplication in a shared memory model by using Intel TBB and OpenMP. We investigate the cache efficiency of the algorithm in a parallel setting and compare several approaches for load balancing of the computation
    corecore