2 research outputs found

    The Algorithms for FPGA Implementation of Sparse Matrices Multiplication

    Compared with dense matrix multiplication, the real-world CPU performance of sparse matrix multiplication is roughly 5 to 100 times lower when expressed in GFLOPS. For sparse matrices, microprocessors spend most of their time comparing matrix indices rather than performing floating-point multiply and add operations. For 16-bit integer operations such as index comparisons, the computational power of an FPGA significantly surpasses that of a CPU. Consequently, this paper presents a novel theoretical study of how the matrix sparsity factor influences the ratio of index-comparison workload to floating-point workload. As a result, a novel FPGA architecture for sparse matrix-matrix multiplication is presented in which index comparison and floating-point operations are separated. We also verified the idea in practice, and the initial implementation results are very promising. To further decrease the hardware resources required by the floating-point multiplier, a reduced-width multiplication is proposed for cases where IEEE-754 standard compliance is not required.
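
The index-comparison overhead described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's FPGA architecture: it assumes two sparse vectors stored as sorted index lists with matching value lists, and the function name `sparse_dot` and the way comparisons are counted (one per merge step) are illustrative choices only.

```python
def sparse_dot(idx_a, val_a, idx_b, val_b):
    """Dot product of two sparse vectors stored as sorted index/value lists.

    Returns (result, index_comparisons, flops) so that the ratio of
    index-comparison work to floating-point work can be inspected.
    Counting convention (an assumption): one comparison per merge step,
    two flops (multiply and add) per matching index pair.
    """
    i = j = 0
    result = 0.0
    comparisons = 0
    flops = 0
    while i < len(idx_a) and j < len(idx_b):
        comparisons += 1
        if idx_a[i] == idx_b[j]:
            # Indices match: this is the only branch doing floating-point work.
            result += val_a[i] * val_b[j]
            flops += 2
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:
            i += 1
        else:
            j += 1
    return result, comparisons, flops


result, comparisons, flops = sparse_dot(
    [0, 3, 7, 9], [1.0, 2.0, 3.0, 4.0],
    [1, 3, 8, 9], [5.0, 6.0, 7.0, 8.0],
)
print(result, comparisons, flops)
```

In this small case only two of the six merge steps lead to a multiply-add, which is the kind of imbalance that motivates separating index comparison from floating-point arithmetic.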

    On the usage of 16 bit indices in recursively stored sparse matrices

    In our earlier work, we investigated the feasibility of using recursive partitioning in basic (BLAS-oriented) sparse matrix computations on multi-core cache-based computers. Following encouraging experimental results for the SpMV and SpSV operations, here we proceed to tune the storage format. To limit memory bandwidth overhead, we introduce shorter (16-bit) indices in the leaf submatrices (at the end of the recursion). Experimental results obtained with the proposed approach on 8-core machines show speed improvements for sparse matrix-vector multiplication.
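
The idea of 16-bit indices in leaf submatrices can be sketched in Python as follows. This is not the authors' storage format: the class `LeafBlock`, its coordinate-style layout, and the method `spmv_add` are hypothetical. The sketch assumes each leaf records the global offset of its top-left corner, so entries only need local indices that fit in 16 bits.

```python
from array import array


class LeafBlock:
    """Leaf submatrix in coordinate form with 16-bit local indices.

    Hypothetical layout: the leaf stores its global row/column offset once,
    and every nonzero stores only its position relative to that offset.
    """

    def __init__(self, row_off, col_off, entries):
        # entries: iterable of (global_row, global_col, value)
        self.row_off = row_off
        self.col_off = col_off
        self.rows = array("H")  # unsigned short, 2 bytes on common platforms
        self.cols = array("H")
        self.vals = array("d")  # double-precision values
        for r, c, v in entries:
            # append raises OverflowError if a local index does not fit.
            self.rows.append(r - row_off)
            self.cols.append(c - col_off)
            self.vals.append(v)

    def spmv_add(self, x, y):
        """Accumulate this leaf's contribution to y = A @ x in place."""
        for r, c, v in zip(self.rows, self.cols, self.vals):
            y[self.row_off + r] += v * x[self.col_off + c]


leaf = LeafBlock(100000, 200000, [(100000, 200001, 2.0), (100002, 200000, 3.0)])
x = [0.0] * 200002
x[200000], x[200001] = 1.0, 10.0
y = [0.0] * 100003
leaf.spmv_add(x, y)
print(y[100000], y[100002])
```

Storing the offset once per leaf means the global indices here, which exceed 65535, are never stored per entry, so index traffic per nonzero is halved relative to 32-bit indices.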