
2 research outputs found

The Algorithms for FPGA Implementation of Sparse Matrices Multiplication

Authors: Jamro Ernest, Pabiś Tomasz, Russek Paweł, Wiatr Kazimierz
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 04/02/2015

In comparison to dense matrix multiplication, the real performance of sparse matrix multiplication on a CPU is roughly 5 to 100 times lower when expressed in GFLOPs. For sparse matrices, microprocessors spend most of their time comparing matrix indices rather than performing floating-point multiply and add operations. For 16-bit integer operations, such as index comparisons, the computational power of an FPGA significantly surpasses that of a CPU. Consequently, this paper presents a novel theoretical study of how the matrix sparsity factor influences the ratio of index-comparison workload to floating-point workload. As a result, a novel FPGA architecture for sparse matrix-matrix multiplication is presented in which index comparison and floating-point operations are separated. We also verified our idea in practice, and the initial implementation results are very promising. To further decrease the hardware resources required by the floating-point multiplier, reduced-width multiplication is proposed for cases where IEEE-754 compliance is not required.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)
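
The abstract's claim that index comparisons outweigh floating-point work can be illustrated in software. The following is a minimal sketch, not the paper's FPGA architecture: it assumes a merge-style dot product of one sparse row and one sparse column with sorted indices, and the function name `sparse_dot` and its counters are hypothetical.

```python
def sparse_dot(idx_a, val_a, idx_b, val_b):
    """Merge-style dot product of two sparse vectors with sorted indices.

    Returns (result, index_comparison_steps, multiply_adds).
    Illustrative only; the counters exist to expose the workload ratio.
    """
    i = j = 0
    total = 0.0
    comparison_steps = 0
    multiply_adds = 0
    while i < len(idx_a) and j < len(idx_b):
        comparison_steps += 1          # one index comparison step per iteration
        if idx_a[i] == idx_b[j]:
            total += val_a[i] * val_b[j]   # floating-point work only on a match
            multiply_adds += 1
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:
            i += 1
        else:
            j += 1
    return total, comparison_steps, multiply_adds


result, steps, fmas = sparse_dot(
    [0, 3, 7, 9], [1.0, 2.0, 3.0, 4.0],
    [1, 3, 4, 9], [5.0, 6.0, 7.0, 8.0],
)
print(result, steps, fmas)
```

In this small example only two of the six index comparison steps lead to a multiply-add, which is the kind of imbalance the abstract describes.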

On the usage of 16 bit indices in recursively stored sparse matrices

Authors: Filippone S, Martone M, Paprzycki M, Tucci S
Publication venue: IEEE Computer Society
Publication date: 01/09/2010

In our earlier work, we investigated the feasibility of using recursive partitioning in basic (BLAS-oriented) sparse matrix computations on multi-core cache-based computers. Following encouraging experimental results obtained for SpMV and SpSV operations, here we proceed to tune the storage format. To limit the memory bandwidth overhead, we introduce shorter (16-bit) indices in leaf submatrices (at the end of the recursion). Experimental results obtained for the proposed approach on 8-core machines show speed improvements when performing sparse matrix-vector multiplication.

Crossref

ART
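
The idea of 16-bit indices in leaf submatrices from the abstract above can be sketched as follows. This is an assumption-laden illustration, not the authors' storage format: it assumes a coordinate-style leaf block that keeps one global row and column offset and stores each entry's position relative to that offset. The class name `LeafBlock` and its methods are hypothetical.

```python
from array import array


class LeafBlock:
    """Hypothetical leaf submatrix in coordinate form with short local indices."""

    def __init__(self, row_offset, col_offset, entries):
        # entries: iterable of (global_row, global_col, value)
        self.row_offset = row_offset
        self.col_offset = col_offset
        # Typecode 'H' is an unsigned short (at least 16 bits); appending a
        # value outside its range raises OverflowError, so a block that is
        # too large for short indices is rejected rather than silently wrapped.
        self.rows = array('H')
        self.cols = array('H')
        self.vals = array('d')
        for r, c, v in entries:
            self.rows.append(r - row_offset)
            self.cols.append(c - col_offset)
            self.vals.append(v)

    def spmv_add(self, x, y):
        """Accumulate y += A_leaf * x, rebuilding global indices from offsets."""
        for r, c, v in zip(self.rows, self.cols, self.vals):
            y[self.row_offset + r] += v * x[self.col_offset + c]


block = LeafBlock(100000, 200000, [
    (100000, 200000, 2.0),
    (100003, 200005, 3.0),
    (100003, 200001, 4.0),
])
x = [1.0] * 200010
y = [0.0] * 100010
block.spmv_add(x, y)
print(y[100000], y[100003])
```

The global positions here exceed the 16-bit range, yet each stored index fits in a short integer because only the offset within the leaf is kept, which is how shorter indices can reduce the memory traffic per nonzero.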