Exploiting Parallelism in Matrix-Computation Kernels for Symmetric Multiprocessor Systems -- Matrix-Multiplication and Matrix-Addition Algorithm Optimizations by Software Pipelining and Threads Allocation

By Paolo D’Alberto, Marco Bodrato and Alexandru Nicolau

Abstract

We present a simple and efficient methodology for the development, tuning, and installation of matrix algorithms such as the hybrid Strassen’s and Winograd’s fast matrix multiply, or their combination with the 3M algorithm for complex matrices. Here, hybrid means a recursive algorithm such as Strassen’s that recurses until a highly tuned BLAS matrix multiplication offers a performance advantage. We investigate how modern symmetric multiprocessor (SMP) architectures present old and new challenges that can be addressed by combining algorithm design with careful and natural exploitation of parallelism at the function level, through optimizations such as function-call parallelism, function percolation, and function software pipelining. We make three contributions. First, we present a performance overview for double and double complex precision matrices on state-of-the-art SMP systems. Second, we introduce new algorithm implementations: a variant of the 3M algorithm and two new schedules of Winograd’s matrix multiplication (achieving up to 20% speedup with respect to regular matrix multiplication); of the two Winograd schedules, one is designed to minimize the number of matrix additions and the other to minimize the computation latency of matrix additions. Third, we apply software pipelining and threads allocation to all the algorithms and w
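To make the notion of a hybrid algorithm concrete, the following is a minimal sketch and not the authors' implementation. It uses Strassen's classic seven-product recursion (not the Winograd schedules the paper introduces) and switches to a classical multiply once the problem size drops to a cutoff. The names `hybrid_strassen`, `classical_mm`, and the `cutoff` value are hypothetical. The pure-Python `classical_mm` only stands in for a tuned BLAS GEMM call, and the sketch assumes square matrices.

```python
def add(A, B):
    """Element-wise matrix addition."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    """Element-wise matrix subtraction."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def classical_mm(A, B):
    """Classical O(n^3) multiply; a stand-in (assumption) for a tuned BLAS GEMM."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def split(M):
    """Split a square matrix of even size into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def join(C11, C12, C21, C22):
    """Reassemble four quadrants into one matrix."""
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def hybrid_strassen(A, B, cutoff=2):
    """Recurse with Strassen's scheme until size <= cutoff (or odd), then go classical."""
    n = len(A)
    if n <= cutoff or n % 2:
        return classical_mm(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products instead of the eight of the classical block scheme.
    M1 = hybrid_strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = hybrid_strassen(add(A21, A22), B11, cutoff)
    M3 = hybrid_strassen(A11, sub(B12, B22), cutoff)
    M4 = hybrid_strassen(A22, sub(B21, B11), cutoff)
    M5 = hybrid_strassen(add(A11, A12), B22, cutoff)
    M6 = hybrid_strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = hybrid_strassen(sub(A12, A22), add(B21, B22), cutoff)
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    return join(C11, C12, C21, C22)
```

In this sketch the seven products `M1` to `M7` do not depend on one another, so they are the kind of function-level work that could in principle be issued in parallel. The matrix additions around them are the part whose count or latency a schedule can try to reduce.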

Topics: Categories and Subject Descriptors: G.4 [Mathematics of Computing]: Mathematical Software; D.2.8 [Software Engineering]: Metrics (complexity measures, performance measures); D.2.3 [Software Engineering]: Coding Tools and Techniques (top-down programming). Additional Key Words and Phrases: Matrix Multiplications, Fast Algorithms, Software Pipeline, Parallelism
Year: 2011
OAI identifier: oai:CiteSeerX.psu:10.1.1.190.6034
Provided by: CiteSeerX