    The border support rank of two-by-two matrix multiplication is seven

    We show that the border support rank of the tensor corresponding to two-by-two matrix multiplication is seven over the complex numbers. We do this by constructing two polynomials that vanish on all complex tensors with format four-by-four-by-four and border rank at most six, but that do not vanish simultaneously on any tensor with the same support as the two-by-two matrix multiplication tensor. This extends the work of Hauenstein, Ikenmeyer, and Landsberg. We also give two proofs that the support rank of the two-by-two matrix multiplication tensor is seven over any field: one proof using a result of De Groote saying that the decomposition of this tensor is unique up to sandwiching, and another proof via the substitution method. These results answer a question asked by Cohn and Umans. Studying the border support rank of the matrix multiplication tensor is relevant to the design of matrix multiplication algorithms: by a construction of Cohn and Umans, upper bounds on it yield upper bounds on the computational complexity of matrix multiplication. Moreover, support rank has applications in quantum communication complexity.
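
    For orientation, the objects involved can be stated concretely (standard definitions, summarized here rather than quoted from the paper): the two-by-two matrix multiplication tensor, viewed as a four-by-four-by-four tensor, and the support rank of a tensor $T$.

        % The 2x2 matrix multiplication tensor, written in terms of the
        % standard matrix units e_{ij}; grouping the indices (i,j), (j,k),
        % (k,i) exhibits it as a 4x4x4 tensor:
        \[
          \langle 2,2,2 \rangle \;=\; \sum_{i,j,k=1}^{2} e_{ij} \otimes e_{jk} \otimes e_{ki}.
        \]
        % Support rank: the minimum rank over tensors with the same support,
        % i.e. the same pattern of nonzero coefficients; border support rank
        % replaces rank by border rank:
        \[
          \operatorname{rank}_{\mathrm{s}}(T) \;=\; \min \bigl\{ \operatorname{rank}(T') \;:\; \operatorname{supp}(T') = \operatorname{supp}(T) \bigr\}.
        \]

    Since $T$ has the same support as itself, support rank never exceeds rank; the result above says that for this tensor the relaxation gains nothing below seven.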

    Strong Scaling of Matrix Multiplication Algorithms and Memory-Independent Communication Lower Bounds

    A parallel algorithm has perfect strong scaling if its running time on P processors, including all communication costs, is linear in 1/P. Distributed-memory parallel algorithms for matrix multiplication with perfect strong scaling have only recently been found: one based on classical matrix multiplication (Solomonik and Demmel, 2011), and one based on Strassen's fast matrix multiplication (Ballard, Demmel, Holtz, Lipshitz, and Schwartz, 2012). Both algorithms scale perfectly, but only up to a certain number of processors, beyond which the inter-processor communication cost no longer decreases. We obtain memory-independent communication cost lower bounds for classical and Strassen-based distributed-memory matrix multiplication algorithms. These bounds imply that no classical or Strassen-based parallel matrix multiplication algorithm can achieve perfect strong scaling beyond the ranges already attained by the two parallel algorithms mentioned above. Both the memory-independent bounds and the strong scaling bounds generalize to other algorithms.
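
    In the notation usual for such bounds ($n$ the matrix dimension, $P$ the number of processors, $M$ the local memory size; our hedged summary, not the paper's verbatim statement), the memory-independent lower bounds on the bandwidth cost $W$ take the form

        \[
          W_{\mathrm{classical}} \;=\; \Omega\!\left( \frac{n^{2}}{P^{2/3}} \right),
          \qquad
          W_{\mathrm{Strassen}} \;=\; \Omega\!\left( \frac{n^{2}}{P^{2/\omega_{0}}} \right),
          \qquad \omega_{0} = \log_{2} 7.
        \]

    Because these decay more slowly than $1/P$, perfect strong scaling can persist only while the memory-dependent bounds dominate, i.e. for $P = O(n^{3}/M^{3/2})$ in the classical case and $P = O(n^{\omega_0}/M^{\omega_0/2})$ in the Strassen case.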

    Optical implementation of systolic array processing

    Algorithms for matrix-vector multiplication are implemented using acousto-optic cells for multiplication and input data transfer, and charge-coupled device (CCD) detector arrays for accumulation and output of the results. No two-dimensional matrix mask is required; matrix changes are implemented electronically. A system for multiplying a 50-component nonnegative real vector by a 50 by 50 nonnegative real matrix is described. Modifications for bipolar real and complex-valued processing are possible, as are extensions to matrix-matrix multiplication and to multiplication of a vector by multiple matrices.
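
    As a dataflow illustration of the scheme just described (a minimal digital sketch, not the optical hardware; the function name is ours): each clock cycle broadcasts one vector component, forms all products with the corresponding matrix column in parallel, and accumulates the results, mirroring the acousto-optic multipliers feeding a CCD accumulator array.

        import numpy as np

        def systolic_matvec(M, x):
            # Simulate a systolic matrix-vector multiply y = M @ x.
            # Per clock cycle, one vector component enters (acousto-optic
            # input transfer); every output cell multiplies it by its
            # matrix entry and adds the product to a running accumulator
            # (CCD accumulation). After M.shape[1] cycles, y is complete.
            n_rows, n_cols = M.shape
            acc = np.zeros(n_rows)
            for t in range(n_cols):
                acc += M[:, t] * x[t]  # all cells update in parallel
            return acc

        # Example matching the described system: a 50 by 50 nonnegative
        # real matrix times a 50-component nonnegative real vector.
        rng = np.random.default_rng(0)
        M = rng.random((50, 50))
        x = rng.random(50)
        assert np.allclose(systolic_matvec(M, x), M @ x)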

    Fast QMC matrix-vector multiplication

    Quasi-Monte Carlo (QMC) rules $\frac{1}{N} \sum_{n=0}^{N-1} f(\boldsymbol{y}_n A)$ can be used to approximate integrals of the form $\int_{[0,1]^s} f(\boldsymbol{y} A) \,\mathrm{d}\boldsymbol{y}$, where $A$ is a matrix and $\boldsymbol{y}$ is a row vector. Integrals of this type arise, for example, from the simulation of a normal distribution with a general covariance matrix, from the approximation of the expected value of solutions of PDEs with random coefficients, and from applications in statistics. In this paper we design QMC quadrature points $\boldsymbol{y}_0, \ldots, \boldsymbol{y}_{N-1} \in [0,1]^s$ such that, for the matrix $Y = (\boldsymbol{y}_{0}^\top, \ldots, \boldsymbol{y}_{N-1}^\top)^\top$ whose rows are the quadrature points, the matrix-vector product $Y \boldsymbol{a}^\top$ with $\boldsymbol{a} \in \mathbb{R}^s$ can be computed with the fast Fourier transform in $\mathcal{O}(N \log N)$ operations plus at most $s-1$ extra additions. The proposed method applies to lattice rules, polynomial lattice rules, and a certain type of Korobov $p$-set. The approach is illustrated by three numerical experiments: the first considers the generation of points with a normal distribution and a general covariance matrix, the second applies QMC to high-dimensional, affine-parametric, elliptic partial differential equations with uniformly distributed random coefficients, and the third addresses finite-element discretizations of elliptic partial differential equations with high-dimensional, log-normal random input data. All tests show a significant speed-up of the fast QMC matrix method over a conventional implementation as the dimension becomes large.
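
    The structural fact behind the $\mathcal{O}(N \log N)$ count is that, after a suitable reordering of the quadrature points, $Y$ (or a large block of it) is circulant, so the product $Y \boldsymbol{a}^\top$ reduces to circular convolution. A minimal sketch of that primitive (our illustration, assuming nothing beyond NumPy; not the paper's code):

        import numpy as np

        def circulant_matvec(c, x):
            # Multiply the circulant matrix C with first column c by x in
            # O(N log N): C = F^{-1} diag(F c) F for the Fourier matrix F,
            # so C @ x is the circular convolution of c and x.
            return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

        # Check against the explicitly assembled circulant matrix.
        rng = np.random.default_rng(1)
        N = 8
        c, x = rng.random(N), rng.random(N)
        C = c[(np.arange(N)[:, None] - np.arange(N)[None, :]) % N]
        assert np.allclose(circulant_matvec(c, x), C @ x)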