The border support rank of two-by-two matrix multiplication is seven
We show that the border support rank of the tensor corresponding to
two-by-two matrix multiplication is seven over the complex numbers. We do this
by constructing two polynomials that vanish on all complex tensors with format
four-by-four-by-four and border rank at most six, but that do not vanish
simultaneously on any tensor with the same support as the two-by-two matrix
multiplication tensor. This extends the work of Hauenstein, Ikenmeyer, and
Landsberg. We also give two proofs that the support rank of the two-by-two
matrix multiplication tensor is seven over any field: one proof using a result
of De Groote saying that the decomposition of this tensor is unique up to
sandwiching, and another proof via the substitution method. These results
answer a question asked by Cohn and Umans. Studying the border support rank of
the matrix multiplication tensor is relevant for the design of matrix
multiplication algorithms, because upper bounds on the border support rank of
the matrix multiplication tensor lead to upper bounds on the computational
complexity of matrix multiplication, via a construction of Cohn and Umans.
Moreover, support rank has applications in quantum communication complexity.
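For orientation, the following gloss (added here; the notation is ours, not quoted from the abstract) makes the object and the notion of support rank explicit. The two-by-two matrix multiplication tensor is
    $\langle 2,2,2 \rangle = \sum_{i,j,k=1}^{2} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{C}^{4} \otimes \mathbb{C}^{4} \otimes \mathbb{C}^{4},$
a tensor of format four-by-four-by-four with exactly eight nonzero coordinates. The support rank of a tensor $T$ is the minimal rank of any tensor $T'$ whose set of nonzero coordinates (its support) coincides with that of $T$; border support rank is defined analogously with rank replaced by border rank. The result above therefore says that seven rank-one terms suffice, and are necessary, even when the eight structural entries of the tensor may be rescaled by arbitrary nonzero constants.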
Strong Scaling of Matrix Multiplication Algorithms and Memory-Independent Communication Lower Bounds
A parallel algorithm has perfect strong scaling if its running time on P
processors is linear in 1/P, including all communication costs.
Distributed-memory parallel algorithms for matrix multiplication with perfect
strong scaling have only recently been found. One is based on classical matrix
multiplication (Solomonik and Demmel, 2011), and one is based on Strassen's
fast matrix multiplication (Ballard, Demmel, Holtz, Lipshitz, and Schwartz,
2012). Both algorithms scale perfectly, but only up to some number of
processors where the inter-processor communication no longer scales.
We obtain a memory-independent communication cost lower bound on classical
and Strassen-based distributed-memory matrix multiplication algorithms. These
bounds imply that no classical or Strassen-based parallel matrix multiplication
algorithm can strongly scale perfectly beyond the ranges already attained by
the two parallel algorithms mentioned above. The memory-independent bounds and
the strong scaling bounds generalize to other algorithms.
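As a rough illustration (the notation and exponents below are recalled here for context, not quoted from the abstract), a memory-independent bound limits strong scaling as follows. For classical multiplication of $n \times n$ matrices on $P$ processors, the bandwidth cost (words communicated by some processor) satisfies a bound of the form
    $W(P) = \Omega\!\left(n^{2} / P^{2/3}\right),$
so the running time is at least $\Omega\!\left(n^{3}/P + n^{2}/P^{2/3}\right)$. Because the communication term decays like $P^{-2/3}$ rather than $P^{-1}$, the total time cannot keep scaling in proportion to $1/P$ once $P$ grows past roughly $n^{3}/M^{3/2}$ for local memory size $M$; an analogous bound with exponent $2/\omega_{0}$, $\omega_{0} = \log_{2} 7$, constrains Strassen-based algorithms.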
Optical implementation of systolic array processing
Algorithms for matrix-vector multiplication are implemented using acousto-optic cells for multiplication and input data transfer, and charge-coupled device (CCD) detector arrays for accumulation and output of the results. No two-dimensional matrix mask is required; matrix changes are implemented electronically. A system for multiplying a 50-component nonnegative real vector by a 50-by-50 nonnegative real matrix is described. Modifications for bipolar real and complex-valued processing are possible, as are extensions to matrix-matrix multiplication and multiplication of a vector by multiple matrices.
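As a purely digital sketch of the data flow just described (the function name and structure are illustrative, not taken from the paper), the systolic scheme streams one vector component into the multiplier stage per time step while output-side accumulators, playing the role of the CCD detector array, build up the result:

import numpy as np

def systolic_matvec(M, x):
    # Compute y = M @ x one time step at a time: at step t the component
    # x[t] enters the multiplier stage (the acousto-optic cell in the
    # optical system) and every output accumulator (one detector element
    # per matrix row) adds its partial product.
    n_rows, n_cols = M.shape
    y = np.zeros(n_rows)
    for t in range(n_cols):
        for i in range(n_rows):
            y[i] += M[i, t] * x[t]
    return y

# Example at the scale mentioned in the abstract: a 50-component
# nonnegative vector times a 50-by-50 nonnegative matrix.
M = np.random.rand(50, 50)
x = np.random.rand(50)
assert np.allclose(systolic_matvec(M, x), M @ x)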
Fast QMC matrix-vector multiplication
Quasi-Monte Carlo (QMC) rules $\frac{1}{N}\sum_{n=0}^{N-1} f(\boldsymbol{y}_n A)$
can be used to approximate integrals of the form $\int_{[0,1]^s} f(\boldsymbol{y} A)\,\mathrm{d}\boldsymbol{y}$, where $A$ is a matrix and
$\boldsymbol{y}$ is a row vector. This type of integral arises, for example, from
the simulation of a normal distribution with a general covariance matrix, from
the approximation of the expectation value of solutions of PDEs with random
coefficients, or from applications in statistics. In this paper we design QMC
quadrature points $\boldsymbol{y}_0, \ldots, \boldsymbol{y}_{N-1} \in [0,1]^s$
such that for the matrix $Y$ whose rows are the quadrature points, one can
use the fast Fourier transform to compute the matrix-vector product
$Y\boldsymbol{a}^{\top}$, $\boldsymbol{a} \in \mathbb{R}^s$, in
$\mathcal{O}(N \log N)$ operations and at most $s-1$ extra additions. The
proposed method can be applied to lattice rules, polynomial lattice rules and
a certain type of Korobov $p$-set.
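To make the role of the FFT concrete, here is a minimal sketch (our own, with hypothetical names) of the underlying mechanism, under the assumption that the quadrature points can be reordered so that the resulting matrix is circulant, as happens for lattice rules: a circulant matrix-vector product is a circular convolution, which the FFT evaluates in $\mathcal{O}(N \log N)$ operations.

import numpy as np

def circulant_matvec(c, a):
    # Product C @ a for the circulant matrix C with first column c,
    # i.e. C[i, j] = c[(i - j) mod N]. This is a circular convolution,
    # so three FFTs evaluate it in O(N log N) time.
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(a)))

# Check against the explicitly assembled circulant matrix.
N = 16
c = np.random.rand(N)
a = np.random.rand(N)
C = np.column_stack([np.roll(c, j) for j in range(N)])
assert np.allclose(C @ a, circulant_matvec(c, a))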
The approach is illustrated computationally by three numerical experiments.
The first test considers the generation of points with normal distribution and
general covariance matrix, the second test applies QMC to high-dimensional,
affine-parametric, elliptic partial differential equations with uniformly
distributed random coefficients, and the third test addresses finite element
discretizations of elliptic partial differential equations with
high-dimensional, log-normal random input data. All numerical tests show a
significant speed-up of the computation times of the fast QMC matrix method
compared to a conventional implementation as the dimension becomes large.