Quantum and approximation algorithms for maximum witnesses of Boolean matrix products
The problem of finding maximum (or minimum) witnesses of the Boolean product
of two Boolean matrices (MW for short) has a number of important applications,
in particular the all-pairs lowest common ancestor (LCA) problem in directed
acyclic graphs (dags). The best known upper time bound for the MW problem on
n\times n Boolean matrices, O(n^{2.575}), has not been substantially
improved since 2006. In order to obtain faster algorithms for this problem, we
study quantum algorithms for MW and approximation algorithms for MW (in the
standard computational model). Some of our quantum algorithms are input or
output sensitive. Our fastest quantum algorithm for the MW problem, and
consequently for the related problems, runs in time
\tilde{O}(n^{2+\lambda/2})=\tilde{O}(n^{2.434}), where \lambda satisfies the
equation \omega(1, \lambda, 1) = 1 + 1.5 \, \lambda and \omega(1, \lambda, 1)
is the exponent of the multiplication of an n \times n^{\lambda} matrix by an
n^{\lambda} \times n matrix. Next, we consider a relaxed version of the MW
problem (in the standard model) asking for reporting a witness of bounded rank
(the maximum witness has rank 1) for each non-zero entry of the matrix product.
First, by adapting the fastest known algorithm for maximum witnesses, we obtain
an algorithm for the relaxed problem that reports for each non-zero entry of
the product matrix a witness of rank at most \ell in time
\tilde{O}((n/\ell)n^{\omega(1,\log_n \ell,1)}). Then, by reducing the relaxed
problem to the so-called k-witness problem, we provide an algorithm that
reports for each non-zero entry C[i,j] of the product matrix C a witness of
rank O(\lceil W_C(i,j)/k\rceil) with high probability, where W_C(i,j) is the
number of witnesses for C[i,j]. The algorithm runs in
\tilde{O}(n^{\omega}k^{0.4653} + n^2 k) time, where \omega = \omega(1,1,1).
Comment: 14 pages, 3 figures
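To make the object under study concrete, here is a minimal brute-force sketch of maximum witnesses (the function name and 0/1-list matrix encoding are our own; the abstract's algorithms are far more sophisticated than this cubic-time illustration):

```python
def max_witnesses(A, B):
    """Brute-force maximum witnesses of the Boolean product C = A * B.

    A witness for C[i][j] is an index k with A[i][k] = B[k][j] = 1;
    the maximum witness is the largest such k.  This runs in O(n^3)
    time, far above the bounds discussed in the abstract; it only
    illustrates the definition.
    """
    n = len(A)
    W = [[None] * n for _ in range(n)]  # None marks a zero entry of C
    for i in range(n):
        for j in range(n):
            for k in range(n - 1, -1, -1):  # scan from the largest index
                if A[i][k] and B[k][j]:
                    W[i][j] = k
                    break
    return W
```

A witness of rank r, as in the relaxed problem, would be the r-th largest such k; the maximum witness is exactly the rank-1 case.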
Computational linear algebra over finite fields
We present algorithms for the efficient solution of linear algebra
problems over finite fields.
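As a concrete instance of such a problem, the sketch below computes matrix rank over GF(p) by plain Gaussian elimination (a textbook illustration of our own, not the abstract's algorithms, which are not specified here); field inverses come from Fermat's little theorem:

```python
def rank_gf(M, p):
    """Rank of matrix M over the finite field GF(p), for prime p.

    Straightforward Gaussian elimination; the inverse of a nonzero
    pivot x is computed as x^(p-2) mod p (Fermat's little theorem).
    """
    M = [[x % p for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c]), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        inv = pow(M[r][c], p - 2, p)           # modular inverse of pivot
        M[r] = [(x * inv) % p for x in M[r]]   # normalize pivot row
        for i in range(rows):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r
```

Note that a matrix can lose rank modulo p while having full rank over the rationals, e.g. [[1, 2], [3, 4]] has rank 1 over GF(2) but rank 2 over GF(5).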
An \tilde{O} Time Matrix Multiplication Algorithm
We show that, for input vectors and , where the 's and 's are real numbers,
after \tilde{O} time preprocessing for each of them, the vector multiplication
can be computed in \tilde{O} time. This enables the product of two matrices
to be computed in \tilde{O} time.
Comment: Version 11 and Version 12, Section 2, laid the foundation of this
algorithm but left a problem unresolved. This version corrects the problem in
Version 11 and Section 2 of Version 1
Graph Expansion and Communication Costs of Fast Matrix Multiplication
The communication cost of algorithms (also known as I/O-complexity) is shown
to be closely related to the expansion properties of the corresponding
computation graphs. We demonstrate this on Strassen's and other fast matrix
multiplication algorithms, and obtain the first lower bounds on their
communication costs.
In the sequential case, where the processor has a fast memory of size M,
too small to store three n-by-n matrices, the lower bound on the number of
words moved between fast and slow memory is, for many of the matrix
multiplication algorithms, \Omega((n/\sqrt{M})^{\omega_0} \cdot M),
where \omega_0 is the exponent in the arithmetic count (e.g., \omega_0 = \log_2 7
for Strassen's algorithm, and \omega_0 = 3 for conventional matrix multiplication).
With P parallel processors, each with fast memory of size M, the lower
bound is P times smaller.
These bounds are attainable both for sequential and for parallel algorithms,
and hence are optimal. These bounds can also be attained by many fast
algorithms in linear algebra (e.g., algorithms for LU, QR, and for solving
the Sylvester equation).
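As a back-of-the-envelope illustration, assuming the bound takes the form \Omega((n/\sqrt{M})^{\omega_0} \cdot M) as in this line of work, the sketch below (function name and parameter values are our own, and the constant hidden in the \Omega is ignored) compares the communication floors implied by the Strassen and conventional exponents:

```python
import math

def io_lower_bound(n, M, omega0):
    """Order of magnitude of the communication lower bound
    (n / sqrt(M))^omega0 * M, ignoring the constant in the Omega."""
    return (n / math.sqrt(M)) ** omega0 * M

# Hypothetical sizes: 4096 x 4096 matrices, fast memory of 1024 words.
n, M = 2 ** 12, 2 ** 10
strassen = io_lower_bound(n, M, math.log2(7))  # omega0 ~ 2.807
classical = io_lower_bound(n, M, 3.0)
# Strassen's smaller arithmetic exponent also lowers the communication floor.
assert strassen < classical
```

The comparison shows why the result is interesting: the lower bound tracks the algorithm's own arithmetic exponent, so asymptotically faster algorithms are also permitted (and, per the abstract, able) to communicate less.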
Faster all-pairs shortest paths via circuit complexity
We present a new randomized method for computing the min-plus product
(a.k.a. tropical product) of two n \times n matrices, yielding a faster
algorithm for solving the all-pairs shortest path problem (APSP) in dense
n-node directed graphs with arbitrary edge weights. On the real RAM, where
additions and comparisons of reals are unit cost (but all other operations have
typical logarithmic cost), the algorithm runs in n^3/2^{\Omega(\log n)^{1/2}}
time and is correct with high probability.
On the word RAM, the algorithm runs in n^3/2^{\Omega(\log n)^{1/2}} + n^{2+o(1)}
time for edge weights in ([0, M] \cap \mathbb{Z}) \cup \{\infty\}. Prior
algorithms used either n^3/\log^c n time for various c, or M^a n^b time for
various a and b.
The new algorithm applies a tool from circuit complexity, namely the
Razborov-Smolensky polynomials for approximately representing AC^0[2]
circuits, to efficiently reduce a matrix product over the (\min,+) algebra to
a relatively small number of rectangular matrix products over F_2,
each of which are computable using a particularly efficient method due to
Coppersmith. We also give a deterministic version of the algorithm running in
n^3/2^{\log^{\delta} n} time for some \delta > 0, which utilizes the
Yao-Beigel-Tarui translation of ACC circuits into "nice" depth-two
circuits.
Comment: 24 pages. Updated version now has slightly faster running time. To
appear in ACM Symposium on Theory of Computing (STOC), 2014
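For reference, the (min,+) product this abstract speeds up looks as follows in its naive cubic form (function name and toy graph are our own illustration, not the paper's algorithm):

```python
def min_plus(A, B):
    """Naive (min,+) product: C[i][j] = min over k of A[i][k] + B[k][j].

    Cubic time; the abstract's randomized method is asymptotically
    faster than any fixed polylog savings over this.
    """
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Edge weights of a 3-node directed graph; float('inf') encodes "no edge".
INF = float('inf')
D = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
D2 = min_plus(D, D)  # shortest-path distances using at most 2 edges
```

Squaring the weighted adjacency matrix O(\log n) times in this algebra yields all-pairs shortest-path distances, which is why faster min-plus products translate directly into faster APSP.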