
    Faster all-pairs shortest paths via circuit complexity

    We present a new randomized method for computing the min-plus product (a.k.a. tropical product) of two $n \times n$ matrices, yielding a faster algorithm for solving the all-pairs shortest path problem (APSP) in dense $n$-node directed graphs with arbitrary edge weights. On the real RAM, where additions and comparisons of reals are unit cost (but all other operations have typical logarithmic cost), the algorithm runs in time $\frac{n^3}{2^{\Omega(\log n)^{1/2}}}$ and is correct with high probability. On the word RAM, the algorithm runs in $n^3/2^{\Omega(\log n)^{1/2}} + n^{2+o(1)}\log M$ time for edge weights in $([0,M] \cap \mathbb{Z}) \cup \{\infty\}$. Prior algorithms used either $n^3/(\log^c n)$ time for various $c \leq 2$, or $O(M^{\alpha} n^{\beta})$ time for various $\alpha > 0$ and $\beta > 2$. The new algorithm applies a tool from circuit complexity, namely the Razborov-Smolensky polynomials for approximately representing ${\sf AC}^0[p]$ circuits, to efficiently reduce a matrix product over the $(\min,+)$ algebra to a relatively small number of rectangular matrix products over $\mathbb{F}_2$, each of which is computable using a particularly efficient method due to Coppersmith. We also give a deterministic version of the algorithm running in $n^3/2^{\log^{\delta} n}$ time for some $\delta > 0$, which utilizes the Yao-Beigel-Tarui translation of ${\sf AC}^0[m]$ circuits into "nice" depth-two circuits.
    Comment: 24 pages. Updated version now has slightly faster running time. To appear in ACM Symposium on Theory of Computing (STOC), 2014.
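    For context, here is a minimal Python sketch of the textbook cubic-time baseline that this result improves on: the $(\min,+)$ product of two $n \times n$ matrices, with APSP obtained by repeated $(\min,+)$ squaring of the weight matrix. This illustrates the problem being solved, not the paper's algorithm; the function names and the use of float('inf') for missing edges are assumptions for illustration.

        # Textbook O(n^3) (min,+) product; float('inf') marks a missing edge.
        def min_plus_product(A, B):
            """C[i][j] = min over k of (A[i][k] + B[k][j])."""
            n = len(A)
            C = [[float('inf')] * n for _ in range(n)]
            for i in range(n):
                for j in range(n):
                    best = float('inf')
                    for k in range(n):
                        s = A[i][k] + B[k][j]
                        if s < best:
                            best = s
                    C[i][j] = best
            return C

        # APSP by repeated (min,+) squaring; assumes W[i][i] == 0 and no negative cycles.
        def apsp(W):
            n = len(W)
            D = [row[:] for row in W]
            length = 1
            while length < n - 1:
                D = min_plus_product(D, D)
                length *= 2
            return D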

    Improving Distributed Gradient Descent Using Reed-Solomon Codes

    Today's massively sized datasets often make it necessary to perform computations on them in a distributed manner. In principle, a computational task is divided into subtasks which are distributed over a cluster operated by a taskmaster. One issue faced in practice is the delay incurred due to the presence of slow machines, known as \emph{stragglers}. Several schemes, including those based on replication, have been proposed in the literature to mitigate the effects of stragglers, and more recently those inspired by coding theory have begun to gain traction. In this work, we consider a distributed gradient descent setting suitable for a wide class of machine learning problems. We adapt the framework of Tandon et al. (arXiv:1612.03301) and present a deterministic scheme that, for a prescribed per-machine computational effort, recovers the gradient from the least number of machines $f$ theoretically permissible, via an $O(f^2)$ decoding algorithm. We also provide a theoretical delay model which can be used to minimize the expected waiting time per computation by optimally choosing the parameters of the scheme. Finally, we supplement our theoretical findings with numerical results that demonstrate the efficacy of the method and its advantages over competing schemes.
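    As a toy illustration of the generic gradient-coding framework that this scheme instantiates, the NumPy sketch below shows the decoding principle: each worker sends one linear combination of its partial gradients, and the master recovers the full gradient from any large-enough subset of workers by solving a small linear system. The encoding matrix B and all names here are hypothetical stand-ins; this is not the paper's Reed-Solomon construction or its $O(f^2)$ decoder.

        import numpy as np

        # Assumed toy setup: k = 3 data partitions, n = 3 workers, tolerate s = 1 straggler.
        # Row i of B holds worker i's coefficients over the partial gradients g_1..g_3.
        B = np.array([[0.5, 1.0,  0.0],
                      [0.0, 1.0, -1.0],
                      [0.5, 0.0,  1.0]])

        rng = np.random.default_rng(0)
        g = rng.standard_normal((3, 4))    # partial gradients g_1..g_3, each of dimension 4
        worker_outputs = B @ g             # the single vector each worker sends to the master

        def decode(survivors):
            """Recover sum_i g_i from any 2 surviving workers by solving a^T B_S = 1^T."""
            B_S = B[survivors]
            a, *_ = np.linalg.lstsq(B_S.T, np.ones(B.shape[1]), rcond=None)
            return a @ worker_outputs[survivors]

        full_gradient = g.sum(axis=0)
        for survivors in ([0, 1], [0, 2], [1, 2]):
            assert np.allclose(decode(survivors), full_gradient)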

    Faster Sparse Matrix Inversion and Rank Computation in Finite Fields

    We improve the best known running time for inverting sparse matrices over finite fields, lowering it to an expected $O(n^{2.2131})$ time for the current values of fast rectangular matrix multiplication. We achieve the same running time for computing the rank and nullspace of a sparse matrix over a finite field. This improvement relies on two key techniques. First, we adopt the decomposition of an arbitrary matrix into block Krylov and Hankel matrices from Eberly et al. (ISSAC 2007). Second, we show how to recover the explicit inverse of a block Hankel matrix using low displacement rank techniques for structured matrices and fast rectangular matrix multiplication algorithms. We generalize our inversion method to block structured matrices with other displacement operators and strengthen the best known upper bounds for explicit inversion of block Toeplitz-like and block Hankel-like matrices, as well as for explicit inversion of block Vandermonde-like matrices with structured blocks. As a further application, we improve the complexity of several algorithms in topological data analysis and in finite group theory.
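    To make the first ingredient concrete, here is a small illustrative sketch of the block Krylov/Hankel idea: project powers of the matrix A against random blocks U and V over a prime field and arrange the resulting sequence into a block Hankel matrix. All parameters and names are assumptions for illustration; the inversion of the Hankel matrix via displacement-rank techniques, which is the paper's contribution, is not shown.

        import numpy as np

        p = 101                                  # a small prime field F_p (illustrative choice)
        n, s, m = 12, 3, 8                       # matrix size, block size, sequence length (assumed)

        rng = np.random.default_rng(1)
        A = rng.integers(0, p, size=(n, n))      # stand-in for a sparse matrix over F_p
        U = rng.integers(0, p, size=(n, s))
        V = rng.integers(0, p, size=(n, s))

        # Block Krylov sequence a_i = U^T A^i V (mod p), for i = 0..m-1.
        seq = []
        W = V.copy()
        for _ in range(m):
            seq.append((U.T @ W) % p)
            W = (A @ W) % p

        # Block Hankel matrix whose (i, j) block is a_{i+j}.
        half = m // 2
        H = np.block([[seq[i + j] for j in range(half)] for i in range(half)])
        print(H.shape)                           # (half * s, half * s)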

    Bit Complexity of Jordan Normal Form and Spectral Factorization

    We study the bit complexity of two related fundamental computational problems in linear algebra and control theory. Our results are: (1) An $\tilde{O}(n^{\omega+3}a + n^4a^2 + n^\omega\log(1/\epsilon))$ time algorithm for finding an $\epsilon$-approximation to the Jordan Normal form of an integer matrix with $a$-bit entries, where $\omega$ is the exponent of matrix multiplication. (2) An $\tilde{O}(n^6d^6a + n^4d^4a^2 + n^3d^3\log(1/\epsilon))$ time algorithm for $\epsilon$-approximately computing the spectral factorization $P(x) = Q^*(x)Q(x)$ of a given monic $n \times n$ rational matrix polynomial of degree $2d$ with rational $a$-bit coefficients having $a$-bit common denominators, which satisfies $P(x) \succeq 0$ for all real $x$. The first algorithm is used as a subroutine in the second one. Despite the central importance of these problems, polynomial complexity bounds were not previously known for spectral factorization, and for Jordan form the best previously known running time was an unspecified polynomial in $n$ of degree at least twelve \cite{cai1994computing}. Our algorithms are simple and judiciously combine techniques from numerical and symbolic computation, yielding significant advantages over either approach by itself.
    Comment: 19p
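    For a concrete, exact instance of the first problem, the short SymPy sketch below computes the Jordan form of a small integer matrix symbolically; it only illustrates what is being computed, not the paper's approximation algorithm or its bit-complexity analysis. The example matrix is an assumption chosen so that the Jordan form has a nontrivial block.

        from sympy import Matrix

        # [[1, 1], [-1, 3]] has the single eigenvalue 2 but is not diagonalizable,
        # so its Jordan form is one 2x2 Jordan block.
        M = Matrix([[1, 1],
                    [-1, 3]])

        P, J = M.jordan_form()        # returns P, J with M = P * J * P**-1
        assert M == P * J * P.inv()
        print(J)                      # Matrix([[2, 1], [0, 2]])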