6 research outputs found

    On fast multiplication of a matrix by its transpose

    We present a non-commutative algorithm for the multiplication of a 2×2-block matrix by its transpose using 5 block products (3 recursive calls and 2 general products) over C or any finite field. We use geometric considerations on the space of bilinear forms describing 2×2 matrix products to obtain this algorithm, and we show how to reduce the number of additions involved. The resulting algorithm for arbitrary dimensions is a reduction of the multiplication of a matrix by its transpose to general matrix product, improving on previously known reductions by a constant factor. Finally, we propose schedules with a low memory footprint that support a fast and memory-efficient practical implementation over a finite field. To conclude, we show how to use our result in LDL^T factorization. Comment: ISSAC 2020, Jul 2020, Kalamata, Greece
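    For intuition, the structural property being exploited can be sketched in a few lines: the Gram product C = A·Aᵀ is symmetric, so only the upper triangle needs to be computed explicitly. This is merely the naive triangular computation, not the authors' 5-product block algorithm; the function name `gram_upper` is my own.

    ```python
    import numpy as np

    def gram_upper(A):
        """Compute C = A @ A.T exploiting symmetry: only the upper
        triangle is computed explicitly, then mirrored into the
        lower triangle. Illustrative only."""
        n = A.shape[0]
        C = np.empty((n, n))
        for i in range(n):
            for j in range(i, n):
                C[i, j] = A[i] @ A[j]  # inner product of rows i and j
                C[j, i] = C[i, j]      # C is symmetric
        return C
    ```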

    Fault-Tolerant Strassen-Like Matrix Multiplication

    In this study, we propose a simple method for fault-tolerant Strassen-like matrix multiplication. The proposed method is based on using two distinct Strassen-like algorithms instead of replicating a given one. We have observed that using two different algorithms gives rise to new check relations, resulting in more local computations. These local computations are found using a computer-aided search. To improve performance, special parity (extra) sub-matrix multiplications (PSMMs) are generated (two of them) at the expense of increased communication/computation cost of the system. Our preliminary results demonstrate that the proposed method outperforms a Strassen-like algorithm with two copies and achieves performance very close to the three-copy version using only 2 PSMMs, reducing the total number of compute nodes by around 24%, i.e., from 21 to 16. Comment: 6 pages, 2 figures
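    As background on check relations in general, the classic checksum approach to algorithm-based fault tolerance (in the Huang-Abraham style) can be sketched as follows. This is not the paper's Strassen-specific construction with PSMMs, which is more elaborate; the function name `checked_matmul` is hypothetical.

    ```python
    import numpy as np

    def checked_matmul(A, B, tol=1e-9):
        """Checksum-protected matrix product: augment A with a row of
        column sums and B with a column of row sums, multiply, then
        verify that the checksum row/column of the result still equal
        the sums of the result body. Returns (product, checks_passed)."""
        Ac = np.vstack([A, A.sum(axis=0)])
        Br = np.hstack([B, B.sum(axis=1, keepdims=True)])
        C = Ac @ Br                   # full product carries the checksums
        body = C[:-1, :-1]
        # Check relations: checksum row/column must match sums of the body.
        row_ok = np.allclose(C[-1, :-1], body.sum(axis=0), atol=tol)
        col_ok = np.allclose(C[:-1, -1], body.sum(axis=1), atol=tol)
        return body, row_ok and col_ok
    ```

    A single corrupted entry of the body breaks one row check and one column check, locating the fault.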

    Faster Walsh-Hadamard Transform and Matrix Multiplication over Finite Fields using Lookup Tables

    We use lookup tables to design faster algorithms for important algebraic problems over finite fields. These faster algorithms, which only use arithmetic operations and lookup table operations, may help to explain the difficulty of determining the complexities of these important problems. Our results over a constant-sized finite field are as follows. The Walsh-Hadamard transform of a vector of length $N$ can be computed using $O(N \log N / \log \log N)$ bit operations. This generalizes to any transform defined as a Kronecker power of a fixed matrix. By comparison, the Fast Walsh-Hadamard transform (similar to the Fast Fourier transform) uses $O(N \log N)$ arithmetic operations, which is believed to be optimal up to constant factors. Any algebraic algorithm for multiplying two $N \times N$ matrices using $O(N^\omega)$ operations can be converted into an algorithm using $O(N^\omega / (\log N)^{\omega/2 - 1})$ bit operations. For example, Strassen's algorithm can be converted into an algorithm using $O(N^{2.81} / (\log N)^{0.4})$ bit operations. It remains an open problem with practical implications to determine the smallest constant $c$ such that Strassen's algorithm can be implemented to use $c \cdot N^{2.81} + o(N^{2.81})$ arithmetic operations; using a lookup table allows one to save a super-constant factor in bit operations. Comment: 10 pages, to appear in the 6th Symposium on Simplicity in Algorithms (SOSA 2023)
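    For reference, the baseline $O(N \log N)$ Fast Walsh-Hadamard transform mentioned in the abstract can be sketched as a plain in-place butterfly over a power-of-two length (this is the classical algorithm, not the paper's lookup-table variant):

    ```python
    def fwht(v):
        """Fast Walsh-Hadamard transform of a list whose length is a
        power of two, using O(N log N) additions/subtractions."""
        v = list(v)
        n = len(v)
        h = 1
        while h < n:
            # Butterfly pass: combine elements h apart.
            for i in range(0, n, 2 * h):
                for j in range(i, i + h):
                    x, y = v[j], v[j + h]
                    v[j], v[j + h] = x + y, x - y
            h *= 2
        return v
    ```

    For example, transforming a unit impulse spreads it uniformly, and transforming the all-ones vector concentrates it back (up to the scale factor $N$).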

    On fast multiplication of a matrix by its transpose

    We present a non-commutative algorithm for the multiplication of a block matrix by its transpose over C or any finite field using 5 recursive products. We use geometric considerations on the space of bilinear forms describing 2×2 matrix products to obtain this algorithm, and we show how to reduce the number of additions involved. The resulting algorithm for arbitrary dimensions is a reduction of the multiplication of a matrix by its transpose to general matrix product, improving on previously known reductions by a constant factor. Finally, we propose space- and time-efficient schedules that enable us to provide fast practical implementations for higher-dimensional matrix products.