18,726 research outputs found
On fast multiplication of a matrix by its transpose
We present a non-commutative algorithm for the multiplication of a
2x2-block-matrix by its transpose using 5 block products (3 recursive calls and
2 general products) over C or any finite field.We use geometric considerations
on the space of bilinear forms describing 2x2 matrix products to obtain this
algorithm and we show how to reduce the number of involved additions.The
resulting algorithm for arbitrary dimensions is a reduction of multiplication
of a matrix by its transpose to general matrix product, improving by a constant
factor previously known reductions.Finally we propose schedules with low memory
footprint that support a fast and memory efficient practical implementation
over a finite field.To conclude, we show how to use our result in LDLT
factorization.Comment: ISSAC 2020, Jul 2020, Kalamata, Greec
On fast multiplication of a matrix by its transpose
We present a non-commutative algorithm for the multiplication of a block-matrix by its transpose over C or any finite field using 5 recursive products. We use geometric considerations on the space of bilinear forms describing 2×2 matrix products to obtain this algorithm and we show how to reduce the number of involved additions. The resulting algorithm for arbitrary dimensions is a reduction of multiplication of a matrix by its transpose to general matrix product, improving by a constant factor previously known reductions. Finally we propose space and time efficient schedules that enable us to provide fast practical implementations for higher-dimensional matrix products
Tensor Transpose and Its Properties
Tensor transpose is a higher order generalization of matrix transpose. In
this paper, we use permutations and symmetry group to define? the tensor
transpose. Then we discuss the classification and composition of tensor
transposes. Properties of tensor transpose are studied in relation to tensor
multiplication, tensor eigenvalues, tensor decompositions and tensor rank
A pseudospectral matrix method for time-dependent tensor fields on a spherical shell
We construct a pseudospectral method for the solution of time-dependent,
non-linear partial differential equations on a three-dimensional spherical
shell. The problem we address is the treatment of tensor fields on the sphere.
As a test case we consider the evolution of a single black hole in numerical
general relativity. A natural strategy would be the expansion in tensor
spherical harmonics in spherical coordinates. Instead, we consider the simpler
and potentially more efficient possibility of a double Fourier expansion on the
sphere for tensors in Cartesian coordinates. As usual for the double Fourier
method, we employ a filter to address time-step limitations and certain
stability issues. We find that a tensor filter based on spin-weighted spherical
harmonics is successful, while two simplified, non-spin-weighted filters do not
lead to stable evolutions. The derivatives and the filter are implemented by
matrix multiplication for efficiency. A key technical point is the construction
of a matrix multiplication method for the spin-weighted spherical harmonic
filter. As example for the efficient parallelization of the double Fourier,
spin-weighted filter method we discuss an implementation on a GPU, which
achieves a speed-up of up to a factor of 20 compared to a single core CPU
implementation.Comment: 33 pages, 9 figure
Practical Sparse Matrices in C++ with Hybrid Storage and Template-Based Expression Optimisation
Despite the importance of sparse matrices in numerous fields of science,
software implementations remain difficult to use for non-expert users,
generally requiring the understanding of underlying details of the chosen
sparse matrix storage format. In addition, to achieve good performance, several
formats may need to be used in one program, requiring explicit selection and
conversion between the formats. This can be both tedious and error-prone,
especially for non-expert users. Motivated by these issues, we present a
user-friendly and open-source sparse matrix class for the C++ language, with a
high-level application programming interface deliberately similar to the widely
used MATLAB language. This facilitates prototyping directly in C++ and aids the
conversion of research code into production environments. The class internally
uses two main approaches to achieve efficient execution: (i) a hybrid storage
framework, which automatically and seamlessly switches between three underlying
storage formats (compressed sparse column, Red-Black tree, coordinate list)
depending on which format is best suited and/or available for specific
operations, and (ii) a template-based meta-programming framework to
automatically detect and optimise execution of common expression patterns.
Empirical evaluations on large sparse matrices with various densities of
non-zero elements demonstrate the advantages of the hybrid storage framework
and the expression optimisation mechanism.Comment: extended and revised version of an earlier conference paper
arXiv:1805.0338
Practical Sparse Matrices in C++ with Hybrid Storage and Template-Based Expression Optimisation
Despite the importance of sparse matrices in numerous fields of science,
software implementations remain difficult to use for non-expert users,
generally requiring the understanding of underlying details of the chosen
sparse matrix storage format. In addition, to achieve good performance, several
formats may need to be used in one program, requiring explicit selection and
conversion between the formats. This can be both tedious and error-prone,
especially for non-expert users. Motivated by these issues, we present a
user-friendly and open-source sparse matrix class for the C++ language, with a
high-level application programming interface deliberately similar to the widely
used MATLAB language. This facilitates prototyping directly in C++ and aids the
conversion of research code into production environments. The class internally
uses two main approaches to achieve efficient execution: (i) a hybrid storage
framework, which automatically and seamlessly switches between three underlying
storage formats (compressed sparse column, Red-Black tree, coordinate list)
depending on which format is best suited and/or available for specific
operations, and (ii) a template-based meta-programming framework to
automatically detect and optimise execution of common expression patterns.
Empirical evaluations on large sparse matrices with various densities of
non-zero elements demonstrate the advantages of the hybrid storage framework
and the expression optimisation mechanism.Comment: extended and revised version of an earlier conference paper
arXiv:1805.0338
- …