2 research outputs found
A Simple and Efficient Tensor Calculus for Machine Learning
Computing derivatives of tensor expressions, also known as tensor calculus,
is a fundamental task in machine learning. A key concern is the efficiency of
evaluating the expressions and their derivatives, which hinges on the
representation of these expressions. Recently, an algorithm for computing
higher-order derivatives of tensor expressions, such as Jacobians or Hessians, has
been introduced that is a few orders of magnitude faster than previous
state-of-the-art approaches. Unfortunately, the approach is based on Ricci
notation and hence cannot be incorporated into deep-learning automatic
differentiation frameworks such as TensorFlow, PyTorch, autograd, or JAX, which
use the simpler Einstein notation. This leaves two options: either change
the underlying tensor representation in these frameworks, or develop a new,
provably correct algorithm based on Einstein notation. Obviously, the first
option is impractical. Hence, we pursue the second option. Here, we show that
using Ricci notation is not necessary for an efficient tensor calculus and
develop an equally efficient method for the simpler Einstein notation. It turns
out that switching to Einstein notation enables further improvements that lead to
even better efficiency.
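As a minimal, hedged sketch (not the paper's algorithm): the kind of object the abstract refers to is a tensor expression written in Einstein notation, here via jnp.einsum, whose higher-order derivatives are obtained with a framework's standard automatic differentiation. The function f, matrix A, and vector x below are illustrative choices, not objects from the paper.

import jax
import jax.numpy as jnp

def f(x, A):
    # Quadratic form x^T A x written in Einstein notation:
    # contract index i of x with the rows of A, and index j with x again.
    return jnp.einsum("i,ij,j->", x, A, x)

A = jnp.array([[2.0, 1.0], [1.0, 3.0]])
x = jnp.array([1.0, -1.0])

grad_f = jax.grad(f)(x, A)      # first-order derivative (gradient), shape (2,)
hess_f = jax.hessian(f)(x, A)   # second-order derivative (Hessian), shape (2, 2)
print(grad_f)   # equals (A + A.T) @ x
print(hess_f)   # equals A + A.T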
The methods described in this paper have been implemented in the
online tool www.MatrixCalculus.org for computing derivatives of matrix and
tensor expressions.
An extended abstract of this paper appeared as "A Simple and Efficient Tensor
Calculus", AAAI 2020
Matrix moments of the diffusion tensor distribution
Purpose: To facilitate the implementation/validation of signal representations and models using parametric matrix-variate distributions to approximate the diffusion tensor distribution (DTD).
Theory: We establish practical mathematical tools, the matrix moments of the DTD, that enable computing the mean diffusion tensor and covariance tensor associated with any parametric matrix-variate DTD whose moment-generating function is known (a general sketch of this relation follows the abstract). As a proof of concept, we apply these tools to the non-central matrix-variate Gamma (nc-mv-Gamma) distribution, whose covariance tensor was previously unknown, and design a new signal representation capturing intra-voxel heterogeneity via a single nc-mv-Gamma distribution: the matrix-variate Gamma approximation.
Methods: Furthering this proof of concept, we evaluate the matrix-variate Gamma approximation in silico and in vivo, in a human-brain 'tensor-valued' diffusion MRI dataset.
Results: The matrix-variate Gamma approximation fails to capture the heterogeneity arising from orientation dispersion and from simultaneous variances in the trace (size) and anisotropy (shape) of the underlying diffusion tensors, which is explained by the structure of the covariance tensor associated with the nc-mv-Gamma distribution.
Conclusion: The matrix moments promote a more widespread use of matrix-variate distributions as plausible approximations of the DTD by alleviating their intractability, thereby facilitating the design/validation of matrix-variate microstructural techniques.
Comment: 17 pages, 6 figures
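As a hedged, general sketch (notation assumed here, not taken from the paper): for a random diffusion tensor D with matrix-variate moment-generating function M(B), the mean and covariance tensors mentioned above follow from the first two derivatives of M at B = 0, which is the standard moment-generating-function relation:

M(\mathbf{B}) = \big\langle \exp\!\big(\operatorname{tr}(\mathbf{B}^{\mathsf{T}}\mathbf{D})\big) \big\rangle ,
\qquad
\langle \mathbf{D} \rangle = \frac{\partial M}{\partial \mathbf{B}} \bigg|_{\mathbf{B}=\mathbf{0}} ,
\qquad
\mathbb{C} = \frac{\partial^{2} M}{\partial \mathbf{B}\,\partial \mathbf{B}} \bigg|_{\mathbf{B}=\mathbf{0}} - \langle \mathbf{D} \rangle \otimes \langle \mathbf{D} \rangle .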