Polynomial approximations for the matrix logarithm with computation graphs
The most popular method for computing the matrix logarithm is a combination
of the inverse scaling and squaring method and a Padé approximation,
sometimes accompanied by the Schur decomposition. The main computational
effort lies in matrix-matrix multiplications and left matrix division. In
this work we show that the number of such operations can be substantially
reduced by using a graph-based representation of an efficient polynomial
evaluation scheme. A technique to analyze the rounding error is proposed,
and backward error analysis is adapted. We provide extensive simulations
illustrating competitiveness both in terms of computation time and rounding
errors.
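To make the inverse scaling and squaring idea concrete, here is a minimal
NumPy sketch. It substitutes a truncated Taylor series for the Padé
approximant used in practice, and the tolerance 0.25, series degree m, and
root cap s_max are illustrative choices, not the paper's:

    import numpy as np
    from scipy.linalg import sqrtm

    def logm_iss(A, tol=0.25, m=8, s_max=20):
        """Inverse scaling and squaring sketch: take repeated square
        roots until A is close to I, approximate log(I + X) by a
        truncated Taylor series, then undo the scaling."""
        n = A.shape[0]
        I = np.eye(n)
        s = 0
        while np.linalg.norm(A - I, 1) > tol and s < s_max:
            A = sqrtm(A)               # A <- A^(1/2)
            s += 1
        X = A - I
        L, P = np.zeros_like(A), I.astype(A.dtype)
        for k in range(1, m + 1):      # log(I+X) = X - X^2/2 + X^3/3 - ...
            P = P @ X
            L = L + ((-1) ** (k + 1) / k) * P
        return 2 ** s * L              # log(A) = 2^s * log(A^(1/2^s))

The repeated square roots are what make the truncated series accurate: each
one halves the distance of the spectrum from 1, at the cost of the final
multiplications by 2^s amplifying rounding errors, which is what the paper's
error analysis addresses.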
An Improved Taylor Algorithm for Computing the Matrix Logarithm
The most popular method for computing the matrix logarithm is a combination of the inverse scaling and squaring method and a Padé approximation, sometimes accompanied by the Schur decomposition. In this work, we present a Taylor series algorithm, based on the transformation-free approach of the inverse scaling and squaring technique, that uses recent matrix polynomial formulas for evaluating the Taylor approximation of the matrix logarithm more efficiently than the Paterson-Stockmeyer method. Two MATLAB implementations of this algorithm, one based on relative forward and the other on backward error analysis, were developed and compared with different state-of-the-art MATLAB functions. Numerical tests showed that the new implementations are generally more accurate than the previously available codes, with execution times intermediate among the codes compared.
This research was funded by the European Regional Development Fund (ERDF) and the Spanish Ministerio de Economía y Competitividad, grant TIN2017-89314-P.
Ibáñez González, J. J.; Sastre, J.; Ruíz Martínez, P. A.; Alonso Abalos, J. M.; Defez Candel, E. (2021). An Improved Taylor Algorithm for Computing the Matrix Logarithm. Mathematics, 9(17), 1-19. https://doi.org/10.3390/math9172018
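The efficiency gain over naive evaluation comes from grouping matrix powers.
A minimal sketch of the classical Paterson-Stockmeyer scheme, the baseline
the paper improves on, might look as follows, where coeffs holds the scalar
polynomial coefficients:

    import numpy as np

    def paterson_stockmeyer(A, coeffs):
        """Evaluate p(A) = sum_k coeffs[k] * A^k with O(sqrt(deg))
        matrix-matrix products instead of the deg - 1 products needed
        by naive Horner evaluation."""
        n = A.shape[0]
        deg = len(coeffs) - 1
        s = max(int(np.ceil(np.sqrt(deg + 1))), 1)
        # Precompute I, A, A^2, ..., A^s.
        powers = [np.eye(n)]
        for _ in range(s):
            powers.append(powers[-1] @ A)
        As = powers[s]
        # Horner recurrence in A^s over blocks of s coefficients.
        P = None
        for j in range(deg // s, -1, -1):
            B = sum(coeffs[j * s + i] * powers[i]
                    for i in range(s) if j * s + i <= deg)
            P = B if P is None else P @ As + B
        return P

For the logarithm one would apply this to X = A - I with the Taylor
coefficients [0, 1, -1/2, 1/3, ...] of log(I + X); the paper's contribution
is polynomial evaluation formulas that need even fewer products than this
scheme.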
Optimal Query Complexities for Dynamic Trace Estimation
We consider the problem of minimizing the number of matrix-vector queries
needed for accurate trace estimation in the dynamic setting where our
underlying matrix is changing slowly, such as during an optimization process.
Specifically, for any $m$ matrices $A_1, \dots, A_m$ with consecutive
differences bounded in Schatten-$1$ norm by $\alpha$, we provide a novel
binary tree summation procedure that simultaneously estimates all $m$ traces
up to $\epsilon$ error with $\delta$ failure probability with an optimal
query complexity of $\tilde{O}(m \alpha \sqrt{\log(1/\delta)}/\epsilon +
m \log(1/\delta))$, improving the dependence on both $\alpha$ and $\delta$
from Dharangutte and Musco (NeurIPS, 2021). Our procedure works without
additional norm bounds on the $A_i$ and can be generalized to a bound for
the $p$-th Schatten norm for $p \in [1, 2]$, giving a complexity of
$\tilde{O}(m \alpha (\sqrt{\log(1/\delta)}/\epsilon)^p + m \log(1/\delta))$.
By using novel reductions to communication complexity and
information-theoretic analyses of Gaussian matrices, we provide matching lower
bounds for static and dynamic trace estimation in all relevant parameters,
including the failure probability. Our lower bounds (1) give the first tight
bounds for Hutchinson's estimator in the matrix-vector product model with
Frobenius norm error even in the static setting, and (2) are the first
unconditional lower bounds for dynamic trace estimation, resolving open
questions of prior work. Comment: 30 pages.
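For reference, here is a sketch of the naive dynamic baseline: Hutchinson's
estimator on the first matrix, plus cheap difference estimates for each
update. It shows why slowly changing matrices need fewer queries, but it is
not the paper's binary tree summation procedure, which spreads queries over
a tree of difference scales; the function names and query splits here are
illustrative only:

    import numpy as np

    def hutchinson(matvec, n, num_queries, rng):
        """Hutchinson's estimator: E[g^T A g] = tr(A) for Gaussian g."""
        total = 0.0
        for _ in range(num_queries):
            g = rng.standard_normal(n)
            total += g @ matvec(g)
        return total / num_queries

    def dynamic_traces(mats, q_first, q_diff, seed=0):
        """Estimate tr(A_1), ..., tr(A_m): spend q_first queries on A_1,
        then only q_diff queries on each small difference A_i - A_{i-1},
        accumulating the difference traces onto the first estimate."""
        rng = np.random.default_rng(seed)
        n = mats[0].shape[0]
        t = hutchinson(lambda v: mats[0] @ v, n, q_first, rng)
        traces = [t]
        for prev, cur in zip(mats, mats[1:]):
            t = t + hutchinson(lambda v, P=prev, C=cur: C @ v - P @ v,
                               n, q_diff, rng)
            traces.append(t)
        return traces

Because the differences have small Schatten norm, each difference estimate
has small variance; the drawback of this naive accumulation is that errors
add up across the sequence, which the binary tree structure is designed to
control.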
Geometric variational inference
Efficiently accessing the information contained in non-linear and
high-dimensional probability distributions remains a core challenge in modern
statistics. Traditionally, estimators that go beyond point estimates are either
categorized as Variational Inference (VI) or Markov-Chain Monte-Carlo (MCMC)
techniques. While MCMC methods that utilize the geometric properties of
continuous probability distributions to increase their efficiency have been
proposed, VI methods rarely use the geometry. This work aims to fill this gap
and proposes geometric Variational Inference (geoVI), a method based on
Riemannian geometry and the Fisher information metric. It is used to construct
a coordinate transformation that relates the Riemannian manifold associated
with the metric to Euclidean space. The distribution, expressed in the
coordinate system induced by the transformation, takes a particularly simple
form that allows for an accurate variational approximation by a normal
distribution. Furthermore, the algorithmic structure allows for an efficient
implementation of geoVI which is demonstrated on multiple examples, ranging
from low-dimensional illustrative ones to non-linear, hierarchical Bayesian
inverse problems in thousands of dimensions. Comment: 42 pages, 18 figures, accepted by Entropy.
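As a heavily simplified illustration of the underlying idea, assuming a
function returning the Fisher metric is available: use the metric at an
expansion point as the inverse covariance of a normal approximation. The
full geoVI method constructs a non-linear coordinate transformation; this
linearized sketch (the names fisher_metric and x0 are ours) only captures
the Gaussian-in-the-right-coordinates intuition:

    import numpy as np

    def metric_gaussian_samples(fisher_metric, x0, n_samples=100, seed=0):
        """Draw samples from N(x0, M(x0)^{-1}), where M is the Fisher
        information metric at the expansion point x0. A linearized
        stand-in for geoVI's coordinate transformation, under which the
        transformed distribution is approximately standard normal."""
        rng = np.random.default_rng(seed)
        M = fisher_metric(x0)              # metric, assumed SPD
        L = np.linalg.cholesky(M)          # M = L L^T
        xi = rng.standard_normal((n_samples, x0.size))
        # x = x0 + L^{-T} xi has covariance (L L^T)^{-1} = M^{-1}.
        return x0 + np.linalg.solve(L.T, xi.T).T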
Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity
Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{x_i\}_{1 \le i \le N} \subset \mathbb{R}^d$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously distributed sampling points, we show how to identify a subset $S \subset \{1, \dots, N\} \times \{1, \dots, N\}$, with $\#S = O(N \log(N) \log^d(N/\epsilon))$, such that the zero fill-in incomplete Cholesky factorization of the sparse matrix $\Theta_{ij} \mathbb{1}_{(i,j) \in S}$ is an $\epsilon$-approximation of $\Theta$. This factorization can provably be obtained in complexity $O(N \log(N) \log^d(N/\epsilon))$ in space and $O(N \log^2(N) \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_i$ and does not require an analytic representation of $G$. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDEs with complexity $O(N \log^d(N/\epsilon))$ in space and $O(N \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators.
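The central computational object is the zero fill-in incomplete Cholesky
factorization on a prescribed sparsity set $S$. A dense-array sketch follows;
the actual algorithm also relies on a multiscale (maximin) ordering of the
points, which is omitted here:

    import numpy as np

    def ichol_zero_fill(Theta, S):
        """Zero fill-in incomplete Cholesky: L L^T ~ Theta, where L is
        only allowed nonzeros on the prescribed sparsity set S (a set
        of index pairs (i, j) with i >= j); diagonal entries are always
        computed. Entries outside S are simply skipped."""
        n = Theta.shape[0]
        L = np.zeros_like(Theta, dtype=float)
        for j in range(n):
            L[j, j] = np.sqrt(Theta[j, j] - L[j, :j] @ L[j, :j])
            for i in range(j + 1, n):
                if (i, j) in S:  # no fill-in: skip entries outside S
                    L[i, j] = (Theta[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
        return L

For the kernel matrices in the paper, $S$ is chosen from the multiscale
geometry of the points $x_i$, which is what makes the skipped entries
provably negligible.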
Probabilistic Linear Algebra
Linear algebra operations are at the core of many computational tasks. For example,
evaluating the density of a multivariate normal distribution requires the solution of a linear
equation system and the determinant of a square matrix. Frequently, and in particular in
machine learning, the involved matrices are too large to compute exact solutions,
necessitating approximation. Building upon recent work (Hennig and Kiefel 2012; Hennig
2015) this thesis considers numerical linear algebra from a probabilistic perspective.
Part iii establishes connections between approximate linear solvers and Gaussian inference,
with a focus on projection methods. One result is the observation that solution-based
inference (Cockayne, Oates, Ipsen, and Girolami 2018) is subsumed in the matrix-based
inference perspective (Hennig and Kiefel 2012). Part iv shows how the probabilistic
viewpoint leads to a novel algorithm for kernel least-squares problems. A Gaussian model
over kernel functions is proposed that uses matrix multiplications computed by conjugate
gradients to obtain a low-rank approximation of the kernel. The derived algorithm, kernel
machine conjugate gradients, provides empirically better approximations than conjugate
gradients and, when used for Gaussian process regression, additionally provides estimates
of the posterior variance and log-marginal likelihood without the need to rerun the solver. Part v is
concerned with the approximation of kernel matrix determinants. Assuming the inputs
to the kernel are independent and identically distributed, a stopping condition for the
Cholesky decomposition is presented that provides probably approximately correct (PAC)
estimates of the log-determinant with little overhead.
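To illustrate the Part v idea in spirit: the log-determinant of a kernel
matrix equals the sum of the log-pivots produced by Cholesky elimination, so
one can stop early and extrapolate the remaining pivots. The stopping
heuristic below is our illustration, not the thesis's PAC-grade condition,
and the window and tolerance are arbitrary:

    import numpy as np

    def logdet_early_stopped(K, window=10, rel_tol=1e-3):
        """Run dense Cholesky elimination pivot by pivot, accumulating
        log det K = sum_j log(pivot_j); once the trailing pivots look
        stationary (plausible for i.i.d. kernel inputs), extrapolate
        their mean over the remaining columns and stop early."""
        A = np.array(K, dtype=float, copy=True)
        n = A.shape[0]
        log_pivots = []
        prev = None
        for j in range(n):
            p = A[j, j]
            log_pivots.append(np.log(p))
            c = A[j + 1:, j] / p
            A[j + 1:, j + 1:] -= np.outer(c, A[j + 1:, j])  # Schur update
            if j + 1 >= window:
                est = (sum(log_pivots)
                       + (n - j - 1) * np.mean(log_pivots[-window:]))
                if prev is not None and abs(est - prev) <= rel_tol * abs(est):
                    return est  # early exit with extrapolated estimate
                prev = est
        return float(sum(log_pivots))

The thesis replaces this ad hoc stabilization check with a statistically
grounded stopping rule that yields probably approximately correct guarantees
on the estimate.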