Interpolating the Trace of the Inverse of Matrix $\mathbf{A} + t\mathbf{B}$
We develop heuristic interpolation methods for the function $t \mapsto \operatorname{trace}\big((\mathbf{A} + t\mathbf{B})^{-1}\big)$, where the matrices $\mathbf{A}$ and $\mathbf{B}$ are symmetric and positive definite and $t$ is a real variable. This function is featured in many applications in statistics, machine learning, and computational physics. The presented interpolation functions are based on the modification of a sharp upper bound that we derive for this function, which is a new trace inequality for matrices. We demonstrate the accuracy and performance of the proposed method with numerical examples, namely, the marginal maximum likelihood estimation for linear Gaussian process regression and the estimation of the regularization parameter of ridge regression with the generalized cross-validation method.
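To make the target quantity concrete, here is a minimal sketch that evaluates $\operatorname{trace}\big((\mathbf{A} + t\mathbf{B})^{-1}\big)$ exactly via a Cholesky solve, the cubic-cost computation per value of $t$ that interpolation is meant to avoid. The test matrices and the choice $\mathbf{B} = I$ are illustrative assumptions, and the paper's interpolation formulas are not reproduced here.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def trace_inv(A, B, t):
    """Exact trace((A + t*B)^{-1}) via a Cholesky solve; O(n^3) per value of t."""
    n = A.shape[0]
    c_and_low = cho_factor(A + t * B)
    return np.trace(cho_solve(c_and_low, np.eye(n)))

# Illustrative SPD test matrices (not from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 50))
A = X @ X.T + 50.0 * np.eye(50)   # symmetric positive definite
B = np.eye(50)                    # simplest SPD choice for B
print([trace_inv(A, B, t) for t in (0.1, 1.0, 10.0)])
```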
Analysis of stochastic probing methods for estimating the trace of functions of sparse symmetric matrices
We consider the problem of estimating the trace of a matrix function, $\operatorname{tr}(f(A))$.
In certain situations, in particular if $f(A)$ cannot be well approximated by a
low-rank matrix, combining probing methods based on graph colorings with
stochastic trace estimation techniques can yield accurate approximations at
moderate cost. So far, such methods have not been thoroughly analyzed, though,
but were rather used as efficient heuristics by practitioners. In this
manuscript, we perform a detailed analysis of stochastic probing methods and,
in particular, expose conditions under which the expected approximation error
in the stochastic probing method scales more favorably with the dimension of
the matrix than the error in non-stochastic probing. Extending results from [E.
Aune, D. P. Simpson, J. Eidsvik, Parameter estimation in high dimensional
Gaussian distributions, Stat. Comput., 24, pp. 247--263, 2014], we also
characterize situations in which using just one stochastic vector is always --
not only in expectation -- better than the deterministic probing method.
Several numerical experiments illustrate our theory and compare with existing
methods.
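The stochastic component referenced above is the classical Girard-Hutchinson estimator, with which the probing vectors are combined. As a standalone reference point, a minimal sketch, assuming $f(A)$ is accessible only through matrix-vector products (here $f(A) = A^2$, so products can be applied without forming $f(A)$):

```python
import numpy as np

def hutchinson_trace(matvec, n, num_samples, rng):
    """Girard-Hutchinson estimator: E[z^T M z] = trace(M) for Rademacher z."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=n)
        total += z @ matvec(z)
    return total / num_samples

# Illustrative example with f(A) = A^2 for a random symmetric A.
rng = np.random.default_rng(1)
n = 200
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
matvec = lambda z: A @ (A @ z)    # applies A^2 without forming it
print(hutchinson_trace(matvec, n, 100, rng), np.trace(A @ A))
```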
Randomized low-rank approximation of monotone matrix functions
This work is concerned with computing low-rank approximations of a matrix function $f(A)$ for a large symmetric positive semi-definite matrix $A$, a task that arises in, e.g., statistical learning and inverse problems. The application of popular randomized methods, such as the randomized singular value decomposition or the Nyström approximation, to $f(A)$ requires multiplying $f(A)$ with a few random vectors. A significant disadvantage of such an approach is that matrix-vector products with $f(A)$ are considerably more expensive than matrix-vector products with $A$, even when carried out only approximately via, e.g., the Lanczos method. In this work, we present and analyze funNyström, a simple and inexpensive method that constructs a low-rank approximation of $f(A)$ directly from a Nyström approximation of $A$, completely bypassing the need for matrix-vector products with $f(A)$. It is sensible to use funNyström whenever $f$ is monotone and satisfies $f(0) = 0$. Under the stronger assumption that $f$ is operator monotone, which includes the matrix square root $A^{1/2}$ and the matrix logarithm $\log(I + A)$, we derive probabilistic bounds for the error in the Frobenius, nuclear, and operator norms. These bounds confirm the numerical observation that funNyström tends to return an approximation that compares well with the best low-rank approximation of $f(A)$. Our method is also of interest when estimating quantities associated with $f(A)$, such as the trace or the diagonal entries of $f(A)$. In particular, we propose and analyze funNyström++, a combination of funNyström with the recently developed Hutch++ method for trace estimation.
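The pipeline described above is fully determined at a high level: draw a random test matrix, form a Nyström approximation of $A$, and apply $f$ to its eigenvalues. The sketch below follows that reading; the Gaussian test matrix and the small stabilizing shift are common implementation choices assumed here, not details taken from the paper.

```python
import numpy as np

def fun_nystrom(A, f, k, rng):
    """Rank-k Nystrom approximation of PSD A with f applied to its eigenvalues.
    Returns (U, fw) such that U @ np.diag(fw) @ U.T approximates f(A)."""
    n = A.shape[0]
    Omega = rng.standard_normal((n, k))
    Y = A @ Omega                          # the only products with A
    nu = 1e-10 * np.linalg.norm(Y)         # tiny shift for numerical stability
    M = Omega.T @ Y + nu * (Omega.T @ Omega)
    C = np.linalg.cholesky((M + M.T) / 2)
    Z = np.linalg.solve(C, Y.T).T          # Nystrom factor: Z @ Z.T ~= A
    Q, R = np.linalg.qr(Z)
    s2, V = np.linalg.eigh(R @ R.T)        # eigenpairs of Z @ Z.T
    return Q @ V, f(np.maximum(s2, 0.0))

# Illustrative use: estimate trace(A^{1/2}) from the approximate eigenvalues.
rng = np.random.default_rng(2)
G = rng.standard_normal((300, 40))
A = G @ G.T                                # PSD test matrix
U, fw = fun_nystrom(A, np.sqrt, 60, rng)
print(np.sum(fw))
```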
XTrace: Making the most of every sample in stochastic trace estimation
The implicit trace estimation problem asks for an approximation of the trace
of a square matrix, accessed via matrix-vector products (matvecs). This paper
designs new randomized algorithms, XTrace and XNysTrace, for the trace
estimation problem by exploiting both variance reduction and the
exchangeability principle. For a fixed budget of matvecs, numerical experiments
show that the new methods can achieve errors that are orders of magnitude
smaller than existing algorithms, such as the Girard-Hutchinson estimator or
the Hutch++ estimator. A theoretical analysis confirms the benefits by offering
a precise description of the performance of these algorithms as a function of
the spectrum of the input matrix. The paper also develops an exchangeable
estimator, XDiag, for approximating the diagonal of a square matrix using
matvecs.
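XTrace's exchangeable construction is not spelled out in the abstract, so the sketch below instead shows the Hutch++ baseline it is compared against: spend a third of the matvec budget sketching the dominant range, take the exact trace there, and apply Girard-Hutchinson to the deflated remainder. The budget split and test matrix are illustrative assumptions.

```python
import numpy as np

def hutch_pp(matvec, n, budget, rng):
    """Hutch++ with ~budget matvecs: exact trace on a sketched subspace Q plus
    a Hutchinson estimate of the deflated part (I - QQ^T) M (I - QQ^T)."""
    k = budget // 3
    S = rng.choice([-1.0, 1.0], size=(n, k))
    Q, _ = np.linalg.qr(np.column_stack([matvec(s) for s in S.T]))
    t_top = np.trace(Q.T @ np.column_stack([matvec(q) for q in Q.T]))
    G = rng.choice([-1.0, 1.0], size=(n, k))
    G = G - Q @ (Q.T @ G)                  # deflate the sketched range
    t_rest = np.mean([g @ matvec(g) for g in G.T])
    return t_top + t_rest

# Illustrative comparison on a PSD matrix with decaying spectrum.
rng = np.random.default_rng(3)
n = 300
B = rng.standard_normal((n, n)); M = B @ B.T
print(hutch_pp(lambda v: M @ v, n, 90, rng), np.trace(M))
```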
Optimal Query Complexities for Dynamic Trace Estimation
We consider the problem of minimizing the number of matrix-vector queries
needed for accurate trace estimation in the dynamic setting where our
underlying matrix is changing slowly, such as during an optimization process.
Specifically, for any $m$ matrices $A_1, \ldots, A_m$ with consecutive differences bounded in Schatten-$1$ norm by $\alpha$, we provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with failure probability $\delta$ at an optimal query complexity of $\widetilde{O}\big(m\alpha\sqrt{\log(1/\delta)}/\epsilon + m\log(1/\delta)\big)$, improving the dependence on both $\alpha$ and $\delta$ from Dharangutte and Musco (NeurIPS, 2021). Our procedure works without additional norm bounds on the $A_i$ and can be generalized to a bound for the $p$-th Schatten norm for $p \in [1, 2]$, giving a complexity of $\widetilde{O}\big(m\alpha(\sqrt{\log(1/\delta)}/\epsilon)^p + m\log(1/\delta)\big)$.
By using novel reductions to communication complexity and
information-theoretic analyses of Gaussian matrices, we provide matching lower
bounds for static and dynamic trace estimation in all relevant parameters,
including the failure probability. Our lower bounds (1) give the first tight
bounds for Hutchinson's estimator in the matrix-vector product model with
Frobenius norm error even in the static setting, and (2) are the first
unconditional lower bounds for dynamic trace estimation, resolving open
questions of prior work.
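The optimal binary-tree summation procedure is not reproduced here; as context, the sketch below shows the naive dynamic baseline such methods improve on: estimate $\operatorname{tr}(A_1)$ once with many probes, then estimate the trace of each difference $A_{i+1} - A_i$ with few probes, which is cheap precisely because the differences have small Schatten norm. Sample counts and matrices are illustrative.

```python
import numpy as np

def dynamic_traces_naive(matvecs, n, samples_first, samples_diff, rng):
    """Estimate tr(A_1), ..., tr(A_m) from matvec closures by estimating
    tr(A_1) once, then tr(A_{i+1} - A_i) for each step (baseline only)."""
    def hutch(mv, s):
        Z = rng.choice([-1.0, 1.0], size=(n, s))
        return np.mean([z @ mv(z) for z in Z.T])
    ests = [hutch(matvecs[0], samples_first)]
    for prev, cur in zip(matvecs, matvecs[1:]):
        step = lambda z, p=prev, c=cur: c(z) - p(z)   # matvec with A_{i+1}-A_i
        ests.append(ests[-1] + hutch(step, samples_diff))
    return ests

# Illustrative slowly changing sequence of symmetric matrices.
rng = np.random.default_rng(4)
n = 100
A1 = rng.standard_normal((n, n)); A1 = A1 + A1.T
mats = [A1, A1 + 0.01 * np.eye(n), A1 + 0.02 * np.eye(n)]
mvs = [lambda z, M=M: M @ z for M in mats]
print(dynamic_traces_naive(mvs, n, 500, 50, rng), [np.trace(M) for M in mats])
```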
Krylov-aware stochastic trace estimation
We introduce an algorithm for estimating the trace of a matrix function $f(\mathbf{A})$ using implicit products with a symmetric matrix $\mathbf{A}$. Existing methods for implicit trace estimation of a matrix function tend to treat matrix-vector products with $f(\mathbf{A})$ as a black box to be computed by a Krylov subspace method. Like other recent algorithms for implicit trace estimation, our approach is based on a combination of deflation and stochastic trace estimation. However, we take a closer look at how products with $f(\mathbf{A})$ are integrated into these approaches, which enables several efficiencies not present in previously studied methods. In particular, we describe a Krylov subspace method for computing a low-rank approximation of a
matrix function by a computationally efficient projection onto a Krylov subspace.
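The black-box step mentioned above, applying $f(\mathbf{A})$ to a vector with a Krylov subspace method, is standard Lanczos: build an orthonormal Krylov basis $Q_k$, project $\mathbf{A}$ to a small tridiagonal $T_k$, and take $f(\mathbf{A})b \approx \|b\| \, Q_k f(T_k) e_1$. A minimal sketch with full reorthogonalization (chosen for clarity over efficiency):

```python
import numpy as np

def lanczos_f_times_b(matvec, b, f, k):
    """Approximate f(A) b with k Lanczos steps: f(A)b ~ ||b|| Q f(T) e1."""
    n = b.shape[0]
    Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k - 1)
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = matvec(Q[:, j])
        alpha[j] = Q[:, j] @ w
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)   # full reorthogonalization
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    lam, V = np.linalg.eigh(T)
    fT_e1 = V @ (f(lam) * V[0, :])                 # f(T) e1 via eigensolve
    return np.linalg.norm(b) * (Q @ fT_e1)

# Illustrative use: approximate A^{1/2} b for an SPD matrix A.
rng = np.random.default_rng(5)
G = rng.standard_normal((200, 200)); A = G @ G.T / 200 + np.eye(200)
b = rng.standard_normal(200)
print(np.linalg.norm(lanczos_f_times_b(lambda v: A @ v, b, np.sqrt, 30)))
```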
Fast and Memory Efficient Algorithms for Structured Matrix Spectrum Approximation
Approximating the singular values or eigenvalues of a matrix, i.e., spectrum approximation, is a fundamental task in data science and machine learning applications. While approximation of the top singular values has received considerable attention in numerical linear algebra, provably efficient algorithms for other spectrum approximation tasks, such as spectral-sum estimation and spectrum density estimation, have only recently begun to emerge. Two crucial components that have enabled efficient algorithms for spectrum approximation are access to randomness and structure in the underlying matrix. In this thesis, we study how randomization and the underlying structure of the matrix can be exploited to design fast and memory-efficient algorithms for spectral-sum estimation and spectrum density estimation. In particular, we look at two classes of structure: sparsity and graph structure.
In the first part of this thesis, we show that sparsity can be exploited to give low-memory algorithms for spectral summarization tasks such as approximating some Schatten norms, the Estrada index, and the logarithm of the determinant (log-det) of a sparse matrix. Surprisingly, we show that the space complexity of our algorithms is independent of the underlying dimension of the matrix. Complementing our result for sparse matrices, we show that matrices that satisfy a certain smooth definition of sparsity, but are potentially dense in the conventional sense, can be approximated to within spectral-norm error by a truly sparse matrix. Our method is based on a simple sampling scheme that can be implemented in linear time.

In the second part, we give the first truly sublinear time algorithm to approximate the spectral density of the (normalized) adjacency matrix of an undirected, unweighted graph in earth-mover distance. In addition to our sublinear time result, we give theoretical guarantees for a variant of the widely used Kernel Polynomial Method and propose a new moment-matching based method for spectrum density estimation of Hermitian matrices.
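For the spectral density part, the Kernel Polynomial Method referenced above estimates Chebyshev moments of the eigenvalue distribution with stochastic trace estimates. A minimal sketch, assuming the symmetric matrix has already been scaled so its spectrum lies in $[-1, 1]$ (the scaling and damping kernels a practical KPM needs are omitted):

```python
import numpy as np

def chebyshev_moments(matvec, n, num_moments, num_probes, rng):
    """Estimate mu_k = tr(T_k(A)) / n via Hutchinson probes, using the
    three-term recurrence T_{k+1}(A)z = 2 A T_k(A)z - T_{k-1}(A)z."""
    mu = np.zeros(num_moments)
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        t_prev, t_cur = z, matvec(z)        # T_0(A)z and T_1(A)z
        mu[0] += z @ t_prev
        mu[1] += z @ t_cur
        for k in range(2, num_moments):
            t_prev, t_cur = t_cur, 2 * matvec(t_cur) - t_prev
            mu[k] += z @ t_cur
    return mu / (num_probes * n)

# Illustrative use on a symmetric matrix scaled into (-1, 1).
rng = np.random.default_rng(6)
n = 500
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
A /= 1.05 * np.abs(np.linalg.eigvalsh(A)).max()
mu = chebyshev_moments(lambda v: A @ v, n, 20, 10, rng)
```

A density estimate would then be recovered from a damped Chebyshev series built on these moments.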