8 research outputs found
Tail bounds for all eigenvalues of a sum of random matrices
This work introduces the minimax Laplace transform method, a modification of
the cumulant-based matrix Laplace transform method developed in "User-friendly
tail bounds for sums of random matrices" (arXiv:1004.4389v6) that yields both
upper and lower bounds on each eigenvalue of a sum of random self-adjoint
matrices. This machinery is used to derive eigenvalue analogues of the
classical Chernoff, Bennett, and Bernstein bounds.
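The master matrix tail bound that this machinery modifies can be sketched as follows (for a finite sequence of independent random self-adjoint matrices X_k; this is the standard form of the bound from the cited arXiv report, restated here for orientation):

```latex
\mathbb{P}\Bigl\{\lambda_{\max}\Bigl(\sum_k X_k\Bigr) \ge t\Bigr\}
\;\le\; \inf_{\theta > 0}\; e^{-\theta t}\,
\operatorname{tr}\exp\Bigl(\sum_k \log \mathbb{E}\, e^{\theta X_k}\Bigr).
```

The minimax variant controls interior eigenvalues by combining such bounds over suitably chosen subspaces, yielding both upper and lower tail estimates for each eigenvalue rather than only the extremes.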
Two examples demonstrate the efficacy of the minimax Laplace transform. The
first concerns the effects of column sparsification on the spectrum of a matrix
with orthonormal rows. Here, the behavior of the singular values can be
described in terms of coherence-like quantities. The second example addresses
the question of relative accuracy in the estimation of eigenvalues of the
covariance matrix of a random process. Standard results on the convergence of
sample covariance matrices provide bounds on the number of samples needed to
obtain relative accuracy in the spectral norm, but these results only guarantee
relative accuracy in the estimate of the maximum eigenvalue. The minimax
Laplace transform argument establishes that if the lowest eigenvalues decay
sufficiently fast, then on the order of K^2*r*log(p)/eps^2 samples, where K is
the condition number of an optimal rank-r approximation to C, suffice to ensure
that the dominant r eigenvalues of the covariance matrix of a N(0, C) random
vector are estimated to within a factor of 1±eps with high probability.
Comment: 20 pages, 1 figure, see also arXiv:1004.4389v
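The regime this result addresses is easy to probe numerically. The sketch below (all dimensions, the spectrum, and the sample size are illustrative choices, not values from the paper) draws n samples from N(0, C) with a fast-decaying tail spectrum and compares the dominant r sample eigenvalues with the truth:

```python
import numpy as np

rng = np.random.default_rng(0)
p, r, n = 50, 5, 2000  # hypothetical dimension, target rank, sample count

# Covariance with r dominant eigenvalues and a rapidly decaying tail.
eigs = np.concatenate([np.linspace(10.0, 5.0, r),
                       0.01 * 0.5 ** np.arange(p - r)])
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
C = Q @ np.diag(eigs) @ Q.T

X = rng.multivariate_normal(np.zeros(p), C, size=n)
C_hat = X.T @ X / n  # sample covariance (mean known to be zero)

true_top = np.sort(np.linalg.eigvalsh(C))[::-1][:r]
est_top = np.sort(np.linalg.eigvalsh(C_hat))[::-1][:r]
rel_err = np.abs(est_top - true_top) / true_top
print(rel_err.max())
```

With a fast-decaying tail, the relative error of each dominant eigenvalue is small even though p is moderate, which is the qualitative behavior the bound predicts.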
The Masked Sample Covariance Estimator: An Analysis via Matrix Concentration Inequalities
Covariance estimation becomes challenging in the regime where the number p of
variables outstrips the number n of samples available to construct the
estimate. One way to circumvent this problem is to assume that the covariance
matrix is nearly sparse and to focus on estimating only the significant
entries. To analyze this approach, Levina and Vershynin (2011) introduce a
formalism called masked covariance estimation, where each entry of the sample
covariance estimator is reweighted to reflect an a priori assessment of its
importance. This paper provides a short analysis of the masked sample
covariance estimator by means of a matrix concentration inequality. The main
result applies to general distributions with at least four moments. Specialized
to the case of a Gaussian distribution, the theory offers qualitative
improvements over earlier work. For example, the new results show that n = O(B
log^2 p) samples suffice to estimate a banded covariance matrix with bandwidth
B up to a relative spectral-norm error, in contrast to the sample complexity n
= O(B log^5 p) obtained by Levina and Vershynin.
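The masked estimator itself is a one-line construction: each entry of the sample covariance is reweighted by a mask encoding its a priori importance. A minimal sketch (the banded covariance, mask, and dimensions below are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, B = 40, 500, 3  # hypothetical dimension, sample count, bandwidth

# A banded "true" covariance: entries decay geometrically off the diagonal
# and vanish beyond bandwidth B.
idx = np.arange(p)
dist = np.abs(idx[:, None] - idx[None, :])
C = np.where(dist <= B, 0.5 ** dist.astype(float), 0.0)

X = rng.multivariate_normal(np.zeros(p), C, size=n)
C_hat = X.T @ X / n  # sample covariance (mean known to be zero)

# The mask M reflects the assumed banded structure; the masked estimator
# is the entrywise product M * C_hat (a 0/1 mask here, weights in general).
M = (dist <= B).astype(float)
masked = M * C_hat

err_masked = np.linalg.norm(masked - C, 2)
err_plain = np.linalg.norm(C_hat - C, 2)
print(err_masked, err_plain)
```

Zeroing the off-band entries removes their sampling noise exactly where the true matrix vanishes, which is why the masked estimator beats the raw sample covariance in spectral norm in this regime.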
Topics in Randomized Numerical Linear Algebra
This thesis studies three classes of randomized numerical linear algebra algorithms, namely: (i) randomized matrix sparsification algorithms, (ii) low-rank approximation algorithms that use randomized unitary transformations, and (iii) low-rank approximation algorithms for positive-semidefinite (PSD) matrices.
Randomized matrix sparsification algorithms set randomly chosen entries of the input matrix to zero. When the approximant is substituted for the original matrix in computations, its sparsity allows one to employ faster sparsity-exploiting algorithms. This thesis contributes bounds on the approximation error of nonuniform randomized sparsification schemes, measured in the spectral norm and two NP-hard norms that are of interest in computational graph theory and subset selection applications.
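A toy version of a nonuniform scheme (the retention rule below, keeping entry a_ij with probability roughly proportional to a_ij^2 with a small floor, is one plausible choice for illustration, not any of the thesis's specific schemes):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 60, 60
A = rng.standard_normal((m, n))

# Nonuniform retention: larger entries are kept with higher probability;
# rescaling a kept entry by 1/p_ij makes A_sparse an unbiased estimator of A.
s = 2.0 * m * n  # hypothetical knob controlling the expected density
P = np.clip(s * A**2 / (A**2).sum(), 0.02, 1.0)  # floor avoids huge rescalings
keep = rng.random((m, n)) < P
A_sparse = np.where(keep, A / P, 0.0)

nnz_frac = keep.mean()
spec_err = np.linalg.norm(A - A_sparse, 2) / np.linalg.norm(A, 2)
print(nnz_frac, spec_err)
```

The analysis in the thesis bounds exactly this kind of quantity: how the spectral-norm deviation of the sparsified matrix scales with the retention probabilities.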
Low-rank approximations based on randomized unitary transformations have several desirable properties: they have low communication costs, are amenable to parallel implementation, and exploit the existence of fast transform algorithms. This thesis investigates the tradeoff between the accuracy and cost of generating such approximations. State-of-the-art spectral and Frobenius-norm error bounds are provided.
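The transform-based construction can be sketched with an SRFT-style map (random signs, an FFT, then column subsampling); the matrix, sketch size, and decay profile below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k, ell = 300, 200, 10, 40  # target rank k, sketch size ell (hypothetical)

# Test matrix with geometrically decaying singular values 0.5^j.
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sv = 0.5 ** np.arange(n)
A = (U * sv) @ V.T

# SRFT-style sketch: random sign flips, fast unitary mixing via the FFT,
# then subsampling ell mixed columns.
signs = rng.choice([-1.0, 1.0], size=n)
mixed = np.fft.fft(A * signs, axis=1) / np.sqrt(n)
cols = rng.choice(n, size=ell, replace=False)
Y = mixed[:, cols]                     # m x ell sketch of the range of A

Q, _ = np.linalg.qr(Y)                 # orthonormal basis for the sketch
A_approx = Q @ (Q.conj().T @ A)        # project A onto the captured range

err = np.linalg.norm(A - A_approx, 2)
print(err, sv[k])                      # compare with the best rank-k error
```

Because the mixing is a unitary transform applied with an FFT, forming the sketch costs O(mn log n) rather than the O(mn ell) of a dense Gaussian sketch, which is the cost side of the accuracy/cost tradeoff studied here.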
The last class of algorithms considered comprises PSD "sketching" algorithms. Such sketches can be computed faster than approximations based on projecting onto mixtures of the columns of the matrix. The performance of several such sketching schemes is evaluated empirically using a suite of canonical matrices drawn from machine learning and data analysis applications, and a framework is developed for establishing theoretical error bounds.
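One canonical instance of such a sketch is the Nyström-type column construction: from ell sampled columns C = A S and the small core W = S^T A S, form C W^+ C^T. The sketch below (toy PSD matrix, uniform column sampling) is illustrative rather than any specific scheme evaluated in the thesis:

```python
import numpy as np

rng = np.random.default_rng(4)
n, ell = 200, 30  # hypothetical matrix size and sketch size

# PSD test matrix with geometrically decaying eigenvalues 0.8^j.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (Q * 0.8 ** np.arange(n)) @ Q.T

# Column-sampling sketch: with S a column selector, the approximation
# (A S)(S^T A S)^+ (A S)^T is computable from just ell columns of A.
cols = rng.choice(n, size=ell, replace=False)
C = A[:, cols]
W = C[cols, :]                  # = A[cols][:, cols], the ell x ell core
A_nys = C @ np.linalg.pinv(W) @ C.T

err = np.linalg.norm(A - A_nys, 2)
print(err, 0.8 ** ell)          # compare with the (ell+1)-th eigenvalue scale
```

Only ell columns of A are ever touched, which is why such sketches are cheaper than projection-based approximations that must multiply A against a dense mixing matrix.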
In addition to studying these algorithms, this thesis extends the Matrix Laplace Transform framework to derive Chernoff and Bernstein inequalities that apply to all the eigenvalues of certain classes of random matrices. These inequalities are used to investigate the behavior of the singular values of a matrix under random sampling, and to derive convergence rates for each individual eigenvalue of a sample covariance matrix.
Texture-Based Tissue Characterization for High-resolution CT Scans of Coronary Arteries
We analyze localized textural consistencies in high-resolution X-ray CT scans of coronary arteries to identify the appearance of diagnostically relevant changes in tissue. For the efficient and accurate processing of CT volume data, we use fast wavelet algorithms associated with three-dimensional isotropic multiresolution wavelets that implement a redundant, frame-based image encoding without directional preference. Our algorithm identifies textural consistencies by correlating coefficients in the wavelet representation.
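The pipeline can be caricatured in a few lines. The toy below substitutes box-filter differences for the actual isotropic wavelet frame and per-block detail energy for the coefficient-correlation step; the volume, filter widths, and block size are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic 3-D "volume": rough texture in one half, faint texture in the other.
vol = rng.standard_normal((32, 32, 32))
vol[:, :, 16:] *= 0.1

def box_smooth(v, w):
    """Isotropic smoothing: the same box filter along each of the three axes."""
    k = np.ones(w) / w
    out = v
    for ax in range(3):
        out = np.apply_along_axis(np.convolve, ax, out, k, mode="same")
    return out

# Redundant (undecimated) band-pass detail layer: difference of two scales.
# No downsampling, so the encoding is frame-based and has no directional bias.
detail = box_smooth(vol, 3) - box_smooth(vol, 7)

# Per-block mean detail magnitude as a crude textural signature.
b = 8
s = 32 // b
energy = np.abs(detail).reshape(s, b, s, b, s, b).mean(axis=(1, 3, 5))
print(energy[..., :2].mean(), energy[..., 2:].mean())  # rough half vs faint half
```

Even this crude surrogate separates the two tissue-like regions; the paper's method replaces the box filters with a genuine isotropic multiresolution wavelet frame and the energy statistic with correlations between wavelet coefficients.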