    Spectral Sparsification via Bounded-Independence Sampling

    We give a deterministic, nearly logarithmic-space algorithm for mild spectral sparsification of undirected graphs. Given a weighted, undirected graph $G$ on $n$ vertices described by a binary string of length $N$, an integer $k \leq \log n$, and an error parameter $\epsilon > 0$, our algorithm runs in space $\tilde{O}(k \log(N \cdot w_{\mathrm{max}}/w_{\mathrm{min}}))$, where $w_{\mathrm{max}}$ and $w_{\mathrm{min}}$ are the maximum and minimum edge weights in $G$, and produces a weighted graph $H$ with $\tilde{O}(n^{1+2/k}/\epsilon^2)$ edges that spectrally approximates $G$, in the sense of Spielman and Teng [ST04], up to an error of $\epsilon$. Our algorithm is based on a new bounded-independence analysis of Spielman and Srivastava's effective-resistance-based edge sampling algorithm [SS08] and uses results from recent work on space-bounded Laplacian solvers [MRSV17]. In particular, we demonstrate an inherent tradeoff (via upper and lower bounds) between the amount of (bounded) independence used in the edge sampling algorithm, denoted by $k$ above, and the resulting sparsity that can be achieved. Comment: 37 pages
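
    The role of bounded independence in edge sampling can be illustrated with a small sketch. It is not the construction from the paper: it uses the textbook k-wise independent hash family (a random polynomial of degree less than k over a prime field) to decide which edges to keep. The names `make_kwise_hash` and `sample_edges`, the choice of prime, and the caller-supplied `probs` are all assumptions made for illustration. In [SS08] the keep-probability of an edge is proportional to its weight times its effective resistance; computing those values is outside this sketch.

```python
import random

# A Mersenne prime used as the field size (an arbitrary illustrative choice).
PRIME = (1 << 61) - 1


def make_kwise_hash(k, rng):
    """Draw h from a k-wise independent family mapping Z_PRIME to Z_PRIME.

    h is a uniformly random polynomial of degree < k over GF(PRIME).
    """
    coeffs = [rng.randrange(PRIME) for _ in range(k)]

    def h(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return h


def sample_edges(edges, probs, k, rng):
    """Keep edge i with probability about probs[i], using k-wise independence.

    edges: list of (u, v, weight); probs: keep-probabilities in (0, 1].
    Kept edges are reweighted by 1/p, so each weight is preserved in expectation.
    """
    h = make_kwise_hash(k, rng)
    kept = []
    for i, ((u, v, w), p) in enumerate(zip(edges, probs)):
        # h(i) is uniform on {0, ..., PRIME-1}, so this event has probability ~p.
        if h(i) < p * PRIME:
            kept.append((u, v, w / p))
    return kept
```

    Only k random field elements are drawn, regardless of the number of edges, which is what makes small-space derandomization of such a sampler plausible. For example, `sample_edges(edges, probs, k=4, rng=random.Random(0))` makes all keep/discard decisions from four coefficients.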

    Online Row Sampling

    Finding a small spectral approximation for a tall $n \times d$ matrix $A$ is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of $A$. Row sampling improves interpretability, saves space when $A$ is sparse, and preserves row structure, which is especially important, for example, when $A$ represents a graph. However, correctly sampling rows from $A$ can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than what is required to store the output approximation [KL13, KLM+14]. Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read rows of $A$ one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates $A$ up to multiplicative error $\epsilon$ and additive error $\delta$ using $O(d \log d \log(\epsilon \|A\|_2/\delta)/\epsilon^2)$ online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses $O(d^2)$ memory but only requires $O(d \log(\epsilon \|A\|_2/\delta)/\epsilon^2)$ samples, which we show is optimal. Our methods are clean and intuitive, allow for lower memory usage than prior work, and expose new theoretical properties of leverage score based matrix approximation.
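
    The online setting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: each arriving row is scored by a ridge leverage score computed against the rows kept so far, with regularization delta/eps, and is kept with probability proportional to that score. The function name `online_row_sample`, the caller-supplied `oversample` constant (the sample bound suggests something on the order of log d / eps^2, but the exact value is an assumption), and the dense d-by-d inverse are illustrative choices.

```python
import math
import random


def online_row_sample(rows, eps, delta, oversample, rng):
    """One pass over `rows`; each row is kept or discarded immediately.

    Maintains minv = (S^T S + (delta/eps) I)^{-1}, where S holds the rescaled
    rows kept so far, and updates it with the Sherman-Morrison formula.
    """
    d = len(rows[0])
    lam = delta / eps
    minv = [[(1.0 / lam if i == j else 0.0) for j in range(d)] for i in range(d)]
    kept = []
    for a in rows:
        # Online ridge leverage score estimate: (1 + eps) * a^T minv a, capped at 1.
        ma = [sum(minv[i][j] * a[j] for j in range(d)) for i in range(d)]
        score = min(1.0, (1.0 + eps) * sum(a[i] * ma[i] for i in range(d)))
        p = min(1.0, oversample * score)
        if p > 0.0 and rng.random() < p:
            # Rescale by 1/sqrt(p) so that the kept row r satisfies E[r r^T] = a a^T.
            r = [x / math.sqrt(p) for x in a]
            kept.append(r)
            # Sherman-Morrison: (M + r r^T)^{-1} = M^{-1} - (M^{-1} r)(M^{-1} r)^T / (1 + r^T M^{-1} r)
            mr = [sum(minv[i][j] * r[j] for j in range(d)) for i in range(d)]
            denom = 1.0 + sum(r[i] * mr[i] for i in range(d))
            for i in range(d):
                for j in range(d):
                    minv[i][j] -= mr[i] * mr[j] / denom
    return kept
```

    Decisions are never revisited: a discarded row is gone, and a kept row stays with its assigned scale. Rows that are nearly in the span of earlier kept rows receive small scores and are rarely sampled.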

    Filtering Random Graph Processes Over Random Time-Varying Graphs

    Graph filters play a key role in processing the graph spectra of signals supported on the vertices of a graph. However, despite their widespread use, graph filters have been analyzed only in the deterministic setting, ignoring the impact of stochasticity in both the graph topology and the signal itself. To bridge this gap, we examine the statistical behavior of the two key filter types, finite impulse response (FIR) and autoregressive moving average (ARMA) graph filters, when operating on random time-varying graph signals (or random graph processes) over random time-varying graphs. Our analysis shows that (i) in expectation, the filters behave as the same deterministic filters operating on the expected graph with the expected signal as input, and (ii) there are meaningful upper bounds for the variance of the filter output. We conclude the paper by proposing two novel ways of exploiting randomness to improve (joint graph-time) noise cancellation, as well as to reduce the computational complexity of graph filtering. As demonstrated by numerical results, these methods outperform the disjoint average-and-denoise algorithm and reduce complexity by up to a factor of four, with very little difference from the optimal solution.
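
    Claim (i) can be checked numerically in a simplified setting. The sketch below assumes a fixed (deterministic) input signal, the adjacency matrix as shift operator, and a graph model in which each edge is kept independently with probability p at every shift; these simplifications and all function names are assumptions for illustration and do not reproduce the paper's full model. Because the realizations are independent across shifts, the average output of the random FIR filter should approach the output of the same filter on the expected graph p * adj.

```python
import random


def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def random_realization(adj, p, rng):
    """Keep each undirected edge of `adj` independently with probability p."""
    n = len(adj)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] != 0.0 and rng.random() < p:
                s[i][j] = s[j][i] = adj[i][j]
    return s


def fir_filter(shifts, taps, x):
    """y = taps[0] x + sum_k taps[k] S_k ... S_1 x, one shift operator per order."""
    y = [taps[0] * v for v in x]
    z = x
    for h, s in zip(taps[1:], shifts):
        z = matvec(s, z)
        y = [yi + h * zi for yi, zi in zip(y, z)]
    return y


def expected_output(adj, p, taps, x):
    """The same FIR filter applied on the expected graph p * adj."""
    mean = [[p * a for a in row] for row in adj]
    return fir_filter([mean] * (len(taps) - 1), taps, x)


def monte_carlo_output(adj, p, taps, x, trials, rng):
    """Average filter output over independent random graph realizations."""
    acc = [0.0] * len(x)
    for _ in range(trials):
        shifts = [random_realization(adj, p, rng) for _ in taps[1:]]
        y = fir_filter(shifts, taps, x)
        acc = [a + yi for a, yi in zip(acc, y)]
    return [a / trials for a in acc]
```

    Comparing `monte_carlo_output(...)` with `expected_output(...)` on a small path graph shows the two agreeing up to sampling noise, which shrinks as the number of trials grows.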

    Tail bounds for all eigenvalues of a sum of random matrices

    This work introduces the minimax Laplace transform method, a modification of the cumulant-based matrix Laplace transform method developed in "User-friendly tail bounds for sums of random matrices" (arXiv:1004.4389v6) that yields both upper and lower bounds on each eigenvalue of a sum of random self-adjoint matrices. This machinery is used to derive eigenvalue analogues of the classical Chernoff, Bennett, and Bernstein bounds. Two examples demonstrate the efficacy of the minimax Laplace transform. The first concerns the effects of column sparsification on the spectrum of a matrix with orthonormal rows. Here, the behavior of the singular values can be described in terms of coherence-like quantities. The second example addresses the question of relative accuracy in the estimation of eigenvalues of the covariance matrix of a random process. Standard results on the convergence of sample covariance matrices provide bounds on the number of samples needed to obtain relative accuracy in the spectral norm, but these results only guarantee relative accuracy in the estimate of the maximum eigenvalue. The minimax Laplace transform argument establishes that if the lowest eigenvalues decay sufficiently fast, on the order of (K^2*r*log(p))/eps^2 samples, where K is the condition number of an optimal rank-r approximation to C, are sufficient to ensure that the dominant r eigenvalues of the covariance matrix of an N(0, C) random vector are estimated to within a factor of 1 ± eps with high probability. Comment: 20 pages, 1 figure, see also arXiv:1004.4389v

    Domain Sparsification of Discrete Distributions Using Entropic Independence

    We present a framework for speeding up the time it takes to sample from discrete distributions μ defined over subsets of size k of a ground set of n elements, in the regime where k is much smaller than n. We show that if one has access to estimates of marginals P_{S∼μ}[i ∈ S], then the task of sampling from μ can be reduced to sampling from related distributions ν supported on size k subsets of a ground set of only n^{1-α} · poly(k) elements. Here, 1/α ∈ [1, k] is the parameter of entropic independence for μ. Further, our algorithm only requires sparsified distributions ν that are obtained by applying a sparse (mostly 0) external field to μ, an operation that, for many distributions μ of interest, retains algorithmic tractability of sampling from ν. This phenomenon, which we dub domain sparsification, allows us to pay a one-time cost of estimating the marginals of μ, and in return reduce the amortized cost needed to produce many samples from the distribution μ, as is often needed in upstream tasks such as counting and inference. For a wide range of distributions where α = Ω(1), our result reduces the domain size, and as a corollary, the cost-per-sample, by a poly(n) factor. Examples include monomers in a monomer-dimer system, non-symmetric determinantal point processes, and partition-constrained Strongly Rayleigh measures. Our work significantly extends the reach of prior work of Anari and Dereziński, who obtained domain sparsification for distributions with a log-concave generating polynomial (corresponding to α = 1). As a corollary of our new analysis techniques, we also obtain a less stringent requirement on the accuracy of marginal estimates even for the case of log-concave polynomials; roughly speaking, we show that constant-factor approximation is enough for domain sparsification, improving over O(1/k) relative error established in prior work.

    Error Bounds for Random Matrix Approximation Schemes

    Randomized matrix sparsification has proven to be a fruitful technique for producing faster algorithms in applications ranging from graph partitioning to semidefinite programming. In the decade or so of research into this technique, the focus has been, with few exceptions, on ensuring the quality of approximation in the spectral and Frobenius norms. For certain graph algorithms, however, the ∞→1 norm may be a more natural measure of performance. This paper addresses the problem of approximating a real matrix A by a sparse random matrix X with respect to several norms. It provides the first results on approximation error in the ∞→1 and ∞→2 norms, and it uses a result of Latała to study approximation error in the spectral norm. These bounds hold for a reasonable family of random sparsification schemes, namely those which ensure that the entries of X are independent and average to the corresponding entries of A. Optimality of the ∞→1 and ∞→2 error estimates is established. Concentration results for the three norms hold when the entries of X are uniformly bounded. The spectral error bound is used to predict the performance of several sparsification and quantization schemes that have appeared in the literature; the results are competitive with the performance guarantees given by earlier scheme-specific analyses.
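
    One simple member of the family of schemes described above (independent entries whose means equal the entries of A) is uniform entrywise sampling with rescaling. The sketch below is an illustrative instance, not a scheme singled out by the paper; the names `sparsify` and `inf_to_one_norm` are assumptions, and the norm is computed by brute force over sign vectors, which is only feasible for very small matrices.

```python
import random
from itertools import product


def sparsify(a, p, rng):
    """Keep each entry independently with probability p, rescaled by 1/p.

    The entries of the result X are independent and E[X[i][j]] = a[i][j].
    """
    return [[(v / p if rng.random() < p else 0.0) for v in row] for row in a]


def inf_to_one_norm(m):
    """||M||_{inf->1} = max of ||M x||_1 over x with entries in {-1, +1}.

    The maximum of a convex function over the cube is attained at a vertex,
    so enumerating sign vectors is exact (and exponential in the column count).
    """
    n_cols = len(m[0])
    best = 0.0
    for signs in product((-1.0, 1.0), repeat=n_cols):
        total = sum(abs(sum(row[j] * signs[j] for j in range(n_cols))) for row in m)
        best = max(best, total)
    return best


def approximation_error(a, x):
    """Error ||A - X|| measured in the inf->1 norm."""
    diff = [[ai - xi for ai, xi in zip(ra, rx)] for ra, rx in zip(a, x)]
    return inf_to_one_norm(diff)
```

    A typical use is `approximation_error(a, sparsify(a, 0.5, random.Random(0)))`, repeated over several seeds to see how the error varies with the keep-probability p.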