    Improved Algorithms for Adaptive Compressed Sensing

    In the problem of adaptive compressed sensing, one wants to estimate an approximately k-sparse vector x in R^n from m linear measurements A_1 x, A_2 x,..., A_m x, where A_i can be chosen based on the outcomes A_1 x,..., A_{i-1} x of previous measurements. The goal is to output a vector x^ for which |x-x^|_p <= C * min_{k-sparse x'} |x-x'|_q with probability at least 2/3, where C > 0 is an approximation factor. Indyk, Price and Woodruff (FOCS'11) gave an algorithm for p=q=2 for C = 1+epsilon with O((k/epsilon) loglog (n/k)) measurements and O(log^*(k) loglog (n)) rounds of adaptivity. We first improve their bounds, obtaining a scheme with O(k * loglog (n/k) + (k/epsilon) * loglog(1/epsilon)) measurements and O(log^*(k) loglog (n)) rounds, as well as a scheme with O((k/epsilon) * loglog (n log (n/k))) measurements and an optimal O(loglog (n)) rounds. We then provide novel adaptive compressed sensing schemes with improved bounds for (p,p) for every 0 < p < 2. We show that the improvement from O(k log(n/k)) measurements to O(k log log (n/k)) measurements in the adaptive setting can persist with a better epsilon-dependence for other values of p and q. For example, when (p,q) = (1,1), we obtain O(k/sqrt{epsilon} * log log n log^3 (1/epsilon)) measurements. We obtain nearly matching lower bounds, showing our algorithms are close to optimal. Along the way, we also obtain the first nearly-optimal bounds for (p,p) schemes for every 0 < p < 2 even in the non-adaptive setting.

    Querying a Matrix Through Matrix-Vector Products

    We consider algorithms with access to an unknown matrix M in F^{n x d} via matrix-vector products, namely, the algorithm chooses vectors v^1, ..., v^q, and observes Mv^1, ..., Mv^q. Here the v^i can be randomized as well as chosen adaptively as a function of Mv^1, ..., Mv^{i-1}. Motivated by applications of sketching in distributed computation, linear algebra, and streaming models, as well as connections to areas such as communication complexity and property testing, we initiate the study of the number q of queries needed to solve various fundamental problems. We study problems in three broad categories: linear algebra problems, statistics problems, and graph problems. For example, we consider the number of queries required to approximate the rank, trace, maximum eigenvalue, and norms of a matrix M; to compute the AND/OR/Parity of each column or row of M; to decide whether there are identical columns or rows in M or whether M is symmetric, diagonal, or unitary; or to compute whether a graph defined by M is connected or triangle-free. We also show separations for algorithms that are allowed to obtain matrix-vector products only by querying vectors on the right, versus algorithms that can query vectors on both the left and the right. We also show separations depending on the underlying field the matrix-vector product occurs in. For graph problems, we show separations depending on the form of the matrix (bipartite adjacency versus signed edge-vertex incidence matrix) used to represent the graph. Surprisingly, this fundamental model does not appear to have been studied on its own, and we believe a thorough investigation of problems in this model would be beneficial to a number of different application areas.
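
    To make the query model concrete, the sketch below wraps a square integer matrix in an oracle that only answers right-multiplication queries Mv and counts them, then runs a randomized symmetry check based on the identity x^T (M y) = y^T (M x), which holds for all x, y exactly when M is symmetric. This is an illustration under stated assumptions, not necessarily the algorithm or query bound from the paper: the names MatVecOracle and looks_symmetric, the entry range [-10, 10], and the number of trials are all hypothetical choices.

```python
import random


class MatVecOracle:
    """Hides a matrix and exposes only right matrix-vector products M v."""

    def __init__(self, matrix):
        self._rows = [list(row) for row in matrix]
        self.queries = 0  # number of matrix-vector queries issued so far

    def query(self, v):
        """Return M v and count the query."""
        self.queries += 1
        return [sum(a * b for a, b in zip(row, v)) for row in self._rows]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def looks_symmetric(oracle, n, trials=5, seed=0):
    """Randomized check of M == M^T for an n x n integer matrix.

    Each trial draws random integer vectors x, y and compares
    x . (M y) with y . (M x). A mismatch proves M is not symmetric;
    agreement on every trial only makes symmetry likely (assumption:
    a handful of trials is enough for illustration purposes).
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.randint(-10, 10) for _ in range(n)]
        y = [rng.randint(-10, 10) for _ in range(n)]
        mx = oracle.query(x)
        my = oracle.query(y)
        if dot(x, my) != dot(y, mx):
            return False
    return True


if __name__ == "__main__":
    oracle = MatVecOracle([[2, 1, 0], [1, 3, 4], [0, 4, 5]])
    print(looks_symmetric(oracle, 3), oracle.queries)
```

    Each trial costs two queries, so the total number of queries is independent of the matrix dimension; the check has one-sided error, since a symmetric matrix is never rejected.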

    Space Optimal Vertex Cover in Dynamic Streams

    Asymptotically Optimal Bounds for Estimating H-Index in Sublinear Time with Applications to Subgraph Counting

    The h-index is a metric used to measure the impact of a user in a publication setting, such as a member of a social network with many highly liked posts or a researcher in an academic domain with many highly cited publications. Specifically, the h-index of a user is the largest integer h such that at least h publications of the user have at least h units of positive feedback. We design an algorithm that, given query access to the n publications of a user and each publication's corresponding positive feedback number, outputs a (1 ± ε)-approximation of the h-index of this user with probability at least 1 - δ in time O(n ln(1/δ) / (ε^2 h)), where h is the actual h-index, which is unknown to the algorithm a priori. We then design a novel lower bound technique that allows us to prove that this bound is in fact asymptotically optimal for this problem in all parameters n, h, ε, and δ. Our work is one of the first in sublinear time algorithms that addresses obtaining asymptotically optimal bounds, especially in terms of the error and confidence parameters. As such, we focus on designing novel techniques for this task. In particular, our lower bound technique seems quite general; to showcase this, we also use our approach to prove an asymptotically optimal lower bound for the problem of estimating the number of triangles in a graph in sublinear time, which now is also optimal in the error and confidence parameters. This result improves upon prior lower bounds of Eden, Levi, Ron, and Seshadhri (FOCS'15) for this problem, as well as multiple follow-ups that extended this lower bound to other subgraph counting problems. Comment: Full version of the paper accepted to APPROX 202
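
    The definition of the h-index above can be checked with a short exact computation. The sketch below is only a baseline that reads every feedback count and sorts them; it is not the sublinear-time estimator described in the abstract, and the function name h_index is a hypothetical choice.

```python
def h_index(feedback):
    """Return the largest h such that at least h entries are >= h.

    `feedback` is a list of non-negative integers, one per publication.
    """
    ranked = sorted(feedback, reverse=True)
    h = 0
    for rank, value in enumerate(ranked, start=1):
        if value >= rank:
            h = rank  # the top `rank` publications all have >= rank feedback
        else:
            break  # values only decrease from here, so no larger h exists
    return h


if __name__ == "__main__":
    print(h_index([3, 0, 6, 1, 5]))
```

    This baseline takes O(n log n) time because of the sort, which is what a sampling-based approach with running time depending on n/h aims to avoid when h is large.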