Near-Quadratic Lower Bounds for Two-Pass Graph Streaming Algorithms
We prove that any two-pass graph streaming algorithm for the s-t
reachability problem in n-vertex directed graphs requires near-quadratic
space of n^{2-o(1)} bits. As a corollary, we also obtain near-quadratic space
lower bounds for several other fundamental problems including maximum bipartite
matching and (approximate) shortest path in undirected graphs.
Our results collectively imply that a wide range of graph problems admit
essentially no non-trivial streaming algorithm even when two passes over the
input are allowed. Prior to our work, such impossibility results were only known
for single-pass streaming algorithms, and the best two-pass lower bounds only
ruled out o(n^{7/6}) space algorithms, leaving open a large gap between
(trivial) upper bounds and lower bounds.
Finding heavy hitters from lossy or noisy data
Abstract. Motivated by Dvir et al. and Wigderson and Yehudayoff [3
Tight Bounds for Sketching the Operator Norm, Schatten Norms, and Subspace Embeddings
We consider the following oblivious sketching problem: given epsilon in (0,1/3) and n >= d/epsilon^2, design a distribution D over R^{k * nd} and a function f: R^k * R^{nd} -> R, so that for any n * d matrix A, Pr_{S sim D}[(1-epsilon) |A|_{op} <= f(S(A), S) <= (1+epsilon) |A|_{op}] >= 2/3, where |A|_{op} = sup_{x:|x|_2 = 1} |Ax|_2 is the operator norm of A and S(A) denotes S * A, interpreting A as a vector in R^{nd}. We show a tight lower bound of k = Omega(d^2/epsilon^2) for this problem. Previously, Nelson and Nguyen (ICALP, 2014) considered the problem of finding a distribution D over R^{k * n} such that for any n * d matrix A, Pr_{S sim D}[forall x, (1-epsilon)|Ax|_2 <= |SAx|_2 <= (1+epsilon)|Ax|_2] >= 2/3, which is called an oblivious subspace embedding (OSE). Our result considerably strengthens theirs, as it (1) applies only to estimating the operator norm, which can be estimated given any OSE, and (2) applies to distributions over general linear operators S which treat A as a vector and compute S(A), rather than the restricted class of linear operators corresponding to matrix multiplication. Our technique also implies the first tight bounds for approximating the Schatten p-norm for even integers p via general linear sketches, improving the previous lower bound from k = Omega(n^{2-6/p}) [Regev, 2014] to k = Omega(n^{2-4/p}). Importantly, for sketching the operator norm up to a factor of alpha, where alpha - 1 = Omega(1), we obtain a tight k = Omega(n^2/alpha^4) bound, matching the upper bound of Andoni and Nguyen (SODA, 2013), and improving the previous k = Omega(n^2/alpha^6) lower bound. Finally, we also obtain the first lower bounds for approximating Ky Fan norms.
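As a small illustration of the quantity being sketched, the operator norm |A|_{op} = sup_{x:|x|_2 = 1} |Ax|_2 can be approximated directly (without any sketching) by power iteration on A^T A. The snippet below is only a sketch under stated assumptions: the function name `operator_norm`, the iteration count, and the fixed-seed Gaussian starting vector are illustrative choices and do not come from the paper.

```python
import math
import random


def operator_norm(A, iterations=200, seed=0):
    """Approximate sup_{|x|_2 = 1} |Ax|_2 by power iteration on A^T A.

    A is a list of n rows, each a list of d floats (hypothetical input format).
    """
    n, d = len(A), len(A[0])
    rng = random.Random(seed)
    # Random start: almost surely not orthogonal to the top right singular vector.
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm_x = math.sqrt(sum(v * v for v in x))
    x = [v / norm_x for v in x]
    for _ in range(iterations):
        y = [sum(A[i][j] * x[j] for j in range(d)) for i in range(n)]  # y = A x
        z = [sum(A[i][j] * y[i] for i in range(n)) for j in range(d)]  # z = A^T y
        norm_z = math.sqrt(sum(v * v for v in z))
        if norm_z == 0.0:
            # A^T A x = 0; with a random start this is taken to mean A = 0.
            return 0.0
        x = [v / norm_z for v in z]
    y = [sum(A[i][j] * x[j] for j in range(d)) for i in range(n)]
    return math.sqrt(sum(v * v for v in y))
```

This reads all n * d entries of A in every iteration, which is exactly the cost a linear sketch S(A) with k much smaller than nd coordinates is meant to avoid; the lower bounds in the abstract say how small k can be.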
Asymptotically Optimal Bounds for Estimating H-Index in Sublinear Time with Applications to Subgraph Counting
The H-index is a metric used to measure the impact of a user in a
publication setting, such as a member of a social network with many highly
liked posts or a researcher in an academic domain with many highly cited
publications. Specifically, the H-index of a user is the largest integer h
such that at least h publications of the user have at least h units of
positive feedback.
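For reference, the definition above can be computed exactly by reading every publication once and sorting. This is a baseline sketch, not the sublinear-time algorithm of the paper; the function name `h_index` and the list-of-integers input are assumptions made for illustration.

```python
def h_index(feedback):
    """Return the largest h such that at least h entries of `feedback` are >= h."""
    ranked = sorted(feedback, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` publications all have >= rank feedback
        else:
            break  # counts only decrease from here, so no larger h exists
    return h
```

For example, `h_index([10, 8, 5, 4, 3])` is 4: four publications have at least 4 units of feedback, but there are not five with at least 5. This baseline takes O(n log n) time on n publications because it inspects all of them.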
We design an algorithm that, given query access to the n publications of a
user and each publication's corresponding positive feedback number, outputs a
(1 +/- epsilon)-approximation of the H-index of this user with
probability at least 1 - delta in time O(n * ln(1/delta) / (epsilon^2 * h)), where h is the actual H-index, which is unknown to the algorithm
a priori. We then design a novel lower bound technique that allows us to prove
that this bound is in fact asymptotically optimal for this problem in all
parameters n, h, epsilon, and delta.
Our work is one of the first in sublinear time algorithms that addresses
obtaining asymptotically optimal bounds, especially in terms of the error and
confidence parameters. As such, we focus on designing novel techniques for this
task. In particular, our lower bound technique seems quite general -- to
showcase this, we also use our approach to prove an asymptotically optimal
lower bound for the problem of estimating the number of triangles in a graph in
sublinear time, which now is also optimal in the error and confidence
parameters. This result improves upon prior lower bounds of Eden, Levi, Ron,
and Seshadhri (FOCS'15) for this problem, as well as multiple follow-ups that
extended this lower bound to other subgraph counting problems.
Comment: Full version of the paper accepted to APPROX 202
Tight Bounds for Graph Problems in Insertion Streams
Despite the large amount of work on solving graph problems in the data stream model, there do not exist tight space bounds for almost any of them, even in a stream with only edge insertions. For example, for testing connectivity, the upper bound is O(n * log(n)) bits, while the lower bound is only Omega(n) bits. We remedy this situation by providing the first tight Omega(n * log(n)) space lower bounds for randomized algorithms which succeed with constant probability in a stream of edge insertions for a number of graph problems. Our lower bounds apply to testing bipartiteness, connectivity, cycle-freeness, whether a graph is Eulerian, planarity, H-minor freeness, finding a minimum spanning tree of a connected graph, and testing if the diameter of a sparse graph is constant. We also give the first Omega(n * k * log(n)) space lower bounds for deterministic algorithms for k-edge connectivity and k-vertex connectivity; these are optimal in light of known deterministic upper bounds (for k-vertex connectivity we also need to allow edge duplications, which known upper bounds allow). Finally, we give an Omega(n * log^2(n)) lower bound for randomized algorithms approximating the minimum cut up to a constant factor with constant probability in a graph with integer weights between 1 and n, presented as a stream of insertions and deletions to its edges. This lower bound also holds for cut sparsifiers, and gives the first separation of maintaining a sparsifier in the data stream model versus the offline model
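To make the O(n * log(n))-bit upper bound for connectivity concrete, here is a minimal sketch of the standard union-find approach over a stream of edge insertions. It is a textbook technique, not the paper's contribution; the class name `StreamingConnectivity` and its method names are hypothetical. The only state kept is one parent pointer per vertex, each of which fits in about log(n) bits.

```python
class StreamingConnectivity:
    """Tracks connectivity of an n-vertex graph under edge insertions only."""

    def __init__(self, n):
        self.parent = list(range(n))  # n pointers of ~log(n) bits each
        self.components = n

    def _find(self, v):
        # Follow parent pointers to the root, halving the path as we go.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def insert_edge(self, u, v):
        ru, rv = self._find(u), self._find(v)
        if ru != rv:
            self.parent[ru] = rv  # merge the two components
            self.components -= 1

    def is_connected(self):
        return self.components == 1
```

A usage example: after `insert_edge(0, 1)` and `insert_edge(2, 3)` on 4 vertices, `is_connected()` is false; inserting `(1, 2)` makes it true. Edges that fall inside an existing component are discarded, so the memory never grows with the stream length.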