148 research outputs found

Near-Quadratic Lower Bounds for Two-Pass Graph Streaming Algorithms

Author: Assadi Sepehr
Raz Ran
Publication date: 02/09/2020

We prove that any two-pass graph streaming algorithm for the $s$-$t$ reachability problem in $n$-vertex directed graphs requires near-quadratic space of $n^{2-o(1)}$ bits. As a corollary, we also obtain near-quadratic space lower bounds for several other fundamental problems including maximum bipartite matching and (approximate) shortest path in undirected graphs. Our results collectively imply that a wide range of graph problems admit essentially no non-trivial streaming algorithm even when two passes over the input are allowed. Prior to our work, such impossibility results were only known for single-pass streaming algorithms, and the best two-pass lower bounds only ruled out $o(n^{7/6})$ space algorithms, leaving open a large gap between (trivial) upper bounds and lower bounds.
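For orientation, the "(trivial) upper bound" in this setting can be pictured as the store-everything baseline: keep every edge of the stream (roughly $n^2$ bits for a dense directed graph) and answer the query offline. The sketch below is only an illustration of that baseline, not anything taken from the paper; the function name `st_reachable_store_all` and the 0-indexed vertex convention are assumptions made for the example.

```python
from collections import deque


def st_reachable_store_all(n, edge_stream, s, t):
    """Single pass over a stream of directed edges (u, v), storing all of them.

    Hypothetical baseline: memory grows with the number of edges, which is
    the quadratic-space regime the lower bound says cannot be beaten by much.
    """
    adj = [[] for _ in range(n)]
    for u, v in edge_stream:  # the one and only pass over the input
        adj[u].append(v)

    # Standard breadth-first search from s on the stored graph.
    seen = [False] * n
    seen[s] = True
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                queue.append(v)
    return False
```

For example, on the stream `[(0, 1), (1, 2), (3, 2)]` with four vertices, vertex 2 is reachable from vertex 0 but vertex 3 is not, since the edge `(3, 2)` points the wrong way.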

arXiv.org e-Print Archive

Finding heavy hitters from lossy or noisy data

Author: Cody Murray
Lucia Batman
Ramamohan Paturi
Russell Impagliazzo
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2013

Abstract. Motivated by Dvir et al. and Wigderson and Yehudayoff [3

CiteSeerX

Tight Bounds for Sketching the Operator Norm, Schatten Norms, and Subspace Embeddings

Author: Li Yi
Woodruff David P.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)
Publication date: 01/01/2016

We consider the following oblivious sketching problem: given epsilon in (0,1/3) and n >= d/epsilon^2, design a distribution D over R^{k * nd} and a function f: R^k * R^{nd} -> R, so that for any n * d matrix A, Pr_{S sim D} [(1-epsilon) |A|_{op} <= f(S(A),S) <= (1+epsilon) |A|_{op}] >= 2/3, where |A|_{op} = sup_{x:|x|_2 = 1} |Ax|_2 is the operator norm of A and S(A) denotes S * A, interpreting A as a vector in R^{nd}. We show a tight lower bound of k = Omega(d^2/epsilon^2) for this problem. Previously, Nelson and Nguyen (ICALP, 2014) considered the problem of finding a distribution D over R^{k * n} such that for any n * d matrix A, Pr_{S sim D}[forall x, (1-epsilon)|Ax|_2 <= |SAx|_2 <= (1+epsilon)|Ax|_2] >= 2/3, which is called an oblivious subspace embedding (OSE). Our result considerably strengthens theirs, as it (1) applies only to estimating the operator norm, which can be estimated given any OSE, and (2) applies to distributions over general linear operators S which treat A as a vector and compute S(A), rather than the restricted class of linear operators corresponding to matrix multiplication. Our technique also implies the first tight bounds for approximating the Schatten p-norm for even integers p via general linear sketches, improving the previous lower bound from k = Omega(n^{2-6/p}) [Regev, 2014] to k = Omega(n^{2-4/p}). Importantly, for sketching the operator norm up to a factor of alpha, where alpha - 1 = Omega(1), we obtain a tight k = Omega(n^2/alpha^4) bound, matching the upper bound of Andoni and Nguyen (SODA, 2013), and improving the previous k = Omega(n^2/alpha^6) lower bound. Finally, we also obtain the first lower bounds for approximating Ky Fan norms.

Dagstuhl Research Online Publication Server

Asymptotically Optimal Bounds for Estimating H-Index in Sublinear Time with Applications to Subgraph Counting

Author: Assadi Sepehr
Nguyen Hoai-An
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)
Publication date: 01/01/2022

The $h$-index is a metric used to measure the impact of a user in a publication setting, such as a member of a social network with many highly liked posts or a researcher in an academic domain with many highly cited publications. Specifically, the $h$-index of a user is the largest integer $h$ such that at least $h$ publications of the user have at least $h$ units of positive feedback. We design an algorithm that, given query access to the $n$ publications of a user and each publication's corresponding positive feedback number, outputs a $(1\pm \varepsilon)$-approximation of the $h$-index of this user with probability at least $1-\delta$ in time $O(\frac{n \cdot \ln{(1/\delta)}}{\varepsilon^2 \cdot h})$, where $h$ is the actual $h$-index, which is unknown to the algorithm a priori. We then design a novel lower bound technique that allows us to prove that this bound is in fact asymptotically optimal for this problem in all parameters $n, h, \varepsilon$, and $\delta$. Our work is one of the first in sublinear time algorithms that addresses obtaining asymptotically optimal bounds, especially in terms of the error and confidence parameters. As such, we focus on designing novel techniques for this task. In particular, our lower bound technique seems quite general -- to showcase this, we also use our approach to prove an asymptotically optimal lower bound for the problem of estimating the number of triangles in a graph in sublinear time, which now is also optimal in the error and confidence parameters. This result improves upon prior lower bounds of Eden, Levi, Ron, and Seshadhri (FOCS'15) for this problem, as well as multiple follow-ups that extended this lower bound to other subgraph counting problems. Comment: Full version of the paper accepted to APPROX 202
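The definition of the $h$-index quoted in the abstract can be made concrete with a small exact computation. This is a minimal sketch that reads all $n$ feedback counts, so it is not the sublinear-time estimator the paper describes; the function name `h_index` and the list-of-integers input are assumptions chosen for illustration.

```python
def h_index(feedback):
    """Exact h-index: the largest h such that at least h items have >= h feedback.

    Illustrative only: sorts every count (O(n log n) time), unlike the
    sampling-based approximation discussed in the abstract.
    """
    counts = sorted(feedback, reverse=True)
    h = 0
    for rank, value in enumerate(counts, start=1):
        if value >= rank:
            h = rank  # the top `rank` items all have at least `rank` feedback
        else:
            break     # counts only decrease from here, so no larger h exists
    return h
```

With feedback counts `[10, 8, 5, 4, 3]`, four publications have at least four units of feedback but there are not five with at least five, so `h_index` returns 4.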

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Approximating CSPs with Outliers

Author: Ghoshal Suprovat
Louis Anand
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)
Publication date: 01/01/2022

Dagstuhl Research Online Publication Server

Asymptotically Optimal Bounds for Estimating H-Index in Sublinear Time with Applications to Subgraph Counting

Author: Assadi Sepehr
Nguyen Hoai-An
Publication date: 16/09/2022

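The $h$-index is a metric used to measure the impact of a user in a publication setting, such as a member of a social network with many highly liked posts or a researcher in an academic domain with many highly cited publications. Specifically, the $h$-index of a user is the largest integer $h$ such that at least $h$ publications of the user have at least $h$ units of positive feedback. We design an algorithm that, given query access to the $n$ publications of a user and each publication's corresponding positive feedback number, outputs a $(1\pm \varepsilon)$-approximation of the $h$-index of this user with probability at least $1-\delta$ in time $O(\frac{n \cdot \ln{(1/\delta)}}{\varepsilon^2 \cdot h})$, where $h$ is the actual $h$-index, which is unknown to the algorithm a priori. We then design a novel lower bound technique that allows us to prove that this bound is in fact asymptotically optimal for this problem in all parameters $n, h, \varepsilon$, and $\delta$.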

arXiv.org e-Print Archive

Tight Bounds for Graph Problems in Insertion Streams

Author: Sun Xiaoming
Woodruff David P.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015)
Publication date: 01/01/2015

Despite the large amount of work on solving graph problems in the data stream model, there do not exist tight space bounds for almost any of them, even in a stream with only edge insertions. For example, for testing connectivity, the upper bound is O(n * log(n)) bits, while the lower bound is only Omega(n) bits. We remedy this situation by providing the first tight Omega(n * log(n)) space lower bounds for randomized algorithms which succeed with constant probability in a stream of edge insertions for a number of graph problems. Our lower bounds apply to testing bipartiteness, connectivity, cycle-freeness, whether a graph is Eulerian, planarity, H-minor freeness, finding a minimum spanning tree of a connected graph, and testing if the diameter of a sparse graph is constant. We also give the first Omega(n * k * log(n)) space lower bounds for deterministic algorithms for k-edge connectivity and k-vertex connectivity; these are optimal in light of known deterministic upper bounds (for k-vertex connectivity we also need to allow edge duplications, which known upper bounds allow). Finally, we give an Omega(n * log^2(n)) lower bound for randomized algorithms approximating the minimum cut up to a constant factor with constant probability in a graph with integer weights between 1 and n, presented as a stream of insertions and deletions to its edges. This lower bound also holds for cut sparsifiers, and gives the first separation of maintaining a sparsifier in the data stream model versus the offline model.
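The O(n * log(n))-bit upper bound for connectivity quoted in the abstract is consistent with the standard union-find approach, which keeps one parent pointer of about log(n) bits per vertex while reading an insertion-only edge stream. The following is a sketch of that well-known technique under the assumption of 0-indexed vertices; it is not taken from the paper, and the name `is_connected_stream` is hypothetical.

```python
def is_connected_stream(n, edge_stream):
    """One pass over inserted edges (u, v); state is n parent pointers.

    Each pointer is an integer in [0, n), i.e. about log2(n) bits, so the
    working memory is on the order of n * log(n) bits.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v in edge_stream:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv   # merge the two components
            components -= 1
    return components == 1
```

On four vertices, the stream `[(0, 1), (2, 3)]` leaves two components, while appending `(1, 2)` connects the graph.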

Dagstuhl Research Online Publication Server