
    Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

    As massive graphs become more prevalent, there is a rapidly growing need for scalable algorithms that solve classical graph problems, such as maximum matching and minimum vertex cover, on large datasets. For massive inputs, several different computational models have been introduced, including the streaming model, the distributed communication model, and the massively parallel computation (MPC) model, a common abstraction of MapReduce-style computation. In each model, algorithms are analyzed in terms of resources such as space used or rounds of communication needed, in addition to the more traditional approximation ratio. In this paper, we give a single unified approach that yields better approximation algorithms for matching and vertex cover in all these models. The highlights include:
    * The first one-pass, significantly-better-than-2 approximation for matching in random-arrival streams that uses subquadratic space, namely a $(1.5+\epsilon)$-approximation streaming algorithm that uses $O(n^{1.5})$ space for constant $\epsilon > 0$.
    * The first 2-round, better-than-2 approximation for matching in the MPC model that uses subquadratic space per machine, namely a $(1.5+\epsilon)$-approximation algorithm with $O(\sqrt{mn} + n)$ memory per machine for constant $\epsilon > 0$.
    By building on our unified approach, we further develop parallel algorithms in the MPC model that give a $(1+\epsilon)$-approximation to matching and an $O(1)$-approximation to vertex cover in only $O(\log\log n)$ MPC rounds and $O(n/\mathrm{polylog}(n))$ memory per machine. These results settle multiple open questions posed in the recent paper of Czumaj et al. [STOC 2018].
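    The unifying object here is the edge-degree constrained subgraph (EDCS): a sparse subgraph $H$ of $G$ in which every edge of $H$ has bounded edge degree ($\deg_H(u)+\deg_H(v) \le \beta$) and every edge of $G$ missing from $H$ has high edge degree ($\deg_H(u)+\deg_H(v) \ge \beta^-$). As a rough illustration only, and not the paper's streaming/MPC construction, here is a minimal offline Python sketch that builds an EDCS by repeatedly fixing violated constraints; the parameter names and the simple-graph edge-list representation are assumptions for illustration.

        def construct_edcs(edges, beta, beta_minus):
            """Offline local-fixing construction of an EDCS (illustrative sketch).
            Assumes a simple graph given as a list of (u, v) tuples and
            beta_minus < beta (needed for the fixing process to terminate).
            On return, H satisfies:
              (P1) every edge (u, v) in H has deg_H(u) + deg_H(v) <= beta
              (P2) every edge (u, v) of G not in H has deg_H(u) + deg_H(v) >= beta_minus
            """
            H = set()
            deg = {}  # degree of each vertex within H

            def d(v):
                return deg.get(v, 0)

            changed = True
            while changed:  # fixing one edge can break another, so sweep until stable
                changed = False
                for u, v in edges:
                    if (u, v) in H and d(u) + d(v) > beta:
                        # (P1) violated: drop the overloaded edge
                        H.remove((u, v))
                        deg[u] -= 1
                        deg[v] -= 1
                        changed = True
                    elif (u, v) not in H and d(u) + d(v) < beta_minus:
                        # (P2) violated: add the underloaded edge
                        H.add((u, v))
                        deg[u] = d(u) + 1
                        deg[v] = d(v) + 1
                        changed = True
            return H

    The point of the definition is that $H$ has maximum degree at most $\beta$, hence is sparse, yet provably preserves a large matching of $G$; a matching computed on $H$ alone therefore approximates a matching of $G$. The paper's contribution lies in constructing and combining such subgraphs within the space and round budgets of each model.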

    Densest Subgraph in Dynamic Graph Streams

    In this paper, we consider the problem of approximating the densest subgraph in the dynamic graph stream model. In this model of computation, the input graph is defined by an arbitrary sequence of edge insertions and deletions, and the goal is to analyze properties of the resulting graph given memory that is sub-linear in the size of the stream. We present a single-pass algorithm that returns a $(1+\epsilon)$-approximation of the maximum density with high probability; the algorithm uses $O(\epsilon^{-2} n \,\mathrm{polylog}\, n)$ space, processes each stream update in $\mathrm{polylog}(n)$ time, and uses $\mathrm{poly}(n)$ post-processing time, where $n$ is the number of nodes. The space used by our algorithm matches the lower bound of Bahmani et al. (PVLDB 2012) up to a poly-logarithmic factor for constant $\epsilon$. The best existing results for this problem were established recently by Bhattacharya et al. (STOC 2015). They presented a $(2+\epsilon)$-approximation algorithm using similar space, and another algorithm that processed each update in $\mathrm{polylog}(n)$ time while maintaining a $(4+\epsilon)$-approximation of the current maximum density. Comment: To appear in MFCS 2015.
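    A helpful way to see why $O(\epsilon^{-2} n\,\mathrm{polylog}\,n)$ space can suffice: a uniform sample of edges preserves all subgraph densities up to rescaling by the sampling rate, and the densest subgraph of the sample can be found offline. The Python sketch below shows that sample-and-estimate idea in a simplified, offline form; maintaining the uniform sample under insertions and deletions (which the paper does with sketching) is elided, and the function names and `sample_size` parameter are illustrative assumptions. Note also that the paper's $(1+\epsilon)$ guarantee uses an exact densest-subgraph computation in post-processing, whereas Charikar's greedy peeling, swapped in here for brevity, is by itself only a 2-approximation.

        import random
        from collections import defaultdict

        def peel_densest(edges):
            """Charikar's greedy peeling: repeatedly delete a minimum-degree
            vertex and report the best density |E(S)| / |S| seen along the
            way (a 2-approximation to the maximum density)."""
            adj = defaultdict(set)
            for u, v in edges:
                adj[u].add(v)
                adj[v].add(u)
            deg = {v: len(ns) for v, ns in adj.items()}
            alive = set(adj)
            m = len(edges)
            best = m / max(len(alive), 1)
            while len(alive) > 1:
                v = min(alive, key=deg.get)   # cheapest vertex to remove
                m -= deg[v]                   # edges lost by removing v
                alive.remove(v)
                for u in adj[v]:
                    if u in alive:
                        deg[u] -= 1
                best = max(best, m / len(alive))
            return best

        def sampled_density(edges, sample_size, seed=0):
            """Sample-and-estimate sketch: peel a uniform edge sample and
            rescale by the sampling rate p. With a sample of size on the
            order of eps^-2 * n * polylog(n), the rescaled estimate
            concentrates around the true maximum density."""
            if not edges:
                return 0.0
            rng = random.Random(seed)
            sample = rng.sample(edges, min(sample_size, len(edges)))
            p = len(sample) / len(edges)
            return peel_densest(sample) / p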

    Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams

    In the communication problem $\mathbf{UR}$ (universal relation) [KRW95], Alice and Bob respectively receive $x, y \in \{0,1\}^n$ with the promise that $x \neq y$. The last player to receive a message must output an index $i$ such that $x_i \neq y_i$. We prove that the randomized one-way communication complexity of this problem in the public coin model is exactly $\Theta(\min\{n, \log(1/\delta)\log^2(\frac{n}{\log(1/\delta)})\})$ for failure probability $\delta$. Our lower bound holds even if we are promised $\mathrm{support}(y) \subset \mathrm{support}(x)$. As a corollary, we obtain optimal lower bounds for $\ell_p$-sampling in strict turnstile streams for $0 \le p < 2$, as well as for the problem of finding duplicates in a stream. Our lower bounds do not need to use large weights, and hold even if we are promised $x \in \{0,1\}^n$ at all points in the stream. We give two different proofs of our main result. The first proof demonstrates that any algorithm $\mathcal{A}$ solving sampling problems in turnstile streams in low memory can be used to encode subsets of $[n]$ of certain sizes into a number of bits below the information-theoretic minimum. Our encoder makes adaptive queries to $\mathcal{A}$ throughout its execution, but does so carefully so as not to violate correctness. This is accomplished by injecting random noise into the encoder's interactions with $\mathcal{A}$, which is loosely motivated by techniques in differential privacy. Our second proof is via a novel randomized reduction from Augmented Indexing [MNSW98], which needs to interact with $\mathcal{A}$ adaptively. To handle the adaptivity, we identify certain likely interaction patterns and union-bound over them to guarantee correct interaction on all of them. To guarantee correctness, it is important that the interaction hides some of its randomness from $\mathcal{A}$ in the reduction. Comment: merge of arXiv:1703.08139 and of work of Kapralov, Woodruff, and Yahyazadeh.
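    For intuition about what an $\ell_0$-sampler, the object these lower bounds constrain, actually does, here is a toy Python sketch built from 1-sparse recovery run at geometrically decreasing sampling rates. It is illustrative only: real constructions use limited-independence hash families and $s$-sparse recovery to get near-uniform samples in optimal space, and Python's built-in hash plus the class and parameter names below are stand-in assumptions.

        import random

        P = (1 << 61) - 1  # large prime modulus for fingerprinting

        class OneSparse:
            """1-sparse recovery for a turnstile stream: if the sketched
            vector has exactly one nonzero coordinate, recover it; a random
            fingerprint rejects non-1-sparse vectors with high probability."""
            def __init__(self, r):
                self.r = r  # random fingerprint base
                self.w = 0  # sum of values
                self.s = 0  # sum of index * value
                self.f = 0  # fingerprint: sum of value * r^index mod P

            def update(self, i, delta):
                self.w += delta
                self.s += i * delta
                self.f = (self.f + delta * pow(self.r, i, P)) % P

            def recover(self):
                if self.w == 0 or self.s % self.w != 0:
                    return None
                j = self.s // self.w
                if j < 0 or self.f != (self.w * pow(self.r, j, P)) % P:
                    return None
                return j  # support is exactly {j}, w.h.p.

        class L0Sampler:
            """Toy ell_0-sampler: index i survives to level l with
            probability about 2^-l, and each level keeps a 1-sparse
            recovery sketch; some level isolates a single surviving
            support element with constant probability."""
            def __init__(self, n, seed=0):
                rng = random.Random(seed)
                self.salt = rng.getrandbits(64)
                self.levels = n.bit_length() + 1
                self.sketches = [OneSparse(rng.randrange(2, P))
                                 for _ in range(self.levels)]

            def update(self, i, delta):
                h = hash((self.salt, i))  # stand-in for a k-wise independent hash
                for l in range(self.levels):
                    if h % (1 << l) == 0:  # kept at level l with prob ~2^-l
                        self.sketches[l].update(i, delta)

            def sample(self):
                # scan levels from densest to sparsest; return the first
                # clean recovery
                for sk in self.sketches:
                    j = sk.recover()
                    if j is not None:
                        return j
                return None  # constant failure probability; rerun a fresh copy

    Note that this toy version only returns *some* element of the support; making the sample (near-)uniform, and doing so in optimal space for $0 \le p < 2$, is exactly the regime the paper's lower bounds pin down.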