
    Tight Space-Approximation Tradeoff for the Multi-Pass Streaming Set Cover Problem

    We study the classic set cover problem in the streaming model: the sets that comprise the instance are revealed one by one in a stream, and the goal is to solve the problem by making one or few passes over the stream while maintaining $o(mn)$ space, sublinear in the input size; here $m$ denotes the number of sets and $n$ is the universe size. Notice that in this model, we are mainly concerned with the space requirement of the algorithms and hence do not restrict their computation time. Our main result is a resolution of the space-approximation tradeoff for the streaming set cover problem: we show that any $\alpha$-approximation algorithm for the set cover problem requires $\widetilde{\Omega}(mn^{1/\alpha})$ space, even if it is allowed $\text{polylog}(n)$ passes over the stream, and even if the sets are arriving in a random order in the stream. This space-approximation tradeoff matches the best known bounds achieved by the recent algorithm of Har-Peled et al. (PODS 2016) that requires only $O(\alpha)$ passes over the stream in an adversarial order, hence settling the space complexity of approximating the set cover problem in data streams in a quite robust manner. Additionally, our approach yields tight lower bounds for the space complexity of $(1-\epsilon)$-approximating the streaming maximum coverage problem studied in several recent works.
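
    To make the pass/space tradeoff concrete, here is a minimal sketch of a threshold-based multi-pass greedy for set cover, in the spirit of stream-friendly greedy variants; it is purely illustrative and is not the Har-Peled et al. algorithm. The name `read_stream` is an assumption: it re-reads the stream of `(set_id, elements)` pairs, one call per pass.

```python
def multipass_greedy_cover(read_stream, universe, beta=2.0):
    """Toy multi-pass set cover: pass i only accepts sets that newly
    cover >= n / beta^i elements, so O(log_beta n) passes suffice.
    Keeps only O(n) working state (the uncovered elements) plus the
    ids of chosen sets; assumes the instance is coverable."""
    uncovered = set(universe)
    cover, threshold = [], float(len(universe))
    while uncovered:
        for set_id, elems in read_stream():      # one pass over the stream
            gain = uncovered & elems
            if len(gain) >= threshold:           # large enough marginal gain
                cover.append(set_id)
                uncovered -= gain
        if threshold == 1.0 and uncovered:
            raise ValueError("instance is not coverable")
        threshold = max(1.0, threshold / beta)   # geometrically lower the bar
    return cover
```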

    Simple Round Compression for Parallel Vertex Cover

    Recently, Czumaj et al. (arXiv 2017) presented a parallel (almost) $2$-approximation algorithm for the maximum matching problem in only $O((\log\log{n})^2)$ rounds of the massively parallel computation (MPC) framework, when the memory per machine is $O(n)$. The main approach in their work is a way of compressing $O(\log{n})$ rounds of a distributed algorithm for maximum matching into only $O((\log\log{n})^2)$ MPC rounds. In this note, we present a similar algorithm for the closely related problem of approximating the minimum vertex cover in the MPC framework. We show that one can achieve an $O(\log{n})$ approximation to minimum vertex cover in only $O(\log\log{n})$ MPC rounds when the memory per machine is $O(n)$. Our algorithm for vertex cover is similar to the maximum matching algorithm of Czumaj et al. but avoids many of the intricacies in their approach and as a result admits a considerably simpler analysis (at the cost of a worse approximation guarantee). We obtain this result by modifying a previous parallel algorithm by Khanna and the author (SPAA 2017) for vertex cover that allowed for compressing $O(\log{n})$ rounds of a distributed algorithm into constant MPC rounds when the memory allowed per machine is $O(n\sqrt{n})$.
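
    For context, the classical sequential baseline behind matching-based vertex cover is the following $2$-approximation, which parallel algorithms like the ones above aim to emulate in few rounds; a minimal sketch of the textbook method, not code from the paper:

```python
def matching_based_vertex_cover(edges):
    """Take both endpoints of a greedily built maximal matching; the
    result is a vertex cover of size at most twice the optimum, since
    any cover must pick at least one endpoint per matched edge."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge joins the matching
            cover.update((u, v))               # take both of its endpoints
    return cover
```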

    Randomized Composable Coresets for Matching and Vertex Cover

    A common approach for designing scalable algorithms for massive data sets is to distribute the computation across, say, $k$ machines and process the data using limited communication between them. A particularly appealing framework here is the simultaneous communication model, whereby each machine constructs a small representative summary of its own data and one obtains an approximate/exact solution from the union of the representative summaries. If the representative summaries needed for a problem are small, then this results in a communication-efficient and round-optimal protocol. While many fundamental graph problems admit efficient solutions in this model, two prominent problems are notably absent from the list of successes, namely, the maximum matching problem and the minimum vertex cover problem. Indeed, it was shown recently that for both these problems, even achieving a $\text{polylog}(n)$ approximation requires essentially sending the entire input graph from each machine. The main insight of our work is that the intractability of matching and vertex cover in the simultaneous communication model is inherently connected to an adversarial partitioning of the underlying graph across machines. We show that when the underlying graph is randomly partitioned across machines, both these problems admit randomized composable coresets of size $\widetilde{O}(n)$ that yield an $\widetilde{O}(1)$-approximate solution. This results in an $\widetilde{O}(1)$-approximation simultaneous protocol for these problems with $\widetilde{O}(nk)$ total communication when the input is randomly partitioned across $k$ machines. We further prove the optimality of our results. Finally, by a standard application of composable coresets, our results also imply MapReduce algorithms with the same approximation guarantee in one or two rounds of communication.
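
    A minimal sketch of the simultaneous-protocol shape described above, under a random edge partition; the per-machine summaries here are greedy maximal matchings for brevity, a simplification of the paper's coreset constructions:

```python
import random

def greedy_matching(edges):
    """Greedy maximal matching: scan edges, keep those with both endpoints free."""
    matched, M = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            M.append((u, v))
            matched.update((u, v))
    return M

def simultaneous_matching(edges, k, seed=0):
    """Randomly partition the edges across k machines, have each machine
    send a matching of its local graph as its summary, and solve the
    problem on the union of the summaries."""
    rng = random.Random(seed)
    shares = [[] for _ in range(k)]
    for e in edges:
        shares[rng.randrange(k)].append(e)                 # random partitioning
    summaries = [greedy_matching(s) for s in shares]
    union = [e for summary in summaries for e in summary]  # O(nk) edges total
    return greedy_matching(union)
```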

    On Decidability of the Ordered Structures of Numbers

    The ordered structures of the natural, integer, rational, and real numbers are studied here. It is known that the theories of these numbers in the language of order are decidable and finitely axiomatizable. Also, their theories in the language of order and addition are decidable and infinitely axiomatizable. For the language of order and multiplication, it is known that the theories of $\mathbb{N}$ and $\mathbb{Z}$ are not decidable (and so not axiomatizable by any computably enumerable set of sentences). By Tarski's theorem, the multiplicative ordered structure of $\mathbb{R}$ is also decidable; here we prove this result directly and present an axiomatization. The structure of $\mathbb{Q}$ in the language of order and multiplication seems to be missing in the literature; here we show the decidability of its theory by the technique of quantifier elimination, and after presenting an infinite axiomatization for this structure, we prove that it is not finitely axiomatizable.
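
    As a one-line illustration of the quantifier-elimination technique invoked here: in a dense linear order without endpoints, such as the order structures of $\mathbb{Q}$ and $\mathbb{R}$, the existential quantifier below can be eliminated outright.

```latex
% density: some point lies strictly between a and b iff a < b
\exists x\,(a < x \wedge x < b) \;\Longleftrightarrow\; a < b
```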

    Polynomial Pass Lower Bounds for Graph Streaming Algorithms

    We present new lower bounds showing that a polynomial number of passes is necessary for solving some fundamental graph problems in the streaming model of computation. For instance, we show that any streaming algorithm that finds a weighted minimum $s$-$t$ cut in an $n$-vertex undirected graph requires $n^{2-o(1)}$ space unless it makes $n^{\Omega(1)}$ passes over the stream. To prove our lower bounds, we introduce and analyze a new four-player communication problem that we refer to as the hidden-pointer chasing problem. This is a problem in the spirit of the standard pointer chasing problem, with the key difference that the pointers in this problem are hidden to the players, and finding each one of them requires solving another communication problem, namely the set intersection problem. Our lower bounds for graph problems are then obtained by reductions from the hidden-pointer chasing problem. Our hidden-pointer chasing problem appears flexible enough to find other applications and is therefore interesting in its own right. To showcase this, we further present an interesting application of this problem beyond streaming algorithms. Using a reduction from hidden-pointer chasing, we prove that any algorithm for submodular function minimization needs to make $n^{2-o(1)}$ value queries to the function unless it has a polynomial degree of adaptivity.
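
    For intuition, a sketch of the standard pointer chasing problem that hidden-pointer chasing builds on: Alice holds $f$, Bob holds $g$, and following the chain forces the players to alternate, which is why natural protocols spend about one round per step. In the hidden variant, each application of $f$ or $g$ is additionally locked behind a set intersection instance.

```python
def chase(f, g, z0, k):
    """k-step pointer chasing: z1 = f(z0), z2 = g(z1), z3 = f(z2), ...
    f and g are dicts (or lists) mapping [n] -> [n]; only Alice knows f
    and only Bob knows g, so each step forces a message exchange."""
    z = z0
    for step in range(k):
        z = f[z] if step % 2 == 0 else g[z]  # alternate between the players
    return z
```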

    Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem

    We resolve the space complexity of single-pass streaming algorithms for approximating the classic set cover problem. For finding an $\alpha$-approximate set cover (for any $\alpha = o(\sqrt{n})$) using a single-pass streaming algorithm, we show that $\Theta(mn/\alpha)$ space is both sufficient and necessary (up to an $O(\log{n})$ factor); here $m$ denotes the number of sets and $n$ denotes the size of the universe. This provides a strong negative answer to the open question posed by Indyk et al. (2015) regarding the possibility of having a single-pass algorithm with a small approximation factor that uses sub-linear space. We further study the problem of estimating the size of a minimum set cover (as opposed to finding the actual sets), and establish that an additional factor of $\alpha$ saving in the space is achievable in this case and that this is the best possible. In other words, we show that $\Theta(mn/\alpha^2)$ space is both sufficient and necessary (up to logarithmic factors) for estimating the size of a minimum set cover to within a factor of $\alpha$. Our algorithm in fact works for the more general problem of estimating the optimal value of a covering integer program. On the other hand, our lower bound holds even for set cover instances where the sets are presented in a random order.

    The Stochastic Matching Problem: Beating Half with a Non-Adaptive Algorithm

    In the stochastic matching problem, we are given a general (not necessarily bipartite) graph $G(V,E)$, where each edge in $E$ is realized with some constant probability $p > 0$, and the goal is to compute a bounded-degree (bounded by a function depending only on $p$) subgraph $H$ of $G$ such that the expected maximum matching size in $H$ is close to the expected maximum matching size in $G$. The algorithms in this setting are considered non-adaptive as they have to choose the subgraph $H$ without knowing any information about the set of realized edges in $G$. Originally motivated by an application to kidney exchange, the stochastic matching problem and its variants have received significant attention in recent years. The state-of-the-art non-adaptive algorithms for stochastic matching achieve an approximation ratio of $\frac{1}{2}-\epsilon$ for any $\epsilon > 0$, naturally raising the question of whether $1/2$ is the limit of what can be achieved with a non-adaptive algorithm. In this work, we resolve this question by presenting the first algorithm for stochastic matching with an approximation guarantee that is strictly better than $1/2$: the algorithm computes a subgraph $H$ of $G$ with maximum degree $O(\frac{\log{(1/p)}}{p})$ such that the ratio of the expected size of a maximum matching in realizations of $H$ and $G$ is at least $1/2+\delta_0$ for some absolute constant $\delta_0 > 0$. The degree bound on $H$ achieved by our algorithm is essentially the best possible (up to an $O(\log{(1/p)})$ factor) for any constant factor approximation algorithm, since an $\Omega(\frac{1}{p})$ degree in $H$ is necessary for a vertex to acquire at least one incident edge in a realization.
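
    A common non-adaptive strategy in this line of work (behind the earlier $\frac{1}{2}-\epsilon$ guarantees; sketched here as a generic illustration rather than taken from the paper) is to let $H$ be a union of edge-disjoint matchings, which directly enforces the degree bound:

```python
import math

def nonadaptive_subgraph(edges, p):
    """Build H as a union of ~log(1/p)/p edge-disjoint matchings (greedy
    maximal matchings here for simplicity). Each vertex gains at most one
    edge per round, so H has maximum degree O(log(1/p)/p)."""
    rounds = max(1, math.ceil(math.log(1 / p) / p))
    remaining, H = list(edges), []
    for _ in range(rounds):
        matched, M = set(), []
        for u, v in remaining:                   # greedy maximal matching
            if u not in matched and v not in matched:
                M.append((u, v))
                matched.update((u, v))
        if not M:
            break
        H.extend(M)
        chosen = set(M)
        remaining = [e for e in remaining if e not in chosen]
    return H
```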

    Online Assignment of Heterogeneous Tasks in Crowdsourcing Markets

    We investigate the problem of heterogeneous task assignment in crowdsourcing markets from the point of view of the requester, who has a collection of tasks. Workers arrive online one by one, and each declares a set of feasible tasks they can solve and a desired payment for each feasible task. The requester must decide on the fly which task (if any) to assign to the worker, while assigning workers only to feasible tasks. The goal is to maximize the number of assigned tasks with a fixed overall budget. We provide an online algorithm for this problem and prove an upper bound on the competitive ratio of this algorithm against an arbitrary (possibly worst-case) sequence of workers who want small payments relative to the requester's total budget. We further show an almost matching lower bound on the competitive ratio of any algorithm in this setting. Finally, we propose a different algorithm that achieves an improved competitive ratio in the random permutation model, where the order of arrival of the workers is chosen uniformly at random. Apart from these strong theoretical guarantees, we carry out experiments on simulated data which demonstrate the practical applicability of our algorithms.
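
    As a minimal illustration of the online decision problem (a naive greedy baseline of our own, not the paper's algorithm): each arriving worker reports a price per feasible task, and the requester irrevocably assigns at most one task within the remaining budget.

```python
def greedy_assign(workers, budget):
    """workers is an iterable of {task: price} offers, one per arriving
    worker. Assign each worker the cheapest still-unassigned feasible
    task that fits the remaining budget (if any)."""
    assigned, spent = {}, 0.0
    for worker_id, offers in enumerate(workers):
        options = [(price, task) for task, price in offers.items()
                   if task not in assigned and spent + price <= budget]
        if options:
            price, task = min(options)   # cheapest feasible open task
            assigned[task] = worker_id
            spent += price
    return assigned
```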

    Distributed and Streaming Linear Programming in Low Dimensions

    We study linear programming and general LP-type problems in several big data (streaming and distributed) models. We mainly focus on low-dimensional problems in which the number of constraints is much larger than the number of variables. Low-dimensional LP-type problems appear frequently in various machine learning tasks such as robust regression, support vector machines, and core vector machines. As supporting large-scale machine learning queries in database systems has become an important direction for database research, obtaining efficient algorithms for low-dimensional LP-type problems on massive datasets is of great value. In this paper we give both upper and lower bounds for LP-type problems in distributed and streaming models. Our bounds are almost tight when the dimensionality of the problem is a fixed constant. (To appear in PODS'19.)
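
    To make the "few variables, many constraints" regime concrete, here is a sketch of one such LP-type problem, $\ell_\infty$ (Chebyshev) regression, a relative of the robust regression tasks mentioned above: $d+1$ variables, $2n$ constraints, written as an ordinary LP via scipy. The example is ours, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def chebyshev_regression(X, y):
    """Fit beta minimizing max_i |y_i - X[i] @ beta|: an LP with d+1
    variables (beta, t) and 2n constraints, the low-dimensional regime
    when n >> d. Constraints: X beta - t <= y and -X beta - t <= -y."""
    n, d = X.shape
    c = np.zeros(d + 1)
    c[-1] = 1.0                                  # minimize the residual bound t
    A = np.block([[X, -np.ones((n, 1))],
                  [-X, -np.ones((n, 1))]])
    b = np.concatenate([y, -y])
    res = linprog(c, A_ub=A, b_ub=b,
                  bounds=[(None, None)] * d + [(0, None)])
    return res.x[:d], res.x[-1]                  # coefficients, optimal max-residual
```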

    Stochastic Submodular Cover with Limited Adaptivity

    In the submodular cover problem, we are given a non-negative monotone submodular function $f$ over a ground set $E$ of items, and the goal is to choose a smallest subset $S \subseteq E$ such that $f(S) = Q$ where $Q = f(E)$. In the stochastic version of the problem, we are given $m$ stochastic items, which are different random variables that independently realize to some item in $E$, and the goal is to find a smallest set of stochastic items whose realization $R$ satisfies $f(R) = Q$. The problem captures as a special case the stochastic set cover problem and, more generally, stochastic covering integer programs. We define an $r$-round adaptive algorithm to be an algorithm that chooses a permutation of all available items in each round $k \in [r]$, and a threshold $\tau_k$, and realizes items in the order specified by the permutation until the function value is at least $\tau_k$. The permutation for each round $k$ is chosen adaptively based on the realizations in the previous rounds, but the ordering inside each round remains fixed regardless of the realizations seen inside the round. Our main result is that for any integer $r$, there exists a poly-time $r$-round adaptive algorithm for stochastic submodular cover whose expected cost is $\tilde{O}(Q^{1/r})$ times the expected cost of a fully adaptive algorithm. Prior to our work, such a result was not known even for the case of $r = 1$ and when $f$ is the coverage function. On the other hand, we show that for any $r$, there exist instances of the stochastic submodular cover problem where no $r$-round adaptive algorithm can achieve better than an $\Omega(Q^{1/r})$ approximation to the expected cost of a fully adaptive algorithm. Our lower bound result holds even for the coverage function and for algorithms with unbounded computational power.
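
    A skeleton of the $r$-round structure as defined above, with illustrative callback names (not from the paper): `plan_round` adaptively picks the round's permutation and threshold from earlier realizations, and `realize` draws the realization of a stochastic item.

```python
def r_round_adaptive_cover(f, stochastic_items, r, plan_round, realize):
    """Run r rounds; within a round the order is fixed in advance, and
    items are realized until the function value reaches the round's
    threshold tau_k. Adaptivity happens only between rounds."""
    realized, remaining = [], set(stochastic_items)
    for k in range(r):
        perm, tau_k = plan_round(k, list(realized), set(remaining))
        for item in perm:                   # fixed order inside round k
            if f(realized) >= tau_k:        # threshold met: end the round
                break
            realized.append(realize(item))  # observe this item's realization
            remaining.discard(item)
    return realized
```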