101 research outputs found

Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams

Author: Bakshi Ainesh
Chepurko Nadiia
Woodruff David P.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)
Publication date: 01/01/2020

We study the Maximum Independent Set problem for geometric objects given in the data stream model. A set of geometric objects is said to be independent if the objects are pairwise disjoint. We consider geometric objects in one and two dimensions, i.e., intervals and disks. Let $\alpha$ be the cardinality of the largest independent set. Our goal is to estimate $\alpha$ in a small amount of space, given that the input is received as a one-pass stream. We also consider a generalization of this problem by assigning a weight to each object and estimating $\beta$, the largest value of a weighted independent set. We initiate the study of this problem in the turnstile streaming model (insertions and deletions) and provide the first algorithms for estimating $\alpha$ and $\beta$. For unit-length intervals, we obtain a $(2+\epsilon)$-approximation to $\alpha$ and $\beta$ in $\mathrm{poly}\left(\frac{\log(n)}{\epsilon}\right)$ space. We also show a matching lower bound. Combined with the $3/2$-approximation for insertion-only streams by Cabello and Pérez-Lantero [CP15], our result implies a separation between the insertion-only and turnstile models. For unit-radius disks, we obtain a $\left(\frac{8\sqrt{3}}{\pi}\right)$-approximation to $\alpha$ and $\beta$ in $\mathrm{poly}(\log(n), \epsilon^{-1})$ space, which is closely related to the hexagonal circle packing constant. We provide algorithms for estimating $\alpha$ for arbitrary-length intervals under a bounded intersection assumption and study the parameterized space complexity of estimating $\alpha$ and $\beta$, where the parameter is the ratio of maximum to minimum interval length.

Comment: The lower bound for arbitrary-length intervals in the previous version contains a bug; we are updating the submission to reflect this.
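To make the quantity $\alpha$ concrete for intervals, here is a minimal offline sketch, not the streaming algorithm from the paper. It assumes the whole input fits in memory and uses the classical earliest-endpoint greedy rule, which is known to be optimal for unweighted intervals. The function name `max_independent_intervals` and the sample data are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval [left, right]


def max_independent_intervals(intervals: List[Interval]) -> List[Interval]:
    """Offline greedy: scan intervals by right endpoint, keep each one
    that is disjoint from the last interval kept."""
    chosen: List[Interval] = []
    last_right = float("-inf")
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        # Closed intervals are disjoint only if the new one starts
        # strictly after the previously chosen one ends.
        if left > last_right:
            chosen.append((left, right))
            last_right = right
    return chosen


# Hypothetical unit-length intervals, purely for illustration.
example = [(0.0, 1.0), (0.5, 1.5), (1.2, 2.2), (2.0, 3.0), (2.5, 3.5)]
alpha = len(max_independent_intervals(example))
print(alpha)
```

A streaming algorithm cannot afford the sort or the stored input, which is why the abstract settles for an approximation of $\alpha$ in polylogarithmic space.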

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams

Author: Kapralov Michael
Nelson Jelani
Pachocki Jakub
Wang Zhengyu
Woodruff David P.
Yahyazadeh Mobin
Publication date: 03/04/2017

In the communication problem $\mathbf{UR}$ (universal relation) [KRW95], Alice and Bob respectively receive $x, y \in\{0,1\}^n$ with the promise that $x\neq y$. The last player to receive a message must output an index $i$ such that $x_i\neq y_i$. We prove that the randomized one-way communication complexity of this problem in the public coin model is exactly $\Theta(\min\{n,\log(1/\delta)\log^2(\frac{n}{\log(1/\delta)})\})$ for failure probability $\delta$. Our lower bound holds even if promised $\mathop{\mathrm{support}}(y)\subset \mathop{\mathrm{support}}(x)$. As a corollary, we obtain optimal lower bounds for $\ell_p$-sampling in strict turnstile streams for $0\le p < 2$, as well as for the problem of finding duplicates in a stream. Our lower bounds do not need to use large weights, and hold even if promised $x\in\{0,1\}^n$ at all points in the stream. We give two different proofs of our main result. The first proof demonstrates that any algorithm $\mathcal A$ solving sampling problems in turnstile streams in low memory can be used to encode subsets of $[n]$ of certain sizes into a number of bits below the information theoretic minimum. Our encoder makes adaptive queries to $\mathcal A$ throughout its execution, but done carefully so as to not violate correctness. This is accomplished by injecting random noise into the encoder's interactions with $\mathcal A$, which is loosely motivated by techniques in differential privacy. Our second proof is via a novel randomized reduction from Augmented Indexing [MNSW98] which needs to interact with $\mathcal A$ adaptively. To handle the adaptivity we identify certain likely interaction patterns and union bound over them to guarantee correct interaction on all of them. To guarantee correctness, it is important that the interaction hides some of its randomness from $\mathcal A$ in the reduction.

Comment: merge of arXiv:1703.08139 and of work of Kapralov, Woodruff, and Yahyazadeh

arXiv.org e-Print Archive

Crossref

Streaming Algorithms Via Reductions

Author: Crouch Michael S
Publication venue: ScholarWorks@UMass Amherst
Publication date: 12/11/2014

In the streaming algorithms model of computation we must process data in order and without enough memory to remember the entire input. We study reductions between problems in the streaming model with an eye to using reductions as an algorithm design technique. Our contributions include:

* Linear Transformation reductions, which compose with existing linear sketch techniques. We use these for small-space algorithms for numeric measurements of distance-from-periodicity, finding the period of a numeric stream, and detecting cyclic shifts.
* The first streaming graph algorithms in the "sliding window" model, where we must consider only the most recent L elements for some fixed threshold L. We develop basic algorithms for connectivity and unweighted maximum matching, then develop a variety of other algorithms via reductions to these problems.
* A new reduction from maximum weighted matching to maximum unweighted matching. This reduction immediately yields improved approximation guarantees for maximum weighted matching in the semistreaming, sliding window, and MapReduce models, and extends to the more general problem of finding maximum independent sets in p-systems.
* Algorithms in a stream-of-samples model which exhibit clear sample vs. space tradeoffs. These algorithms are also inspired by examining reductions. We provide algorithms for calculating F_k frequency moments and graph connectivity.

ScholarWorks@UMass Amherst

Finding structure in data streams : correlations, independent sets, and matchings

Author: Dark Jacques

The streaming model supposes that, rather than being available all at once, the data is received in a piecemeal fashion. In a world of massive data sets, streaming algorithms give a complementary approach to distributed algorithms: the data is all available in one place but at different times, rather than at the same time in different places. We examine three different single-pass streaming problems where existing results show limited feasibility. We consider realistic relaxations or restrictions of these problems which allow for more efficient algorithms. In the correlation outliers problem, we wish to identify pairs of unusually correlated signals from a streamed matrix of observations. We show that a simple application of existing techniques is space-optimal but has slow query time when the outlier threshold is small. We demonstrate how we can achieve faster query times at the cost of storing a larger data summary. In the maximum independent set problem, we wish to find an edge-less induced subgraph of maximum size. For arbitrary graphs, given as a stream of edges, it is known that no space-efficient algorithm exists. We consider a variant streaming model, where the graph is received vertex by vertex. While we show this model still does not admit efficient algorithms for general graphs, we demonstrate efficient approximation algorithms for various special graph classes. In the maximum matching problem, we wish to find a disjoint subset of edges of largest possible size. The greedy algorithm gives us an easy 2-approximation for streams of edges, but the problem becomes infeasible to solve if we allow unlimited edge deletions. We consider a model where, instead, a limited number of deletions are allowed. We describe several new approximation algorithms with complexity parameterised by the number of deletions. We also present new techniques which may lead to the development of corresponding tight lower bounds.
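The greedy 2-approximation for matching mentioned in this abstract can be sketched in a few lines. This is the standard textbook greedy for insertion-only edge streams, not code from the thesis. The name `greedy_matching` and the sample stream are hypothetical.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_matching(edge_stream: Iterable[Edge]) -> List[Edge]:
    """One pass over an insertion-only edge stream: keep an edge
    whenever both of its endpoints are still unmatched."""
    matched: Set[Hashable] = set()
    matching: List[Edge] = []
    for u, v in edge_stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# Path a-b-c-d with the middle edge arriving first (illustrative data).
path_stream = [("b", "c"), ("a", "b"), ("c", "d")]
result = greedy_matching(path_stream)
print(result)
```

On this stream the greedy keeps only the middle edge, while the optimum matching uses the two outer edges, so the factor of 2 is attained. The result is always a maximal matching, and a maximal matching has at least half the size of a maximum one. The sketch stores only matched vertices and has no way to undo a choice, which is one way to see why deletions make the problem harder.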

Warwick Research Archives Portal Repository

Dynamic Approximate Maximum Independent Set of Intervals, Hypercubes and Hyperrectangles

Author: Henzinger Monika
Neumann Stefan
Wiese Andreas
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 36th International Symposium on Computational Geometry (SoCG 2020)
Publication date: 01/01/2020

Independent set is a fundamental problem in combinatorial optimization. While in general graphs the problem is essentially inapproximable, for many important graph classes there are approximation algorithms known in the offline setting. These graph classes include interval graphs and geometric intersection graphs, where vertices correspond to intervals/geometric objects and an edge indicates that the two corresponding objects intersect. We present dynamic approximation algorithms for independent set of intervals, hypercubes and hyperrectangles in $d$ dimensions. They work in the fully dynamic model where each update inserts or deletes a geometric object. All our algorithms are deterministic and have worst-case update times that are polylogarithmic for constant $d$ and $\epsilon>0$, assuming that the coordinates of all input objects are in $[0, N]^d$ and each of their edges has length at least 1. We obtain the following results:

- For weighted intervals, we maintain a $(1+\epsilon)$-approximate solution.
- For $d$-dimensional hypercubes we maintain a $(1+\epsilon)2^d$-approximate solution in the unweighted case and an $O(2^d)$-approximate solution in the weighted case. Also, we show that for maintaining an unweighted $(1+\epsilon)$-approximate solution one needs polynomial update time for $d \ge 2$ if the ETH holds.
- For weighted $d$-dimensional hyperrectangles we present a dynamic algorithm with approximation ratio $(1+\epsilon)\log^{d-1}N$.

arXiv.org e-Print Archive

VU Research Portal

Dagstuhl Research Online Publication Server

Algorithmic Techniques for Processing Data Streams

Author: Ikonomovska Elena
Zelke Mariano
Publication venue: Dagstuhl Follow-Ups. Data Exchange, Integration, and Streams
Publication date: 01/01/2013

We give a survey of some algorithmic techniques for processing data streams. After covering the basic methods of sampling and sketching, we present more advanced procedures that build on those basic ones. In particular, we examine algorithmic schemes for similarity mining, the concept of group testing, and techniques for clustering and summarizing data streams.
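As one concrete instance of a basic stream-sampling method, the sketch below implements classical reservoir sampling (Algorithm R), which keeps a uniform sample of k items from a stream of unknown length. The abstract does not say which sampling schemes the survey covers, so this is an illustrative assumption. The name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform random sample of k items using O(k) memory."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for count, item in enumerate(stream, start=1):
        if count <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number `count` replaces a random slot with
            # probability k / count.
            j = rng.randrange(count)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), k=5, rng=random.Random(0))
print(sample)
```

Passing a seeded `random.Random` makes runs reproducible. If the stream holds fewer than k items, the function returns all of them.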

Dagstuhl Research Online Publication Server

Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

Author: Assadi Sepehr
Bateni MohammadHossein
Bernstein Aaron
Mirrokni Vahab
Stein Cliff
Publication date: 27/12/2018

As massive graphs become more prevalent, there is a rapidly growing need for scalable algorithms that solve classical graph problems, such as maximum matching and minimum vertex cover, on large datasets. For massive inputs, several different computational models have been introduced, including the streaming model, the distributed communication model, and the massively parallel computation (MPC) model that is a common abstraction of MapReduce-style computation. In each model, algorithms are analyzed in terms of resources such as space used or rounds of communication needed, in addition to the more traditional approximation ratio. In this paper, we give a single unified approach that yields better approximation algorithms for matching and vertex cover in all these models. The highlights include:

* The first one pass, significantly-better-than-2-approximation for matching in random arrival streams that uses subquadratic space, namely a $(1.5+\epsilon)$-approximation streaming algorithm that uses $O(n^{1.5})$ space for constant $\epsilon > 0$.
* The first 2-round, better-than-2-approximation for matching in the MPC model that uses subquadratic space per machine, namely a $(1.5+\epsilon)$-approximation algorithm with $O(\sqrt{mn} + n)$ memory per machine for constant $\epsilon > 0$.

By building on our unified approach, we further develop parallel algorithms in the MPC model that give a $(1 + \epsilon)$-approximation to matching and an $O(1)$-approximation to vertex cover in only $O(\log\log{n})$ MPC rounds and $O(n/\mathrm{poly}\log{(n)})$ memory per machine. These results settle multiple open questions posed in the recent paper of Czumaj et al. [STOC 2018].

arXiv.org e-Print Archive

Crossref

Non-uniform Sampling in Clustering and Streaming

Author: Monemizadeh Morteza

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

On Constructing Spanners from Random Gaussian Projections

Author: Assadi Sepehr
Kapralov Michael
Yu Huacheng
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)
Publication date: 01/01/2023

Graph sketching is a powerful paradigm for analyzing graph structure via linear measurements introduced by Ahn, Guha, and McGregor (SODA'12) that has since found numerous applications in streaming, distributed computing, and massively parallel algorithms, among others. Graph sketching has proven to be quite successful for various problems such as connectivity, minimum spanning trees, edge or vertex connectivity, and cut or spectral sparsifiers. Yet, the problem of approximating the shortest-path metric of a graph, and specifically computing a spanner, is notably missing from the list of successes. This has turned the status of this fundamental problem into one of the most longstanding open questions in this area. We present a partial explanation of this lack of success by proving a strong lower bound for a large family of graph sketching algorithms that encompasses prior work on spanners and many (but importantly not all) related cut-based problems mentioned above. Our lower bound matches the algorithmic bounds of the recent result of Filtser, Kapralov, and Nouri (SODA'21), up to lower order terms, for constructing spanners via the same graph sketching family. This establishes near-optimality of these bounds, at least restricted to this family of graph sketching techniques, and makes progress on a conjecture posed in this latter work.

Dagstuhl Research Online Publication Server