Search CORE

6 research outputs found

A Detailed Analysis of the SpaceSaving $\pm$ Family of Algorithms with Bounded Deletions

Author: Abbadi Amr El
Agrawal Divyakant
de Rougemont Michel
Mathieu Claire
Metwally Ahmed
Zhao Fuheng
Publication date: 22/09/2023

In this paper, we present an advanced analysis of near optimal deterministic algorithms using a small space budget to solve the frequency estimation, heavy hitters, frequent items, and top-k approximation in the bounded deletion model. We define the family of SpaceSaving$\pm$ algorithms and explain why the original SpaceSaving$\pm$ algorithm only works when insertions and deletions are not interleaved. Next, we introduce the new DoubleSpaceSaving$\pm$ and the IntegratedSpaceSaving$\pm$ and prove their correctness. They show similar characteristics and both extend the popular space-efficient SpaceSaving algorithm. However, these two algorithms represent different trade-offs, in which DoubleSpaceSaving$\pm$ distributes the operations to two independent summaries while IntegratedSpaceSaving$\pm$ fully synchronizes deletions with insertions. Since data streams are often skewed, we present an improved analysis of these two algorithms and show that errors do not depend on the hot items and are only dependent on the cold and warm items. We also demonstrate how to achieve the relative error guarantee under mild assumptions. Moreover, we establish that the important mergeability property exists on these two algorithms which is desirable in distributed settings.
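
For orientation, the abstract builds on the classic insertion-only SpaceSaving summary. The following is a minimal Python sketch of that classic algorithm only, not of the DoubleSpaceSaving$\pm$ or IntegratedSpaceSaving$\pm$ variants the paper introduces. The class and method names (`SpaceSaving`, `insert`, `estimate`) and the choice to report 0 for unmonitored items are assumptions made for illustration.

```python
class SpaceSaving:
    """Classic insertion-only SpaceSaving summary with at most k counters.

    Illustrative sketch: names and interface are hypothetical, and
    deletions (the subject of the paper) are deliberately not handled.
    """

    def __init__(self, k):
        if k <= 0:
            raise ValueError("k must be positive")
        self.k = k
        self.counts = {}  # monitored item -> estimated count

    def insert(self, item):
        if item in self.counts:
            # Already monitored: just count the occurrence.
            self.counts[item] += 1
        elif len(self.counts) < self.k:
            # Free counter available: start monitoring the item.
            self.counts[item] = 1
        else:
            # Summary is full: evict an item with the minimum count and
            # let the new item inherit that count plus one.
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1

    def estimate(self, item):
        # Monitored items are never underestimated; unmonitored items
        # are reported as 0 in this sketch.
        return self.counts.get(item, 0)


if __name__ == "__main__":
    summary = SpaceSaving(k=2)
    for x in ["a"] * 5 + ["b"] * 3 + ["c", "d", "a"]:
        summary.insert(x)
    print(summary.counts)
```

In this sketch the counters always sum to the number of insertions, so the minimum counter, which bounds the overestimate of any monitored item, is at most n/k for a stream of n insertions. A linear scan for the minimum keeps the example short; a practical implementation would use a structure with faster minimum lookup.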

arXiv.org e-Print Archive

A Framework for Adversarially Robust Streaming Algorithms

Author: Ben-Eliezer O.
Błasiok J.
Clifford P.
Ganguly S.
Jayaram R.
Li Y.
McCurley K. S.
Naor M.
Publication date: 03/11/2021

We investigate the adversarial robustness of streaming algorithms. In this context, an algorithm is considered robust if its performance guarantees hold even if the stream is chosen adaptively by an adversary that observes the outputs of the algorithm along the stream and can react in an online manner. While deterministic streaming algorithms are inherently robust, many central problems in the streaming literature do not admit sublinear-space deterministic algorithms; on the other hand, classical space-efficient randomized algorithms for these problems are generally not adversarially robust. This raises the natural question of whether there exist efficient adversarially robust (randomized) streaming algorithms for these problems. In this work, we show that the answer is positive for various important streaming problems in the insertion-only model, including distinct elements and more generally $F_p$-estimation, $F_p$-heavy hitters, entropy estimation, and others. For all of these problems, we develop adversarially robust $(1+\varepsilon)$-approximation algorithms whose required space matches that of the best known non-robust algorithms up to a $\text{poly}(\log n, 1/\varepsilon)$ multiplicative factor (and in some cases even up to a constant factor). Towards this end, we develop several generic tools allowing one to efficiently transform a non-robust streaming algorithm into a robust one in various scenarios.

Comment: Conference version in PODS 2020. Version 3 addressing journal referees' comments; improved exposition of sketch switchin

arXiv.org e-Print Archive

Crossref

A Simple Proof of a New Set Disjointness with Applications to Data Streams

Author: Kamath Akshay
Price Eric
Woodruff David P.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 36th Computational Complexity Conference (CCC 2021)
Publication date: 01/01/2021

Dagstuhl Research Online Publication Server

Streaming Algorithms with Large Approximation Factors

Author: Li Yi
Lin Honghao
Woodruff David P.
Zhang Yuheng
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)
Publication date: 01/01/2022

Dagstuhl Research Online Publication Server

The White-Box Adversarial Data Stream Model

Author: Ajtai Miklos
Braverman Vladimir
Jayram T. S.
Silwal Sandeep
Sun Alec
Woodruff David P.
Zhou Samson
Publication date: 19/04/2022

We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the $L_1$-heavy hitters problem that outperforms the optimal deterministic Misra-Gries algorithm on long streams. If the white-box adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our $L_1$-heavy hitters algorithm even further and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design is for estimating the number of distinct elements in a stream with insertions and deletions achieving a multiplicative approximation and sublinear space; such an algorithm is impossible for deterministic algorithms. We also give a general technique that translates any two-player deterministic communication lower bound to a lower bound for randomized algorithms robust to a white-box adversary. In particular, our results show that for all $p\ge 0$, there exists a constant $C_p>1$ such that any $C_p$-approximation algorithm for $F_p$ moment estimation in insertion-only streams with a white-box adversary requires $\Omega(n)$ space for a universe of size $n$. Similarly, there is a constant $C>1$ such that any $C$-approximation algorithm in an insertion-only stream for matrix rank requires $\Omega(n)$ space with a white-box adversary. Our algorithmic results based on cryptography thus show a separation between computationally bounded and unbounded adversaries. (Abstract shortened to meet arXiv limits.)

Comment: PODS 202
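
The abstract uses the deterministic Misra-Gries algorithm as its baseline for $L_1$-heavy hitters. As a point of reference, here is a minimal Python sketch of that classical baseline, not of the randomized or cryptographic algorithms the paper proposes. The function name `misra_gries` and its signature are assumptions made for illustration.

```python
def misra_gries(stream, k):
    """Classical Misra-Gries summary using at most k - 1 counters.

    Illustrative sketch (hypothetical name and signature). For a stream
    of n items, each returned count undercounts the true frequency by
    at most n / k; items absent from the result are treated as count 0.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: count the occurrence.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A counter is free: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement every tracked item and drop
            # those that reach zero. The arriving item is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    print(misra_gries(["a", "a", "b", "a", "c", "a", "d"], k=3))
```

Because the update rule depends only on the stream seen so far and uses no randomness, an adversary who can see the internal state learns nothing it could not compute itself, which is why deterministic summaries like this one remain valid in the white-box model.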

arXiv.org e-Print Archive

DSpace@MIT