323 research outputs found

    Time lower bounds for nonadaptive turnstile streaming algorithms

    Full text link
    We say a turnstile streaming algorithm is "non-adaptive" if, during updates, the memory cells written and read depend only on the index being updated and random coins tossed at the beginning of the stream (and not on the memory contents of the algorithm). Memory cells read during queries may be decided upon adaptively. All known turnstile streaming algorithms in the literature are non-adaptive. We prove the first non-trivial update time lower bounds for both randomized and deterministic turnstile streaming algorithms, which hold when the algorithms are non-adaptive. While there has been abundant success in proving space lower bounds, there have been no non-trivial update time lower bounds in the turnstile model. Our lower bounds hold against classically studied problems such as heavy hitters, point query, entropy estimation, and moment estimation. In some cases of deterministic algorithms, our lower bounds nearly match known upper bounds

    Private Data Stream Analysis for Universal Symmetric Norm Estimation

    Get PDF
    We study how to release summary statistics on a data stream subject to the constraint of differential privacy. In particular, we focus on releasing the family of symmetric norms, which are invariant under sign-flips and coordinate-wise permutations on an input data stream and include L_p norms, k-support norms, top-k norms, and the box norm as special cases. Although it may be possible to design and analyze a separate mechanism for each symmetric norm, we propose a general parametrizable framework that differentially privately releases a number of sufficient statistics from which the approximation of all symmetric norms can be simultaneously computed. Our framework partitions the coordinates of the underlying frequency vector into different levels based on their magnitude and releases approximate frequencies for the "heavy" coordinates in important levels and releases approximate level sizes for the "light" coordinates in important levels. Surprisingly, our mechanism allows for the release of an arbitrary number of symmetric norm approximations without any overhead or additional loss in privacy. Moreover, our mechanism permits (1+?)-approximation to each of the symmetric norms and can be implemented using sublinear space in the streaming model for many regimes of the accuracy and privacy parameters

    Private Data Stream Analysis for Universal Symmetric Norm Estimation

    Full text link
    We study how to release summary statistics on a data stream subject to the constraint of differential privacy. In particular, we focus on releasing the family of symmetric norms, which are invariant under sign-flips and coordinate-wise permutations on an input data stream and include LpL_p norms, kk-support norms, top-kk norms, and the box norm as special cases. Although it may be possible to design and analyze a separate mechanism for each symmetric norm, we propose a general parametrizable framework that differentially privately releases a number of sufficient statistics from which the approximation of all symmetric norms can be simultaneously computed. Our framework partitions the coordinates of the underlying frequency vector into different levels based on their magnitude and releases approximate frequencies for the "heavy" coordinates in important levels and releases approximate level sizes for the "light" coordinates in important levels. Surprisingly, our mechanism allows for the release of an arbitrary number of symmetric norm approximations without any overhead or additional loss in privacy. Moreover, our mechanism permits (1+α)(1+\alpha)-approximation to each of the symmetric norms and can be implemented using sublinear space in the streaming model for many regimes of the accuracy and privacy parameters

    Separations for Estimating Large Frequency Moments on Data Streams

    Get PDF
    We study the classical problem of moment estimation of an underlying vector whose nn coordinates are implicitly defined through a series of updates in a data stream. We show that if the updates to the vector arrive in the random-order insertion-only model, then there exist space efficient algorithms with improved dependencies on the approximation parameter ε\varepsilon. In particular, for any real p>2p > 2, we first obtain an algorithm for FpF_p moment estimation using O~(1ε4/pn12/p)\tilde{\mathcal{O}}\left(\frac{1}{\varepsilon^{4/p}}\cdot n^{1-2/p}\right) bits of memory. Our techniques also give algorithms for FpF_p moment estimation with p>2p>2 on arbitrary order insertion-only and turnstile streams, using O~(1ε4/pn12/p)\tilde{\mathcal{O}}\left(\frac{1}{\varepsilon^{4/p}}\cdot n^{1-2/p}\right) bits of space and two passes, which is the first optimal multi-pass FpF_p estimation algorithm up to logn\log n factors. Finally, we give an improved lower bound of Ω(1ε2n12/p)\Omega\left(\frac{1}{\varepsilon^2}\cdot n^{1-2/p}\right) for one-pass insertion-only streams. Our results separate the complexity of this problem both between random and non-random orders, as well as one-pass and multi-pass streams.Comment: ICALP 202

    Streaming Algorithms with Large Approximation Factors

    Get PDF
    corecore