900 research outputs found

    An Improved Interactive Streaming Algorithm for the Distinct Elements Problem

    Full text link
    The exact computation of the number of distinct elements (frequency moment F0F_0) is a fundamental problem in the study of data streaming algorithms. We denote the length of the stream by nn where each symbol is drawn from a universe of size mm. While it is well known that the moments F0,F1,F2F_0,F_1,F_2 can be approximated by efficient streaming algorithms, it is easy to see that exact computation of F0,F2F_0,F_2 requires space Ω(m)\Omega(m). In previous work, Cormode et al. therefore considered a model where the data stream is also processed by a powerful helper, who provides an interactive proof of the result. They gave such protocols with a polylogarithmic number of rounds of communication between helper and verifier for all functions in NC. This number of rounds (O(log2m)  in the case of  F0)\left(O(\log^2 m) \;\text{in the case of} \;F_0 \right) can quickly make such protocols impractical. Cormode et al. also gave a protocol with logm+1\log m +1 rounds for the exact computation of F0F_0 where the space complexity is O(logmlogn+log2m)O\left(\log m \log n+\log^2 m\right) but the total communication O(nlogm(logn+logm))O\left(\sqrt{n}\log m\left(\log n+ \log m \right)\right). They managed to give logm\log m round protocols with polylog(m,n)\operatorname{polylog}(m,n) complexity for many other interesting problems including F2F_2, Inner product, and Range-sum, but computing F0F_0 exactly with polylogarithmic space and communication and O(logm)O(\log m) rounds remained open. In this work, we give a streaming interactive protocol with logm\log m rounds for exact computation of F0F_0 using O(logm(logn+logmloglogm))O\left(\log m \left(\,\log n + \log m \log\log m\,\right)\right) bits of space and the communication is O(logm(logn+log3m(loglogm)2))O\left( \log m \left(\,\log n +\log^3 m (\log\log m)^2 \,\right)\right). The update time of the verifier per symbol received is O(log2m)O(\log^2 m).Comment: Submitted to ICALP 201

    Semi-Streaming Algorithms for Annotated Graph Streams

    Get PDF
    Considerable effort has been devoted to the development of streaming algorithms for analyzing massive graphs. Unfortunately, many results have been negative, establishing that a wide variety of problems require Ω(n2)\Omega(n^2) space to solve. One of the few bright spots has been the development of semi-streaming algorithms for a handful of graph problems -- these algorithms use space O(npolylog(n))O(n\cdot\text{polylog}(n)). In the annotated data streaming model of Chakrabarti et al., a computationally limited client wants to compute some property of a massive input, but lacks the resources to store even a small fraction of the input, and hence cannot perform the desired computation locally. The client therefore accesses a powerful but untrusted service provider, who not only performs the requested computation, but also proves that the answer is correct. We put forth the notion of semi-streaming algorithms for annotated graph streams (semi-streaming annotation schemes for short). These are protocols in which both the client's space usage and the length of the proof are O(npolylog(n))O(n \cdot \text{polylog}(n)). We give evidence that semi-streaming annotation schemes represent a substantially more robust solution concept than does the standard semi-streaming model. On the positive side, we give semi-streaming annotation schemes for two dynamic graph problems that are intractable in the standard model: (exactly) counting triangles, and (exactly) computing maximum matchings. The former scheme answers a question of Cormode. On the negative side, we identify for the first time two natural graph problems (connectivity and bipartiteness in a certain edge update model) that can be solved in the standard semi-streaming model, but cannot be solved by annotation schemes of "sub-semi-streaming" cost. That is, these problems are just as hard in the annotations model as they are in the standard model.Comment: This update includes some additional discussion of the results proven. The result on counting triangles was previously included in an ECCC technical report by Chakrabarti et al. available at http://eccc.hpi-web.de/report/2013/180/. That report has been superseded by this manuscript, and the CCC 2015 paper "Verifiable Stream Computation and Arthur-Merlin Communication" by Chakrabarti et a

    Streaming Verification of Graph Computations via Graph Structure

    Get PDF
    We give new algorithms in the annotated data streaming setting - also known as verifiable data stream computation - for certain graph problems. This setting is meant to model outsourced computation, where a space-bounded verifier limited to sequential data access seeks to overcome its computational limitations by engaging a powerful prover, without needing to trust the prover. As is well established, several problems that admit no sublinear-space algorithms under traditional streaming do allow protocols using a sublinear amount of prover/verifier communication and sublinear-space verification. We give algorithms for many well-studied graph problems including triangle counting, its generalization to subgraph counting, maximum matching, problems about the existence (or not) of short paths, finding the shortest path between two vertices, and testing for an independent set. While some of these problems have been studied before, our results achieve new tradeoffs between space and communication costs that were hitherto unknown. In particular, two of our results disprove explicit conjectures of Thaler (ICALP, 2016) by giving triangle counting and maximum matching algorithms for n-vertex graphs, using o(n) space and o(n^2) communication

    Small Circuits Imply Efficient Arthur-Merlin Protocols

    Get PDF
    The inner product function ? x,y ? = ?_i x_i y_i mod 2 can be easily computed by a (linear-size) AC?(?) circuit: that is, a constant depth circuit with AND, OR and parity (XOR) gates. But what if we impose the restriction that the parity gates can only be on the bottom most layer (closest to the input)? Namely, can the inner product function be computed by an AC? circuit composed with a single layer of parity gates? This seemingly simple question is an important open question at the frontier of circuit lower bound research. In this work, we focus on a minimalistic version of the above question. Namely, whether the inner product function cannot be approximated by a small DNF augmented with a single layer of parity gates. Our main result shows that the existence of such a circuit would have unexpected implications for interactive proofs, or more specifically, for interactive variants of the Data Streaming and Communication Complexity models. In particular, we show that the existence of such a small (i.e., polynomial-size) circuit yields: 1) An O(d)-message protocol in the Arthur-Merlin Data Streaming model for every n-variate, degree d polynomial (over GF(2)), using only O?(d) ?log(n) communication and space complexity. In particular, this gives an AM[2] Data Streaming protocol for a variant of the well-studied triangle counting problem, with poly-logarithmic communication and space complexities. 2) A 2-message communication complexity protocol for any sparse (or low degree) polynomial, and for any function computable by an AC?(?) circuit. Specifically, for the latter, we obtain a protocol with communication complexity that is poly-logarithmic in the size of the AC?(?) circuit

    Annotations for Sparse Data Streams

    Full text link
    Motivated by cloud computing, a number of recent works have studied annotated data streams and variants thereof. In this setting, a computationally weak verifier (cloud user), lacking the resources to store and manipulate his massive input locally, accesses a powerful but untrusted prover (cloud service). The verifier must work within the restrictive data streaming paradigm. The prover, who can annotate the data stream as it is read, must not just supply the answer but also convince the verifier of its correctness. Ideally, both the amount of annotation and the space used by the verifier should be sublinear in the relevant input size parameters. A rich theory of such algorithms -- which we call schemes -- has emerged. Prior work has shown how to leverage the prover's power to efficiently solve problems that have no non-trivial standard data stream algorithms. However, while optimal schemes are now known for several basic problems, such optimality holds only for streams whose length is commensurate with the size of the data universe. In contrast, many real-world datasets are relatively sparse, including graphs that contain only O(n^2) edges, and IP traffic streams that contain much fewer than the total number of possible IP addresses, 2^128 in IPv6. We design the first schemes that allow both the annotation and the space usage to be sublinear in the total number of stream updates rather than the size of the data universe. We solve significant problems, including variations of INDEX, SET-DISJOINTNESS, and FREQUENCY-MOMENTS, plus several natural problems on graphs. On the other hand, we give a new lower bound that, for the first time, rules out smooth tradeoffs between annotation and space usage for a specific problem. Our technique brings out new nuances in Merlin-Arthur communication complexity models, and provides a separation between online versions of the MA and AMA models.Comment: 29 pages, 5 table

    An exponential separation between MA and AM proofs of proximity

    Get PDF
    Interactive proofs of proximity allow a sublinear-time verifier to check that a given input is close to the language, using a small amount of communication with a powerful (but untrusted) prover. In this work we consider two natural minimally interactive variants of such proofs systems, in which the prover only sends a single message, referred to as the proof. The first variant, known as MA-proofs of Proximity (MAP), is fully non-interactive, meaning that the proof is a function of the input only. The second variant, known as AM-proofs of Proximity (AMP), allows the proof to additionally depend on the verifier's (entire) random string. The complexity of both MAPs and AMPs is the total number of bits that the verifier observes - namely, the sum of the proof length and query complexity. Our main result is an exponential separation between the power of MAPs and AMPs. Specifically, we exhibit an explicit and natural property Pi that admits an AMP with complexity O(log n), whereas any MAP for Pi has complexity Omega~(n^{1/4}), where n denotes the length of the input in bits. Our MAP lower bound also yields an alternate proof, which is more general and arguably much simpler, for a recent result of Fischer et al. (ITCS, 2014). Lastly, we also consider the notion of oblivious proofs of proximity, in which the verifier's queries are oblivious to the proof. In this setting we show that AMPs can only be quadratically stronger than MAPs. As an application of this result, we show an exponential separation between the power of public and private coin for oblivious interactive proofs of proximity

    New Verification Schemes for Frequency-Based Functions on Data Streams

    Get PDF
    We study the general problem of computing frequency-based functions, i.e., the sum of any given function of data stream frequencies. Special cases include fundamental data stream problems such as computing the number of distinct elements (F0F_0), frequency moments (FkF_k), and heavy-hitters. It can also be applied to calculate the maximum frequency of an element (FF_{\infty}). Given that exact computation of most of these special cases provably do not admit any sublinear space algorithm, a natural approach is to consider them in an enhanced data streaming model, where we have a computationally unbounded but untrusted prover sending proofs or help messages to ease the computation. Think of a memory-restricted client delegating the computation to a stronger cloud service whom it doesn't want to trust blindly. Using its limited memory, it wants to verify the proof that the cloud sends. Chakrabarti et al.~(ICALP '09) introduced this setting as the "annotated data streaming model" and showed that multiple problems including exact computation of frequency-based functions---that have no sublinear algorithms in basic streaming---do have annotated streaming algorithms, also called "schemes", with both space and proof-length sublinear in the input size. We give a general scheme for computing any frequency-based function with both space usage and proof-size of O(n2/3logn)O(n^{2/3}\log n) bits, where nn is the size of the universe. This improves upon the best known bound of O(n2/3log4/3n)O(n^{2/3}\log^{4/3} n) given by the seminal paper of Chakrabarti et al.~and as a result, also improves upon the best known bounds for the important special cases of computing F0F_0 and FF_{\infty}. We emphasize that while being quantitatively better, our scheme is also qualitatively better in the sense that it is simpler than the previously best scheme that uses intricate data structures and elaborate subroutines.Comment: To appear in FSTTCS 202

    Quantum Space, Ground Space Traversal, and How to Embed Multi-Prover Interactive Proofs into Unentanglement

    Get PDF
    Savitch's theorem states that NPSPACE computations can be simulated in PSPACE. We initiate the study of a quantum analogue of NPSPACE, denoted Streaming-QCMASPACE (SQCMASPACE), where an exponentially long classical proof is streamed to a poly-space quantum verifier. Besides two main results, we also show that a quantum analogue of Savitch's theorem is unlikely to hold, as SQCMASPACE=NEXP. For completeness, we introduce Streaming-QMASPACE (SQMASPACE) with an exponentially long streamed quantum proof, and show SQMASPACE=QMA_EXP (quantum analogue of NEXP). Our first main result shows, in contrast to the classical setting, the solution space of a quantum constraint satisfaction problem (i.e. a local Hamiltonian) is always connected when exponentially long proofs are permitted. For this, we show how to simulate any Lipschitz continuous path on the unit hypersphere via a sequence of local unitary gates, at the expense of blowing up the circuit size. This shows quantum error-correcting codes can be unable to detect one codeword erroneously evolving to another if the evolution happens sufficiently slowly, and answers an open question of [Gharibian, Sikora, ICALP 2015] regarding the Ground State Connectivity problem. Our second main result is that any SQCMASPACE computation can be embedded into "unentanglement", i.e. into a quantum constraint satisfaction problem with unentangled provers. Formally, we show how to embed SQCMASPACE into the Sparse Separable Hamiltonian problem of [Chailloux, Sattath, CCC 2012] (QMA(2)-complete for 1/poly promise gap), at the expense of scaling the promise gap with the streamed proof size. As a corollary, we obtain the first systematic construction for obtaining QMA(2)-type upper bounds on arbitrary multi-prover interactive proof systems, where the QMA(2) promise gap scales exponentially with the number of bits of communication in the interactive proof.Comment: 60 pages, 4 figure
    corecore