    Efficient and Error-Correcting Data Structures for Membership and Polynomial Evaluation

    We construct efficient data structures that are resilient against a constant fraction of adversarial noise. Our model requires that the decoder answers most queries correctly with high probability and, for the remaining queries, the decoder with high probability either answers correctly or declares "don't know." Furthermore, if there is no noise on the data structure, it answers all queries correctly with high probability. Our model is the common generalization of a model proposed recently by de Wolf and the notion of "relaxed locally decodable codes" developed in the PCP literature. We measure the efficiency of a data structure in terms of its length, measured by the number of bits in its representation, and its query-answering time, measured by the number of bit-probes to the (possibly corrupted) representation. In this work, we study two data structure problems: membership and polynomial evaluation. We show that these two problems have constructions that are simultaneously efficient and error-correcting. Comment: An abridged version of this paper appears in STACS 2010
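    To fix intuition for this query model (a toy sketch of our own, not the paper's construction): encode each bit as several copies, answer a query by majority vote over the copies of the queried bit, and declare "don't know" on a near-tie. Repetition survives scattered noise but is far from length-efficient, and an adversary can concentrate its errors on a single block; closing that gap is what the paper's constructions are about.

```python
from collections import Counter

R = 9  # copies per bit (illustrative)

def encode(bits):
    """Toy encoding: repeat every bit R times. The length blows up by a
    factor of R, which is exactly the inefficiency the paper avoids."""
    return [b for b in bits for _ in range(R)]

def query(word, i):
    """Probe the R copies of bit i and majority-vote, declaring
    "don't know" on a near-tie. This survives scattered noise, but an
    adversary spending its whole error budget on one block defeats
    repetition, which is why LDC-style constructions are needed."""
    bit, count = Counter(word[R * i : R * (i + 1)]).most_common(1)[0]
    return bit if count >= (2 * R) // 3 else "don't know"

w = encode([1, 0, 1])
w[R + 1] ^= 1          # flip one stored copy of bit 1
print(query(w, 1))     # -> 0 despite the corruption
```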

    Error-Correcting Data Structures

    We study data structures in the presence of adversarial noise. We want to encode a given object in a succinct data structure that enables us to efficiently answer specific queries about the object, even if the data structure has been corrupted by a constant fraction of errors. This new model is the common generalization of (static) data structures and locally decodable error-correcting codes. The main issue is the tradeoff between the space used by the data structure and the time (number of probes) needed to answer a query about the encoded object. We prove a number of upper and lower bounds on various natural error-correcting data structure problems. In particular, we show that the optimal length of error-correcting data structures for the Membership problem (where we want to store subsets of size s from a universe of size n) is closely related to the optimal length of locally decodable codes for s-bit strings. Comment: 15 pages, LaTeX; an abridged version will appear in the Proceedings of the STACS 2009 conference
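    The locally-decodable-code connection can be made concrete with the classic Hadamard code, a 2-probe LDC for s-bit strings (an illustrative sketch, not the paper's membership structure): position z of the codeword stores the parity <x, z>, and bit x_i is recovered by XORing two probes, at a random z and at z xor e_i.

```python
import random

def hadamard_encode(x_bits):
    """Hadamard code for an s-bit string x: position z of the codeword
    stores the parity <x, z> mod 2, for every mask z in {0,1}^s.
    Exponential length 2^s, but only 2 probes are needed per bit."""
    x = int("".join(map(str, x_bits)), 2)
    return [bin(x & z).count("1") % 2 for z in range(1 << len(x_bits))]

def decode_bit(word, s, i, rng):
    """Recover x_i from 2 probes: <x, z> xor <x, z xor e_i> = <x, e_i> = x_i.
    Each probe position is individually uniform, so the answer is correct
    with high probability when only a small constant fraction of the
    2^s positions is corrupted."""
    e_i = 1 << (s - 1 - i)        # mask selecting bit i (MSB-first)
    z = rng.randrange(1 << s)
    return word[z] ^ word[z ^ e_i]

rng = random.Random(1)
x = [1, 0, 1, 1]
w = hadamard_encode(x)
w[rng.randrange(len(w))] ^= 1     # corrupt one of the 16 positions
print([decode_bit(w, 4, i, rng) for i in range(4)])  # usually [1, 0, 1, 1]
```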

    The Cell Probe Complexity of Succinct Data Structures

    In the cell probe model with word size 1 (the bit probe model), a static data structure problem is given by a map $f: \{0,1\}^n \times \{0,1\}^m \rightarrow \{0,1\}$, where $\{0,1\}^n$ is the set of possible data to be stored, $\{0,1\}^m$ is the set of possible queries (for natural problems, we have $m \ll n$), and $f(x,y)$ is the answer to question $y$ about data $x$. A solution is given by a representation $\phi: \{0,1\}^n \rightarrow \{0,1\}^s$ and a query algorithm $q$ so that $q(\phi(x), y) = f(x,y)$. The time $t$ of the query algorithm is the number of bits it reads in $\phi(x)$. In this paper, we consider the case of \emph{succinct} representations, where $s = n + r$ for some \emph{redundancy} $r \ll n$. For a Boolean version of the problem of polynomial evaluation with preprocessing of coefficients, we show a lower bound on the redundancy-query time tradeoff of the form $(r+1)\, t \geq \Omega(n/\log n)$. In particular, for very small redundancies $r$, we get an almost optimal lower bound stating that the query algorithm has to inspect almost the entire data structure (up to a logarithmic factor). We show similar lower bounds for problems satisfying a certain combinatorial property of a coding-theoretic flavor. Previously, no $\omega(m)$ lower bounds were known on $t$ in the general model for explicit functions, even for very small redundancies. By restricting our attention to \emph{systematic} or \emph{index} structures $\phi$ satisfying $\phi(x) = x \cdot \phi^*(x)$ for some map $\phi^*$ (where $\cdot$ denotes concatenation), we show similar lower bounds on the redundancy-query time tradeoff for the natural data structuring problems of Prefix Sum and Substring Search
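    A worked reading of the two extreme regimes of this tradeoff (our paraphrase, constants suppressed):

```latex
% Two extreme regimes of the tradeoff (r+1) t >= Omega(n / log n):
\begin{align*}
  r = O(1)
    &\;\Longrightarrow\; t \ge \Omega\!\left(\frac{n}{\log n}\right)
    && \text{(tiny redundancy: probe almost the whole structure)} \\
  t = \operatorname{polylog}(n)
    &\;\Longrightarrow\; r \ge \Omega\!\left(\frac{n}{\operatorname{polylog}(n)}\right)
    && \text{(fast queries: near-linear redundancy)}
\end{align*}
```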

    Passive network tomography for erroneous networks: A network coding approach

    Passive network tomography uses end-to-end observations of network communication to characterize the network, for instance to estimate the network topology and to localize random or adversarial glitches. Under the setting of linear network coding, this work provides a comprehensive study of passive network tomography in the presence of network glitches (random or adversarial). To be concrete, this work is developed along two directions: 1. Tomographic upper and lower bounds (i.e., the most adverse conditions in each problem setting under which network tomography is possible, together with schemes, computationally efficient where possible, that achieve this performance) are presented for random linear network coding (RLNC). We consider RLNC designed with common randomness, i.e., the receiver knows the random code-books of all nodes. (To justify this, we show an upper bound for the problem of topology estimation in networks using RLNC without common randomness.) In this setting we present the first set of algorithms that characterize the network topology exactly. Our algorithm for topology estimation with random network errors has time complexity that is polynomial in network parameters. For the problem of network error localization given the topology information, we present the first computationally tractable algorithm to localize random errors, and we prove that it is computationally intractable to localize adversarial errors. 2. New network coding schemes are designed that improve the tomographic performance of RLNC while maintaining the desirable low-complexity, throughput-optimal, distributed linear network coding properties of RLNC. In particular, we design network codes based on Reed-Solomon codes so that a maximal number of adversarial errors can be localized in a computationally efficient manner, even without knowledge of the network topology. Comment: 40 pages, under submission for IEEE Trans. on Information Theory
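    As a rough illustration of the RLNC setting the paper builds on (a minimal sketch under our own simplifications, not the paper's tomography algorithms): each node forwards random linear combinations of the coded packets it holds, coefficient headers track how every packet expresses the original sources, and the receiver decodes by Gaussian elimination over the field. The prime field and all identifiers below are illustrative choices.

```python
import random

P = 2_147_483_647  # a prime; packet symbols live in GF(P) (illustrative choice)

def combine(packets, rng):
    """Emit one random linear combination of coded packets.

    Each packet is (coeffs, payload): `coeffs` records how the packet is
    expressed in terms of the k original source packets, so headers stay
    consistent as packets are re-mixed at intermediate nodes."""
    k, m = len(packets[0][0]), len(packets[0][1])
    out_c, out_p = [0] * k, [0] * m
    for c, p in packets:
        w = rng.randrange(1, P)  # random mixing weight for this packet
        out_c = [(a + w * b) % P for a, b in zip(out_c, c)]
        out_p = [(a + w * b) % P for a, b in zip(out_p, p)]
    return out_c, out_p

def decode(received, k):
    """Recover the k source payloads from >= k coded packets by Gaussian
    elimination on the coefficient headers over GF(P)."""
    rows = [list(c) + list(p) for c, p in received]
    for col in range(k):
        piv = next(r for r in range(col, len(rows)) if rows[r][col])
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # modular inverse via Fermat
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
    return [row[k:] for row in rows[:k]]

rng = random.Random(0)
sources = [([1, 0, 0], [10, 20]), ([0, 1, 0], [30, 40]), ([0, 0, 1], [50, 60])]
coded = [combine(sources, rng) for _ in range(3)]  # three independent mixtures
print(decode(coded, 3))  # -> [[10, 20], [30, 40], [50, 60]] w.h.p. over the weights
```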

    Distributed PCP Theorems for Hardness of Approximation in P

    We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0,1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \dots, x_{n/2}$, Bob knows $x_{n/2+1}, \dots, x_n$, and both parties know $\varphi$. The goal is to have Alice and Bob jointly write a PCP that $x$ satisfies $\varphi$, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of $x$. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over $\{0,1\}$-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly tight. In particular, for the first two problems we obtain nearly-polynomial factors of $2^{(\log n)^{1-o(1)}}$; only $(1+o(1))$-factor lower bounds (under SETH) were known before
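    For concreteness, here is the quadratic-time exact baseline that the SETH-based result says cannot be meaningfully beaten, even by approximation algorithms (a minimal sketch; the function name is ours):

```python
def bichromatic_max_ip(A, B):
    """Exact Bichromatic Maximum Inner Product over {0,1}-vectors by
    scanning all |A| * |B| pairs. Under SETH, no truly-subquadratic
    algorithm achieves even a 2^{(log n)^{1-o(1)}} approximation."""
    return max(sum(a & b for a, b in zip(u, v)) for u in A for v in B)

A = [(1, 0, 1, 1), (0, 1, 0, 0)]
B = [(1, 1, 1, 0), (0, 0, 1, 1)]
print(bichromatic_max_ip(A, B))  # -> 2, attained by A[0] with either B vector
```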

    The streaming $k$-mismatch problem

    We consider the streaming complexity of a fundamental task in approximate pattern matching: the $k$-mismatch problem. It asks to compute Hamming distances between a pattern of length $n$ and all length-$n$ substrings of a text for which the Hamming distance does not exceed a given threshold $k$. In our problem formulation, we report not only the Hamming distance but also, on demand, the full \emph{mismatch information}, that is, the list of mismatched pairs of symbols and their indices. The twin challenges of streaming pattern matching derive from the need both to achieve small working space and to guarantee that every arriving input symbol is processed quickly. We present a streaming algorithm for the $k$-mismatch problem which uses $O(k\log n\log\frac{n}{k})$ bits of space and spends \ourcomplexity time on each symbol of the input stream, which consists of the pattern followed by the text. The running time almost matches the classic offline solution, and the space usage is within a logarithmic factor of optimal. Our new algorithm therefore effectively resolves, and also extends, an open problem first posed in FOCS'09. En route to this solution, we also give a deterministic $O(k(\log\frac{n}{k} + \log|\Sigma|))$-bit encoding of all the alignments with Hamming distance at most $k$ of a length-$n$ pattern within a text of length $O(n)$. This secondary result provides an optimal solution to a natural communication complexity problem which may be of independent interest. Comment: 27 pages
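    As an offline reference point for the problem statement (a naive sketch of our own, not the streaming algorithm): for every alignment, compare the pattern against the text window and report the mismatch information whenever at most k positions differ.

```python
def k_mismatch(pattern, text, k):
    """Naive offline k-mismatch: for each alignment of `pattern` in `text`,
    report (position, mismatch information) whenever the Hamming distance
    is at most k. Takes O(|text| * |pattern|) time and stores the whole
    text; the streaming algorithm's point is to avoid both."""
    n = len(pattern)
    results = []
    for i in range(len(text) - n + 1):
        mism = [(i + j, pattern[j], text[i + j])
                for j in range(n) if pattern[j] != text[i + j]]
        if len(mism) <= k:
            results.append((i, mism))
    return results

print(k_mismatch("abba", "ababbab", 1))  # -> [(2, [])]: exact occurrence at 2
```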