
    Block Edit Errors with Transpositions: Deterministic Document Exchange Protocols and Almost Optimal Binary Codes

    Document exchange and error correcting codes are two fundamental problems regarding communications. In the first problem, Alice and Bob each holds a string, and the goal is for Alice to send a short sketch to Bob, so that Bob can recover Alice's string. In the second problem, Alice sends a message with some redundant information to Bob through a channel that can add adversarial errors, and the goal is for Bob to correctly recover the message despite the errors. In both problems, an upper bound is placed on the number of errors between the two strings or that the channel can add, and a major goal is to minimize the size of the sketch or the redundant information. In this paper we focus on deterministic document exchange protocols and binary error correcting codes. Both problems have been studied extensively. In the case of Hamming errors (i.e., bit substitutions) and bit erasures, we have explicit constructions with asymptotically optimal parameters. However, other error types are still rather poorly understood. In a recent work [Kuan Cheng et al., 2018], the authors constructed explicit deterministic document exchange protocols and binary error correcting codes for edit errors with almost optimal parameters. Unfortunately, the constructions in [Kuan Cheng et al., 2018] do not work for other common errors such as block transpositions. In this paper, we generalize the constructions in [Kuan Cheng et al., 2018] to handle a much larger class of errors. These include bursts of insertions and deletions, as well as block transpositions. Specifically, we consider document exchange and error correcting codes where the total number of block insertions, block deletions, and block transpositions is at most k <= alpha n/log n for some constant 0 < alpha < 1. In addition, the total number of bits inserted and deleted by the first two kinds of operations is at most t <= beta n for some constant 0 < beta < 1, where n is the length of Alice's string or message.
We construct explicit, deterministic document exchange protocols with sketch size O((k log n + t) log^2 (n/(k log n + t))) and explicit binary error correcting codes with O(k log n log log log n + t) redundant bits. As a comparison, the information-theoretic optimum for both problems is Theta(k log n + t). As far as we know, previously there are no known explicit deterministic document exchange protocols in this case, and the best known binary code needs Omega(n) redundant bits even to correct just one block transposition [L. J. Schulman and D. Zuckerman, 1999].
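To make the error model in this abstract concrete, the following is a minimal Python sketch of the three block operations it counts. The function names and the convention that a transposition cuts a block and pastes it elsewhere are illustrative assumptions, not taken from the paper; the sketch shows only the error model, not the protocols or codes.

```python
def block_delete(s: str, start: int, length: int) -> str:
    """Remove the block s[start:start+length] (costs 1 toward k, `length` toward t)."""
    return s[:start] + s[start + length:]


def block_insert(s: str, pos: int, block: str) -> str:
    """Insert `block` before index pos (costs 1 toward k, len(block) toward t)."""
    return s[:pos] + block + s[pos:]


def block_transpose(s: str, start: int, length: int, dest: int) -> str:
    """Cut s[start:start+length] and paste it at index dest of the remaining string.

    Assumed convention: costs 1 toward k and nothing toward t,
    since no bits are inserted or deleted.
    """
    block = s[start:start + length]
    rest = s[:start] + s[start + length:]
    return rest[:dest] + block + rest[dest:]


# Hypothetical example: Alice's string x, corrupted by k = 2 block operations
# that delete t = 3 bits in total.
x = "0011010111"
y = block_transpose(x, 0, 4, 6)   # move the leading "0011" to the end
z = block_delete(y, 2, 3)         # delete a burst of 3 bits
```

Here a single transposition changes the string in many positions, which is why it is counted as one block operation rather than as many individual bit edits.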

    Locally Decodable Codes with Randomized Encoding

    We initiate a study of locally decodable codes with randomized encoding. Standard locally decodable codes are error correcting codes with a deterministic encoding function and a randomized decoding function, such that any desired message bit can be recovered with good probability by querying only a small number of positions in the corrupted codeword. This allows one to recover any message bit very efficiently in sub-linear or even logarithmic time. Besides this straightforward application, locally decodable codes have also found many other applications such as private information retrieval, secure multiparty computation, and average-case complexity. However, despite extensive research, the tradeoff between the rate of the code and the number of queries is somewhat disappointing. For example, the best known constructions still need super-polynomially long codeword length even with a logarithmic number of queries, and need a polynomial number of queries to achieve a constant rate. In this paper, we show that by using a randomized encoding, in several models we can achieve significantly better rate-query tradeoff. In addition, our codes work for both the standard Hamming errors, and the more general and harder edit errors. Comment: 23 pages
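As an illustration of the standard model described in this abstract (deterministic encoding, randomized local decoding), here is a minimal Python sketch of the classical Hadamard code with a 2-query decoder. It is a textbook example and not a construction from this paper; the function names are assumptions chosen for the sketch. It also shows the poor rate the abstract mentions: a k-bit message becomes a 2^k-bit codeword.

```python
import random


def hadamard_encode(msg):
    """Encode k bits as the 2^k inner products <msg, a> mod 2, one per mask a."""
    k = len(msg)
    code = []
    for a in range(2 ** k):
        bit = 0
        for i in range(k):
            if (a >> i) & 1:
                bit ^= msg[i]
        code.append(bit)
    return code


def local_decode(word, k, i, rng=random):
    """Recover message bit i with two queries.

    <msg, a> xor <msg, a xor e_i> equals msg[i] whenever both queried
    positions are uncorrupted, and each query is uniformly distributed.
    """
    a = rng.randrange(2 ** k)
    b = a ^ (1 << i)
    return word[a] ^ word[b]


msg = [1, 0, 1, 1]
code = hadamard_encode(msg)
decoded = [local_decode(code, len(msg), i) for i in range(len(msg))]
```

On an uncorrupted codeword the decoder is always correct; with a small fraction of corrupted positions it is correct with good probability, because each of the two queries hits a corrupted position only rarely.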

    Edit Distance: Sketching, Streaming and Document Exchange

    We show that in the document exchange problem, where Alice holds $x \in \{0,1\}^n$ and Bob holds $y \in \{0,1\}^n$, Alice can send Bob a message of size $O(K(\log^2 K+\log n))$ bits such that Bob can recover $x$ using the message and his input $y$ if the edit distance between $x$ and $y$ is no more than $K$, and output "error" otherwise. Both the encoding and decoding can be done in time $\tilde{O}(n+\mathsf{poly}(K))$. This result significantly improves the previous communication bounds under polynomial encoding/decoding time. We also show that in the referee model, where Alice and Bob hold $x$ and $y$ respectively, they can compute sketches of $x$ and $y$ of sizes $\mathsf{poly}(K \log n)$ bits (the encoding), and send to the referee, who can then compute the edit distance between $x$ and $y$ together with all the edit operations if the edit distance is no more than $K$, and output "error" otherwise (the decoding). To the best of our knowledge, this is the first result for sketching edit distance using $\mathsf{poly}(K \log n)$ bits. Moreover, the encoding phase of our sketching algorithm can be performed by scanning the input string in one pass. Thus our sketching algorithm also implies the first streaming algorithm for computing edit distance and all the edits exactly using $\mathsf{poly}(K \log n)$ bits of space. Comment: Full version of an article to be presented at the 57th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2016)
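The input/output behaviour described in this abstract (return the edit distance if it is at most K, otherwise "error") can be sketched with the textbook dynamic program below. This is a baseline that sees both strings in full, so it uses none of the paper's sketching or streaming ideas; the function name is hypothetical, and edit distance is assumed here to allow insertions, deletions and substitutions.

```python
def edit_distance_at_most(x: str, y: str, K: int):
    """Return the edit distance of x and y if it is <= K, else the string "error"."""
    # The length difference is a lower bound on the edit distance.
    if abs(len(x) - len(y)) > K:
        return "error"
    prev = list(range(len(y) + 1))          # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]
        for j, cy in enumerate(y, start=1):
            curr.append(min(prev[j] + 1,               # delete cx
                            curr[j - 1] + 1,           # insert cy
                            prev[j - 1] + (cx != cy))) # match or substitute
        prev = curr
    d = prev[len(y)]
    return d if d <= K else "error"
```

For example, `edit_distance_at_most("0101", "1010", 2)` evaluates to 2 (delete the leading 0, append a 0), while the same call with K = 1 returns "error". The baseline needs both strings in one place and quadratic time, which is what the communication and space bounds in the abstract avoid.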

    Efficient Linear and Affine Codes for Correcting Insertions/Deletions

    This paper studies linear and affine error-correcting codes for correcting synchronization errors such as insertions and deletions. We call such codes linear/affine insdel codes. Linear codes that can correct even a single deletion are limited to have information rate at most $1/2$ (achieved by the trivial 2-fold repetition code). Previously, it was (erroneously) reported that more generally no non-trivial linear codes correcting $k$ deletions exist, i.e., that the $(k+1)$-fold repetition code and its rate of $1/(k+1)$ are basically optimal for any $k$. We disprove this and show the existence of binary linear codes of length $n$ and rate just below $1/2$ capable of correcting $\Omega(n)$ insertions and deletions. This identifies rate $1/2$ as a sharp threshold for recovery from deletions for linear codes, and reopens the quest for a better understanding of the capabilities of linear codes for correcting insertions/deletions. We prove novel outer bounds and existential inner bounds for the rate vs. (edit) distance trade-off of linear insdel codes. We complement our existential results with an efficient synchronization-string-based transformation that converts any asymptotically-good linear code for Hamming errors into an asymptotically-good linear code for insdel errors. Lastly, we show that the $\frac{1}{2}$-rate limitation does not hold for affine codes by giving an explicit affine code of rate $1-\epsilon$ which can efficiently correct a constant fraction of insdel errors.
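The trivial baseline named in this abstract, the (k+1)-fold repetition code, can be sketched in a few lines of Python. The decoder below is one simple run-length approach, given as an assumption for illustration rather than anything from the paper: every run in the codeword has length m(k+1) for some m >= 1, and at most k deletions leave it with length between m(k+1) - k and m(k+1), so rounding the run length up to a multiple of k+1 recovers m.

```python
from itertools import groupby


def rep_encode(msg, k):
    """Repeat every message bit k + 1 times (rate 1/(k+1))."""
    return [b for b in msg for _ in range(k + 1)]


def rep_decode(received, k):
    """Decode after at most k deletions by rounding each run length up."""
    out = []
    for bit, run in groupby(received):
        r = len(list(run))
        m = -(-r // (k + 1))        # ceil(r / (k + 1)) in integer arithmetic
        out.extend([bit] * m)
    return out


msg = [1, 0, 0, 1]
k = 2
code = rep_encode(msg, k)                                   # 12 bits
received = [b for i, b in enumerate(code) if i not in (0, 4)]  # delete 2 symbols
recovered = rep_decode(received, k)
```

No run can vanish entirely, since each has at least k+1 symbols, so runs never merge. With k = 1 this is the 2-fold repetition code of rate 1/2 mentioned above.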

    Linear Insertion Deletion Codes in the High-Noise and High-Rate Regimes

    This work continues the study of linear error correcting codes against adversarial insertion deletion errors (insdel errors). Previously, the work of Cheng, Guruswami, Haeupler, and Li [Kuan Cheng et al., 2021] showed the existence of asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, or achieve rate arbitrarily close to 1/2 even over the binary alphabet. As shown in [Kuan Cheng et al., 2021], these bounds are also the best possible. However, known explicit constructions in [Kuan Cheng et al., 2021], and subsequent improved constructions by Con, Shpilka, and Tamo [Con et al., 2022] all fall short of meeting these bounds. Over any constant size alphabet, they can only achieve rate < 1/8 or correct < 1/4 fraction of errors; over the binary alphabet, they can only achieve rate < 1/1216 or correct < 1/54 fraction of errors. Apparently, previous techniques face inherent barriers to achieve rate better than 1/4 or correct more than 1/2 fraction of errors. In this work we give new constructions of such codes that meet these bounds, namely, asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, and binary asymptotically good linear insdel codes that can achieve rate arbitrarily close to 1/2. All our constructions are efficiently encodable and decodable. Our constructions are based on a novel approach of code concatenation, which embeds the index information implicitly into codewords. This significantly differs from previous techniques and may be of independent interest. Finally, we also prove the existence of linear concatenated insdel codes with parameters that match random linear codes, and propose a conjecture about linear insdel codes

    Pseudorandom Constructions: Computing in Parallel and Applications to Edit Distance Codes

    The thesis focuses on two problems about pseudorandom constructions. The first problem is how to compute pseudorandom constructions by constant depth circuits. Pseudorandom constructions are deterministic functions which are used to substitute random constructions in various computational tasks. Constant depth circuits here refer to the computation model which can compute functions using circuits of AND, OR and negation gates, with constant depth and unbounded fan-in, taking function inputs by input wires and giving function outputs by output wires. They can be simulated by fast parallel algorithms. We study such constructions mainly for randomness extractors, secret sharing schemes and their applications. Randomness extractors are functions which transform biased random bits to uniform ones. They can be used to recycle random bits in computations if some entropy remains. Secret sharing schemes efficiently share secrets among multiple parties such that the collusion of a bounded number of parties cannot recover any information about the secret, while a certain larger number of parties can recover the secret. Our work constructs these objects with near optimal parameters and explores their applications. The second problem is about applying pseudorandom constructions to build error correcting codes (ECCs) for edit distance. ECCs map messages to codewords in a metric space such that one can recover the codewords even if a bounded number of errors move the codeword away by some bounded distance. They are widely used in both the theoretical and practical parts of computer science. Classic errors are Hamming errors, which are substitutions and erasures of symbols. They have been well studied in the literature. We consider a more general kind of error, namely edit errors, which consist of insertions and deletions that may change the positions of symbols.
Our work gives explicit constructions of binary ECCs for edit errors with near optimal redundancy length. The constructions utilize document exchange protocols, which let two parties synchronize their strings with bounded edit distance by letting one party send a short sketch of its string to the other. We apply various pseudorandom constructions to get deterministic document exchange protocols from randomized ones. Then we construct ECCs using them. We also extend these constructions to handle block insertions/deletions and transpositions. All these constructions have near optimal parameters.