1,975 research outputs found
Block Edit Errors with Transpositions: Deterministic Document Exchange Protocols and Almost Optimal Binary Codes
Document exchange and error correcting codes are two fundamental problems regarding communications. In the first problem, Alice and Bob each holds a string, and the goal is for Alice to send a short sketch to Bob, so that Bob can recover Alice\u27s string. In the second problem, Alice sends a message with some redundant information to Bob through a channel that can add adversarial errors, and the goal is for Bob to correctly recover the message despite the errors. In both problems, an upper bound is placed on the number of errors between the two strings or that the channel can add, and a major goal is to minimize the size of the sketch or the redundant information. In this paper we focus on deterministic document exchange protocols and binary error correcting codes.
Both problems have been studied extensively. In the case of Hamming errors (i.e., bit substitutions) and bit erasures, we have explicit constructions with asymptotically optimal parameters. However, other error types are still rather poorly understood. In a recent work [Kuan Cheng et al., 2018], the authors constructed explicit deterministic document exchange protocols and binary error correcting codes for edit errors with almost optimal parameters. Unfortunately, the constructions in [Kuan Cheng et al., 2018] do not work for other common errors such as block transpositions.
In this paper, we generalize the constructions in [Kuan Cheng et al., 2018] to handle a much larger class of errors. These include bursts of insertions and deletions, as well as block transpositions. Specifically, we consider document exchange and error correcting codes where the total number of block insertions, block deletions, and block transpositions is at most k <= alpha n/log n for some constant 0<alpha<1. In addition, the total number of bits inserted and deleted by the first two kinds of operations is at most t <= beta n for some constant 0<beta<1, where n is the length of Alice\u27s string or message. We construct explicit, deterministic document exchange protocols with sketch size O((k log n +t) log^2 n/{k log n + t}) and explicit binary error correcting code with O(k log n log log log n+t) redundant bits. As a comparison, the information-theoretic optimum for both problems is Theta(k log n+t). As far as we know, previously there are no known explicit deterministic document exchange protocols in this case, and the best known binary code needs Omega(n) redundant bits even to correct just one block transposition [L. J. Schulman and D. Zuckerman, 1999]
Locally Decodable Codes with Randomized Encoding
We initiate a study of locally decodable codes with randomized encoding.
Standard locally decodable codes are error correcting codes with a
deterministic encoding function and a randomized decoding function, such that
any desired message bit can be recovered with good probability by querying only
a small number of positions in the corrupted codeword. This allows one to
recover any message bit very efficiently in sub-linear or even logarithmic
time. Besides this straightforward application, locally decodable codes have
also found many other applications such as private information retrieval,
secure multiparty computation, and average-case complexity.
However, despite extensive research, the tradeoff between the rate of the
code and the number of queries is somewhat disappointing. For example, the best
known constructions still need super-polynomially long codeword length even
with a logarithmic number of queries, and need a polynomial number of queries
to achieve a constant rate. In this paper, we show that by using a randomized
encoding, in several models we can achieve significantly better rate-query
tradeoff. In addition, our codes work for both the standard Hamming errors, and
the more general and harder edit errors.Comment: 23 page
Edit Distance: Sketching, Streaming and Document Exchange
We show that in the document exchange problem, where Alice holds and Bob holds , Alice can send Bob a message of
size bits such that Bob can recover using the
message and his input if the edit distance between and is no more
than , and output "error" otherwise. Both the encoding and decoding can be
done in time . This result significantly
improves the previous communication bounds under polynomial encoding/decoding
time. We also show that in the referee model, where Alice and Bob hold and
respectively, they can compute sketches of and of sizes
bits (the encoding), and send to the referee, who can
then compute the edit distance between and together with all the edit
operations if the edit distance is no more than , and output "error"
otherwise (the decoding). To the best of our knowledge, this is the first
result for sketching edit distance using bits.
Moreover, the encoding phase of our sketching algorithm can be performed by
scanning the input string in one pass. Thus our sketching algorithm also
implies the first streaming algorithm for computing edit distance and all the
edits exactly using bits of space.Comment: Full version of an article to be presented at the 57th Annual IEEE
Symposium on Foundations of Computer Science (FOCS 2016
Efficient Linear and Affine Codes for Correcting Insertions/Deletions
This paper studies \emph{linear} and \emph{affine} error-correcting codes for
correcting synchronization errors such as insertions and deletions. We call
such codes linear/affine insdel codes.
Linear codes that can correct even a single deletion are limited to have
information rate at most (achieved by the trivial 2-fold repetition
code). Previously, it was (erroneously) reported that more generally no
non-trivial linear codes correcting deletions exist, i.e., that the
-fold repetition codes and its rate of are basically optimal
for any . We disprove this and show the existence of binary linear codes of
length and rate just below capable of correcting
insertions and deletions. This identifies rate as a sharp threshold for
recovery from deletions for linear codes, and reopens the quest for a better
understanding of the capabilities of linear codes for correcting
insertions/deletions.
We prove novel outer bounds and existential inner bounds for the rate vs.
(edit) distance trade-off of linear insdel codes. We complement our existential
results with an efficient synchronization-string-based transformation that
converts any asymptotically-good linear code for Hamming errors into an
asymptotically-good linear code for insdel errors. Lastly, we show that the
-rate limitation does not hold for affine codes by giving an
explicit affine code of rate which can efficiently correct a
constant fraction of insdel errors
Linear Insertion Deletion Codes in the High-Noise and High-Rate Regimes
This work continues the study of linear error correcting codes against adversarial insertion deletion errors (insdel errors). Previously, the work of Cheng, Guruswami, Haeupler, and Li [Kuan Cheng et al., 2021] showed the existence of asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, or achieve rate arbitrarily close to 1/2 even over the binary alphabet. As shown in [Kuan Cheng et al., 2021], these bounds are also the best possible. However, known explicit constructions in [Kuan Cheng et al., 2021], and subsequent improved constructions by Con, Shpilka, and Tamo [Con et al., 2022] all fall short of meeting these bounds. Over any constant size alphabet, they can only achieve rate < 1/8 or correct < 1/4 fraction of errors; over the binary alphabet, they can only achieve rate < 1/1216 or correct < 1/54 fraction of errors. Apparently, previous techniques face inherent barriers to achieve rate better than 1/4 or correct more than 1/2 fraction of errors.
In this work we give new constructions of such codes that meet these bounds, namely, asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, and binary asymptotically good linear insdel codes that can achieve rate arbitrarily close to 1/2. All our constructions are efficiently encodable and decodable. Our constructions are based on a novel approach of code concatenation, which embeds the index information implicitly into codewords. This significantly differs from previous techniques and may be of independent interest. Finally, we also prove the existence of linear concatenated insdel codes with parameters that match random linear codes, and propose a conjecture about linear insdel codes
Pseudorandom Constructions: Computing in Parallel and Applications to Edit Distance Codes
The thesis focuses on two problems about pseudorandom constructions.
The first problem is how to compute pseudorandom constructions by constant depth circuits. Pseudorandom constructions are deterministic functions which are used to substitute random constructions in various computational tasks. Constant depth circuits here refer to the computation model which can compute functions using circuits of \AND, \OR and negation gates, with constant depth, unbounded fan-in, taking function inputs by input wires and giving function outputs by output wires. They can be simulated by fast parallel algorithms. We study such constructions mainly for randomness extractors, secret sharing schemes and their applications. Randomness extractors are functions which transform biased random bits to uniform ones. They can be used to recycle random bits in computations if there are some entropies remaining. Secret sharing schemes efficiently share secrets among multi-parties s.t. the collusion of a bounded number of parties cannot recover any information of the secret while a certain larger number of parties can recover the secret. Our work constructs these objects with near optimal parameters and explores their applications.
The second problem is about applying pseudorandom constructions to build error correcting codes (ECCs) for edit distance. ECCs project messages to codewords in a metric space s.t. one can recover the codewords even if there are bounded number of errors which can drive the codeword away by some bounded distance. They are widely used in both the theoretical and practical part of computer science. Classic errors are hamming errors which are substitutions and erasures of symbols. They are well studied by numerous literatures before. We consider one kind of more general errors i.e. edit errors, consists of insertions and deletions that may change the positions of symbols. Our work give explicit constructions of binary ECCs for edit errors with redundancy length near optimal. The constructions utilize document exchange protocols which can let two party synchronize their strings with bounded edit distance, by letting one party send a short sketch of its string to the other. We apply various pseudorandom constructions to get deterministic document exchange protocols from randomized ones. Then we construct ECCs using them. We also extend these constructions to handle block insertions/deletions and transpositions. All these constructions have near optimal parameters
- …