156 research outputs found

Efficiently Decodable Codes for the Binary Deletion Channel

Author: Guruswami Venkatesan
Li Ray
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)
Publication date: 01/01/2017

In the random deletion channel, each bit is deleted independently with probability p. For the random deletion channel, the existence of codes of rate (1-p)/9, and thus bounded away from 0 for any p < 1, has been known.
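
The channel model described above can be simulated in a few lines. This is only an illustrative sketch of the noise model, not of the paper's code construction; the function name `deletion_channel`, the seed, and the parameter values are assumptions made for the example.

```python
import random

def deletion_channel(bits, p, rng):
    """Pass a bit sequence through a random deletion channel.

    Each bit is dropped independently with probability p. Surviving bits
    keep their relative order, but the receiver is not told which
    positions were deleted.
    """
    return [b for b in bits if rng.random() >= p]

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
sent = [rng.randint(0, 1) for _ in range(10_000)]
received = deletion_channel(sent, p=0.3, rng=rng)

# On average about a (1 - p) fraction of the bits survive.
survival_fraction = len(received) / len(sent)
```

The received word is always a subsequence of the sent word, which is what makes decoding harder than for erasures: the positions of the missing bits are unknown.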

Dagstuhl Research Online Publication Server

Deletion codes in the high-noise and high-rate regimes

Author: Guruswami Venkatesan
Wang Carol
Publication date: 01/01/2014

The noise model of deletions poses significant challenges in coding theory, with basic questions like the capacity of the binary deletion channel still being open. In this paper, we study the harder model of worst-case deletions, with a focus on constructing efficiently decodable codes for the two extreme regimes of high-noise and high-rate. Specifically, we construct polynomial-time decodable codes with the following trade-offs (for any eps > 0): (1) Codes that can correct a fraction 1-eps of deletions with rate poly(eps) over an alphabet of size poly(1/eps); (2) Binary codes of rate 1-O~(sqrt(eps)) that can correct a fraction eps of deletions; and (3) Binary codes that can be list decoded from a fraction (1/2-eps) of deletions with rate poly(eps). Our work is the first to achieve the qualitative goals of correcting a deletion fraction approaching 1 over bounded alphabets, and correcting a constant fraction of bit deletions with rate approaching 1. The above results bring our understanding of deletion code constructions in these regimes to a similar level as worst-case errors.

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

Synchronization Strings: Explicit Constructions, Local Decoding, and Applications

Author: Guruswami Venkatesan
Haeupler Bernhard
Hemenway Brett
Sherstov Alexander A
Publication date: 09/11/2017

This paper gives new results for synchronization strings, a powerful combinatorial object that allows one to efficiently deal with insertions and deletions in various communication settings:

• We give a deterministic, linear time synchronization string construction, improving over an $O(n^5)$ time randomized construction. Independently of this work, a deterministic $O(n\log^2\log n)$ time construction was just put on arXiv by Cheng, Li, and Wu. We also give a deterministic linear time construction of an infinite synchronization string, which was not known to be computable before. Both constructions are highly explicit, i.e., the $i^{th}$ symbol can be computed in $O(\log i)$ time.

• This paper also introduces a generalized notion we call long-distance synchronization strings that allow for local and very fast decoding. In particular, only $O(\log^3 n)$ time and access to logarithmically many symbols is required to decode any index.

We give several applications for these results:

• For any $\delta < 1$, $\epsilon > 0$ we provide an insdel correcting code with rate $1-\delta-\epsilon$ which can correct any $O(\delta)$ fraction of insdel errors in $O(n\log^3 n)$ time. This near linear computational efficiency is surprising given that we do not even know how to compute the (edit) distance between the decoding input and output in sub-quadratic time. We show that such codes can not only efficiently recover from a $\delta$ fraction of insdel errors but, similar to [Schulman, Zuckerman; TransInf'99], also from any $O(\delta/\log n)$ fraction of block transpositions and replications.

• We show that high explicitness and local decoding allow for infinite channel simulations with exponentially smaller memory and decoding time requirements. These simulations can be used to give the first near linear time interactive coding scheme for insdel errors.
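
For context on the remark about sub-quadratic time, the following is a sketch of the textbook quadratic baseline for edit distance. It assumes the insertion/deletion-only variant of the distance (no substitutions), which is a choice made for this example; the function name `insdel_distance` is likewise illustrative and not taken from the paper.

```python
def insdel_distance(a, b):
    """Minimum number of insertions and deletions turning a into b.

    Uses the identity distance = len(a) + len(b) - 2 * LCS(a, b) and the
    standard O(len(a) * len(b)) dynamic program for the length of the
    longest common subsequence (LCS), keeping one table row at a time.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return len(a) + len(b) - 2 * prev[len(b)]
```

For two strings of length n this performs on the order of n^2 symbol comparisons, which is the cost that a decoder running in near-linear time has to avoid.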

arXiv.org e-Print Archive

Crossref

Near-Linear Time Insertion-Deletion Codes and $(1+\varepsilon)$-Approximating Edit Distance via Indexing

Author: Goldwasser Shafi
Haeupler Bernhard
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/04/2019

We introduce fast-decodable indexing schemes for edit distance which can be used to speed up edit distance computations to near-linear time if one of the strings is indexed by an indexing string $I$. In particular, for every length $n$ and every $\varepsilon > 0$, one can in near linear time construct a string $I \in \Sigma'^n$ with $|\Sigma'| = O_{\varepsilon}(1)$, such that, indexing any string $S \in \Sigma^n$, symbol-by-symbol, with $I$ results in a string $S' \in \Sigma''^n$ where $\Sigma'' = \Sigma \times \Sigma'$ for which edit distance computations are easy, i.e., one can compute a $(1+\varepsilon)$-approximation of the edit distance between $S'$ and any other string in $O(n \text{poly}(\log n))$ time. Our indexing schemes can be used to improve the decoding complexity of state-of-the-art error correcting codes for insertions and deletions. In particular, they lead to near-linear time decoding algorithms for the insertion-deletion codes of [Haeupler, Shahrasbi; STOC `17] and faster decoding algorithms for list-decodable insertion-deletion codes of [Haeupler, Shahrasbi, Sudan; ICALP `18]. Interestingly, the latter codes are a crucial ingredient in the construction of fast-decodable indexing schemes.
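
The symbol-by-symbol indexing step itself is simple to write down. The sketch below only shows how a string over the product alphabet is formed; the index used here is a placeholder repeating pattern chosen for illustration, not an indexing string with the properties the abstract requires, and the names `index_string` and `placeholder_index` are hypothetical.

```python
def index_string(s, index):
    """Pair each symbol of s with the symbol of `index` at the same position.

    The result has the same length as s and lives over the product
    alphabet Sigma x Sigma'; it is represented here as a list of
    (symbol, index_symbol) tuples.
    """
    if len(s) != len(index):
        raise ValueError("the string and the index must have the same length")
    return list(zip(s, index))

s = "banana"
# Placeholder only: a repeating pattern, NOT a construction from the paper.
placeholder_index = [i % 4 for i in range(len(s))]
indexed = index_string(s, placeholder_index)
```

The alphabet grows from $|\Sigma|$ to $|\Sigma| \cdot |\Sigma'|$ while the length stays $n$, which is why a constant-size $\Sigma'$ costs only a constant number of extra bits per symbol.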

arXiv.org e-Print Archive

Crossref

List Decoding Tensor Products and Interleaved Codes

Author: Gopalan Parikshit
Guruswami Venkatesan
Raghavendra Prasad
Publication date: 26/11/2008

We design the first efficient algorithms and prove new combinatorial bounds for list decoding tensor products of codes and interleaved codes. We show that for every code, the ratio of its list decoding radius to its minimum distance stays unchanged under the tensor product operation (rather than squaring, as one might expect). This gives the first efficient list decoders and new combinatorial bounds for some natural codes including multivariate polynomials where the degree in each variable is bounded. We show that for every code, its list decoding radius remains unchanged under $m$-wise interleaving for an integer $m$. This generalizes a recent result of Dinur et al. [DGKS], who proved such a result for interleaved Hadamard codes (equivalently, linear transformations). Using the notion of generalized Hamming weights, we give better list size bounds for both tensoring and interleaving of binary linear codes. By analyzing the weight distribution of these codes, we reduce the task of bounding the list size to bounding the number of close-by low-rank codewords. For decoding linear transformations, using rank-reduction together with other ideas, we obtain list size bounds that are tight over small fields.
Comment: 32 pages
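
As a small illustration of the interleaving operation only (not of the list decoding algorithms), the sketch below treats $m$ codewords of length $n$ as a single word of length $n$ whose symbols are columns, and measures Hamming distance column by column. The helper names and the toy words are assumptions for the example.

```python
def interleave(codewords):
    """View m words of length n as one word of length n over Sigma^m.

    Position i of the result is the column (c_1[i], ..., c_m[i]).
    """
    n = len(codewords[0])
    if any(len(c) != n for c in codewords):
        raise ValueError("all words must have the same length")
    return list(zip(*codewords))

def hamming_distance(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(u, v))

# Toy words of length 4 with even weight, used only as an example.
rows_a = [[0, 0, 0, 0], [1, 1, 0, 0]]
rows_b = [[1, 0, 1, 0], [1, 1, 0, 0]]
dist = hamming_distance(interleave(rows_a), interleave(rows_b))
row_dists = [hamming_distance(x, y) for x, y in zip(rows_a, rows_b)]
```

Two interleaved words differ in a column whenever any of their rows differ there, so the interleaved distance is at least the largest row distance and at most the sum of the row distances.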

arXiv.org e-Print Archive

CiteSeerX

Crossref

Synchronization Strings: Codes for Insertions and Deletions Approaching the Singleton Bound

Author: Braverman Mark
Haeupler Bernhard
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/04/2017

We introduce synchronization strings as a novel way of efficiently dealing with synchronization errors, i.e., insertions and deletions. Synchronization errors are strictly more general and much harder to deal with than commonly considered half-errors, i.e., symbol corruptions and erasures. For every $\epsilon > 0$, synchronization strings allow one to index a sequence with an $\epsilon^{-O(1)}$ size alphabet such that one can efficiently transform $k$ synchronization errors into $(1+\epsilon)k$ half-errors. This powerful new technique has many applications. In this paper, we focus on designing insdel codes, i.e., error correcting block codes (ECCs) for insertion deletion channels. While ECCs for both half-errors and synchronization errors have been intensely studied, the latter has largely resisted progress. Indeed, it took until 1999 for the first insdel codes with constant rate, constant distance, and constant alphabet size to be constructed by Schulman and Zuckerman. Insdel codes for asymptotically large or small noise rates were given in 2016 by Guruswami et al. but these codes are still polynomially far from the optimal rate-distance tradeoff. This makes the understanding of insdel codes up to this work equivalent to what was known for regular ECCs after Forney introduced concatenated codes in his doctoral thesis 50 years ago. A direct application of our synchronization strings based indexing method gives a simple black-box construction which transforms any ECC into an equally efficient insdel code with a slightly larger alphabet size. This instantly transfers much of the highly developed understanding for regular ECCs over large constant alphabets into the realm of insdel codes. Most notably, we obtain efficient insdel codes which get arbitrarily close to the optimal rate-distance tradeoff given by the Singleton bound for the complete noise spectrum.

arXiv.org e-Print Archive

Crossref