Guess & Check Codes for Deletions and Synchronization

Authors: Hanna Serge Kas; Rouayheb Salim El
Publication date: 27/04/2017

We consider the problem of constructing codes that can correct $\delta$ deletions occurring in an arbitrary binary string of length $n$ bits. Varshamov-Tenengolts (VT) codes can correct all possible single deletions ($\delta = 1$) with an asymptotically optimal redundancy. Finding similar codes for $\delta \geq 2$ deletions is an open problem. We propose a new family of codes, that we call Guess & Check (GC) codes, that can correct, with high probability, a constant number of deletions $\delta$ occurring at uniformly random positions within an arbitrary string. The GC codes are based on MDS codes and have an asymptotically optimal redundancy that is $\Theta(\delta \log n)$. We provide deterministic polynomial time encoding and decoding schemes for these codes. We also describe the applications of GC codes to file synchronization.

Comment: Accepted in ISIT 201
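The single-deletion case mentioned in the abstract can be illustrated with a minimal sketch of a VT code. This is not the GC construction from the paper; it is the standard textbook VT checksum and single-deletion decoding rule, and the function names (`vt_syndrome`, `is_vt_codeword`, `vt_decode_single_deletion`) and the default parameter `a=0` are illustrative assumptions.

```python
def vt_syndrome(bits):
    """Weighted checksum sum(i * x_i) with 1-based positions."""
    return sum(i * b for i, b in enumerate(bits, start=1))


def is_vt_codeword(bits, a=0):
    """True if bits lies in VT_a(n): checksum congruent to a modulo n + 1."""
    n = len(bits)
    return vt_syndrome(bits) % (n + 1) == a % (n + 1)


def vt_decode_single_deletion(received, n, a=0):
    """Recover the length-n codeword of VT_a(n) from it with one bit deleted."""
    assert len(received) == n - 1
    w = sum(received)  # number of ones that survived
    deficiency = (a - vt_syndrome(received)) % (n + 1)
    y = list(received)
    if deficiency <= w:
        # A 0 was deleted: reinsert it with exactly `deficiency` ones to its right.
        pos, ones_right = len(y), 0
        while ones_right < deficiency:
            pos -= 1
            ones_right += y[pos]
        y.insert(pos, 0)
    else:
        # A 1 was deleted: reinsert it with exactly deficiency - w - 1 zeros to its left.
        zeros_needed = deficiency - w - 1
        pos, zeros_left = 0, 0
        while zeros_left < zeros_needed:
            zeros_left += 1 - y[pos]
            pos += 1
        y.insert(pos, 1)
    return y


codeword = [1, 0, 0, 1, 1, 0]                 # checksum 1 + 4 + 5 = 10
received = codeword[:2] + codeword[3:]        # delete the bit at index 2
recovered = vt_decode_single_deletion(received, n=6, a=vt_syndrome(codeword) % 7)
```

The decoder only needs the received string, the length `n`, and the residue `a`; it never learns which position was deleted, only a position that yields the same string.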

arXiv.org e-Print Archive

Crossref

Deletion codes in the high-noise and high-rate regimes

Authors: Guruswami Venkatesan; Wang Carol
Publication date: 01/01/2014

The noise model of deletions poses significant challenges in coding theory, with basic questions like the capacity of the binary deletion channel still being open. In this paper, we study the harder model of worst-case deletions, with a focus on constructing efficiently decodable codes for the two extreme regimes of high-noise and high-rate. Specifically, we construct polynomial-time decodable codes with the following trade-offs (for any eps > 0): (1) Codes that can correct a fraction 1-eps of deletions with rate poly(eps) over an alphabet of size poly(1/eps); (2) Binary codes of rate 1-O~(sqrt(eps)) that can correct a fraction eps of deletions; and (3) Binary codes that can be list decoded from a fraction (1/2-eps) of deletions with rate poly(eps). Our work is the first to achieve the qualitative goals of correcting a deletion fraction approaching 1 over bounded alphabets, and correcting a constant fraction of bit deletions with rate approaching 1. The above results bring our understanding of deletion code constructions in these regimes to a similar level as worst-case errors.
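To make the worst-case model concrete: a block code can correct k worst-case deletions exactly when no two distinct codewords share a common subsequence of length n - k. The brute-force check below is a small sketch of that criterion only, not one of the constructions from the paper; the helper names `subsequences` and `corrects_deletions` and the toy codes are illustrative assumptions, and the enumeration is exponential in n.

```python
from itertools import combinations


def subsequences(word, length):
    """All distinct subsequences of `word` that have the given length."""
    return {
        "".join(word[i] for i in idx)
        for idx in combinations(range(len(word)), length)
    }


def corrects_deletions(code, k):
    """True if every pair of codewords has disjoint sets of k-deletion outputs."""
    n = len(code[0])
    balls = [subsequences(c, n - k) for c in code]
    return all(a.isdisjoint(b) for a, b in combinations(balls, 2))


repetition = ["0000", "1111"]   # survives 3 of 4 deletions
blocks = ["0011", "1100"]       # survives 1 deletion, but not 2
```

For example, `corrects_deletions(blocks, 2)` is false because both codewords can be reduced to `00` (and to `11`) by two deletions.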

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

A Proof of Entropy Minimization for Outputs in Deletion Channels via Hidden Word Statistics

Authors: Atashpendar Arash; Mestel David; Roscoe A. W.; Ryan Peter Y. A.
Publication date: 30/07/2018

From the output produced by a memoryless deletion channel from a uniformly random input of known length $n$, one obtains a posterior distribution on the channel input. The difference between the Shannon entropy of this distribution and that of the uniform prior measures the amount of information about the channel input which is conveyed by the output of length $m$, and it is natural to ask for which outputs this is extremized. This question was posed in a previous work, where it was conjectured on the basis of experimental data that the entropy of the posterior is minimized and maximized by the constant strings $\texttt{000}\ldots$ and $\texttt{111}\ldots$ and the alternating strings $\texttt{0101}\ldots$ and $\texttt{1010}\ldots$ respectively. In the present work we confirm the minimization conjecture in the asymptotic limit using results from hidden word statistics. We show how the analytic-combinatorial methods of Flajolet, Szpankowski and Vallée for dealing with the hidden pattern matching problem can be applied to resolve the case of fixed output length and $n \rightarrow \infty$, by obtaining estimates for the entropy in terms of the moments of the posterior distribution and establishing its minimization via a measure of autocorrelation.

Comment: 11 pages, 2 figure
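For very small $n$, the posterior entropy described in this abstract can be computed by brute force. The sketch below assumes each input bit is deleted independently with the same probability, so that the likelihood of an output $y$ given an input $x$ is proportional to the number of ways $y$ embeds in $x$ as a subsequence, and the posterior under a uniform prior is proportional to that count. The function names `count_embeddings` and `posterior_entropy` are illustrative and do not come from the paper.

```python
from itertools import product
from math import log2


def count_embeddings(x, y):
    """Number of ways y occurs as a (not necessarily contiguous) subsequence of x."""
    # ways[j] = number of embeddings of y[:j] into the part of x scanned so far
    ways = [1] + [0] * len(y)
    for ch in x:
        for j in range(len(y), 0, -1):
            if y[j - 1] == ch:
                ways[j] += ways[j - 1]
    return ways[len(y)]


def posterior_entropy(y, n):
    """Shannon entropy (in bits) of the posterior over length-n inputs given output y."""
    weights = [count_embeddings(x, y) for x in product("01", repeat=n)]
    total = sum(weights)
    return -sum(w / total * log2(w / total) for w in weights if w)


h_constant = posterior_entropy("00", 3)      # constant output
h_alternating = posterior_entropy("01", 3)   # alternating output
```

In this toy case with $n = 3$ and $m = 2$, the constant output gives the lower posterior entropy, which agrees with the direction of the conjecture; the enumeration over all $2^n$ inputs is only practical for short strings.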

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg