53 research outputs found

    Non-asymptotic Upper Bounds for Deletion Correcting Codes

    Explicit non-asymptotic upper bounds on the sizes of multiple-deletion correcting codes are presented. In particular, the largest single-deletion correcting code for a $q$-ary alphabet and string length $n$ is shown to be of size at most $\frac{q^n-q}{(q-1)(n-1)}$. An improved bound on the asymptotic rate function is obtained as a corollary. Upper bounds are also derived on sizes of codes for a constrained source that does not necessarily comprise all strings of a particular length, and this idea is demonstrated by application to sets of run-length limited strings. The problem of finding the largest deletion correcting code is modeled as a matching problem on a hypergraph. This problem is formulated as an integer linear program. The upper bound is obtained by the construction of a feasible point for the dual of the linear programming relaxation of this integer linear program. The non-asymptotic bounds derived imply the known asymptotic bounds of Levenshtein and Tenengolts and improve on known non-asymptotic bounds. Numerical results support the conjecture that in the binary case, the Varshamov-Tenengolts codes are the largest single-deletion correcting codes. Comment: 18 pages, 4 figures
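As a rough numerical illustration of the bound quoted in this abstract, the sketch below evaluates $\frac{q^n-q}{(q-1)(n-1)}$ and compares it, for small binary lengths, with the size of one Varshamov-Tenengolts class found by brute force. The checksum definition $\sum_i i\,x_i \equiv a \pmod{n+1}$ is the standard VT definition and is not spelled out in the abstract; the function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import product


def single_deletion_upper_bound(q: int, n: int) -> Fraction:
    """Evaluate (q^n - q) / ((q - 1)(n - 1)); needs q >= 2 and n >= 2."""
    return Fraction(q**n - q, (q - 1) * (n - 1))


def vt_class_size(n: int, a: int = 0) -> int:
    """Count binary strings x of length n with sum(i * x_i) = a (mod n + 1).

    Assumption: this is the usual VT checksum with 1-based positions.
    Brute force over all 2^n strings, so only suitable for small n.
    """
    return sum(
        1
        for x in product((0, 1), repeat=n)
        if sum(i * bit for i, bit in enumerate(x, start=1)) % (n + 1) == a
    )


if __name__ == "__main__":
    for n in range(2, 11):
        bound = single_deletion_upper_bound(2, n)
        print(n, vt_class_size(n), float(bound))
```

For every length tried here the VT class with $a = 0$ stays below the bound, which matches the abstract's claim but is of course not a proof of the conjecture.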

    On the Varshamov-Tenengolts Construction on Binary Strings

    This paper is motivated by the problem of finding the largest single-deletion-correcting code for binary strings. The Varshamov-Tenengolts construction classifies binary strings into non-overlapping sets; the largest of these sets is asymptotically the largest single-deletion-correcting code. However, despite the asymptotic optimality, little is known about the quality of the construction as a function of the string length. We show that these sets are also responsible for the (near) solution of several combinatorial problems on a certain hypergraph. Furthermore, our results are valid for any string length. We show that the sets collectively solve strong vertex coloring and edge coloring on the hypergraph exactly. For any string length $n$ we show that the largest of these sets is within $\frac{n+1}{n-1}$ of the optimal matching on the hypergraph, which also corresponds to the largest single-deletion-correcting code. Moreover, we show for any $n$ the smallest of these sets is within of the smallest cover of this hypergraph and that each of these sets is a perfect matching. We then obtain similar results on the dual of this hypergraph
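The classification described in this abstract can be sketched for small lengths as follows. The code assumes the standard VT checksum $\sum_i i\,x_i \bmod (n+1)$, which the abstract does not state explicitly, and it treats "matching" in the simple sense that the single-deletion balls of the strings in one class are pairwise disjoint. All function names are hypothetical.

```python
from itertools import combinations, product


def vt_checksum(x):
    """Weighted sum of the bits of x, using 1-based positions."""
    return sum(i * bit for i, bit in enumerate(x, start=1))


def vt_classes(n):
    """Split all binary strings of length n into n + 1 classes by checksum mod n + 1."""
    classes = {a: [] for a in range(n + 1)}
    for x in product((0, 1), repeat=n):
        classes[vt_checksum(x) % (n + 1)].append(x)
    return classes


def deletion_ball(x):
    """All strings obtained from x by deleting exactly one symbol."""
    return {x[:i] + x[i + 1:] for i in range(len(x))}


def is_single_deletion_code(code):
    """True if no two codewords share a single-deletion descendant."""
    return all(
        deletion_ball(u).isdisjoint(deletion_ball(v))
        for u, v in combinations(code, 2)
    )


if __name__ == "__main__":
    n = 6
    for a, members in vt_classes(n).items():
        print(a, len(members), is_single_deletion_code(members))
```

The classes cover every string exactly once by construction, and the check confirms for small $n$ that each class on its own is a single-deletion-correcting code.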

    Guess & Check Codes for Deletions and Synchronization

    We consider the problem of constructing codes that can correct $\delta$ deletions occurring in an arbitrary binary string of length $n$ bits. Varshamov-Tenengolts (VT) codes can correct all possible single deletions $(\delta=1)$ with an asymptotically optimal redundancy. Finding similar codes for $\delta \geq 2$ deletions is an open problem. We propose a new family of codes, that we call Guess & Check (GC) codes, that can correct, with high probability, a constant number of deletions $\delta$ occurring at uniformly random positions within an arbitrary string. The GC codes are based on MDS codes and have an asymptotically optimal redundancy that is $\Theta(\delta \log n)$. We provide deterministic polynomial time encoding and decoding schemes for these codes. We also describe the applications of GC codes to file synchronization. Comment: Accepted in ISIT 201
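To make the single-deletion case ($\delta = 1$) mentioned in this abstract concrete, here is a minimal sketch of the classical decoder for VT codes, usually attributed to Levenshtein. It is not the Guess & Check scheme, which the abstract does not describe in enough detail to reproduce. The checksum convention $\sum_i i\,x_i \equiv a \pmod{n+1}$ and the function names are assumptions made for this example.

```python
from itertools import product


def vt_checksum(x):
    """Weighted sum of the bits of x, using 1-based positions."""
    return sum(i * bit for i, bit in enumerate(x, start=1))


def vt_decode(y, n, a=0):
    """Recover the length-n VT codeword (checksum a mod n + 1) from y,
    where y is the codeword with exactly one bit deleted."""
    weight = sum(y)
    deficiency = (a - vt_checksum(y)) % (n + 1)
    if deficiency <= weight:
        # A 0 was deleted: reinsert it with `deficiency` ones to its right.
        pos, ones = len(y), 0
        while ones < deficiency:
            pos -= 1
            ones += y[pos]
        return y[:pos] + (0,) + y[pos:]
    # A 1 was deleted: reinsert it with (deficiency - weight - 1) zeros to its left.
    zeros_needed = deficiency - weight - 1
    pos, zeros = 0, 0
    while zeros < zeros_needed:
        zeros += 1 - y[pos]
        pos += 1
    return y[:pos] + (1,) + y[pos:]


if __name__ == "__main__":
    n = 6
    code = [x for x in product((0, 1), repeat=n) if vt_checksum(x) % (n + 1) == 0]
    ok = all(
        vt_decode(x[:i] + x[i + 1:], n) == x for x in code for i in range(n)
    )
    print(len(code), ok)
```

The decoder needs only the weight and checksum of the received string, which is why the redundancy of a VT code is about $\log_2(n+1)$ bits.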