8 research outputs found

    Efficient Systematic Encoding of Non-binary VT Codes

    Varshamov-Tenengolts (VT) codes are a class of codes which can correct a single deletion or insertion with a linear-time decoder. This paper addresses the problem of efficient encoding of non-binary VT codes, defined over an alphabet of size $q > 2$. We propose a simple linear-time encoding method to systematically map binary message sequences onto VT codewords. The method provides a new lower bound on the size of $q$-ary VT codes of length $n$. Comment: This paper will appear in the proceedings of ISIT 201
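As a minimal sketch of the single-deletion property mentioned in this abstract, the example below uses the classical binary VT code, i.e. the set of length-n binary words x with sum(i * x_i) ≡ a (mod n + 1), rather than the non-binary construction the paper studies. The function names are hypothetical, and the decoder is a brute-force search for illustration only, not the linear-time decoder the abstract refers to.

```python
from itertools import product

def vt_syndrome(x):
    """Weighted checksum sum(i * x_i) mod (n + 1), with positions counted from 1."""
    return sum(i * bit for i, bit in enumerate(x, start=1)) % (len(x) + 1)

def vt_codebook(n, a=0):
    """All binary words of length n whose syndrome equals a (exhaustive, small n only)."""
    return [x for x in product((0, 1), repeat=n) if vt_syndrome(x) == a]

def brute_force_decode(y, n, a=0):
    """Return every codeword of the binary VT code that yields y after one deletion.

    Tries each re-insertion of a single bit into y and keeps the words
    whose syndrome matches a. This is an illustrative search, not an
    efficient decoder.
    """
    candidates = set()
    for pos in range(n):          # y has length n - 1, so n insertion points
        for bit in (0, 1):
            x = y[:pos] + (bit,) + y[pos:]
            if vt_syndrome(x) == a:
                candidates.add(x)
    return candidates

if __name__ == "__main__":
    n = 6
    for codeword in vt_codebook(n):
        for i in range(n):
            received = codeword[:i] + codeword[i + 1:]
            # A single deletion is always resolved to one unique codeword.
            assert brute_force_decode(received, n) == {codeword}
```

The check in the `__main__` block deletes every position of every codeword and confirms that exactly one codeword is consistent with the received word, which is the property that makes the code single-deletion correcting.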

    t-Deletion-s-Insertion-Burst Correcting Codes

    Motivated by applications in DNA-based storage and communication systems, we study deletion and insertion errors simultaneously in a burst. In particular, we study a type of error named $t$-deletion-$s$-insertion-burst ($(t,s)$-burst for short) which is a generalization of the $(2,1)$-burst error proposed by Schoeny et al. Such an error deletes $t$ consecutive symbols and inserts an arbitrary sequence of length $s$ at the same coordinate. We provide a sphere-packing upper bound on the size of binary codes that can correct a $(t,s)$-burst error, showing that the redundancy of such codes is at least $\log n + t - 1$. For $t \geq 2s$, an explicit construction of binary $(t,s)$-burst correcting codes with redundancy $\log n + (t-s-1)\log\log n + O(1)$ is given. In particular, we construct a binary $(3,1)$-burst correcting code with redundancy at most $\log n + 9$, which is optimal up to a constant. Comment: Part of this work (the (t,1)-burst model) was presented at ISIT2022. This full version has been submitted to IEEE-IT in August 202
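The error model described in this abstract can be illustrated with a short sketch: delete t consecutive symbols and insert an arbitrary length-s sequence at the same coordinate. The helper names `apply_burst` and `burst_ball` are hypothetical and only model the channel; they are not the paper's code construction.

```python
from itertools import product

def apply_burst(x, pos, t, inserted):
    """Delete t consecutive symbols of x starting at pos, then insert
    `inserted` (of length s) at the same coordinate."""
    if t < 0 or pos < 0 or pos + t > len(x):
        raise ValueError("burst does not fit inside the sequence")
    return x[:pos] + inserted + x[pos + t:]

def burst_ball(x, t, s, alphabet="01"):
    """All sequences reachable from x by exactly one (t, s)-burst."""
    outcomes = set()
    for pos in range(len(x) - t + 1):
        for ins in product(alphabet, repeat=s):
            outcomes.add(apply_burst(x, pos, t, "".join(ins)))
    return outcomes

if __name__ == "__main__":
    # A (3,1)-burst on a DNA-like string: three symbols removed, one inserted.
    y = apply_burst("ACGTACGT", pos=2, t=3, inserted="T")
    print(y, len(y))  # the length shrinks by t - s = 2
    print(sorted(burst_ball("0110", t=2, s=1)))
```

A code corrects a (t,s)-burst when the balls produced by `burst_ball` for distinct codewords never intersect, which is the condition behind the sphere-packing bound the abstract mentions.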

    Scalable string reconciliation by recursive content-dependent shingling

    We consider the problem of reconciling similar strings in a distributed system. Specifically, we are interested in performing this reconciliation efficiently, minimizing the communication cost. Our problem applies to several types of large-scale distributed networks, file synchronization utilities, and any system that manages the consistency of string-encoded ordered data. We present the novel Recursive Content-Dependent Shingling (RCDS) protocol, which can handle large strings and has communication complexity that scales with the edit distance between the reconciling strings. We also provide analysis, experimental results, and comparisons between an implementation of our protocol and existing synchronization software such as the Rsync utility.
    2019-12-03