9 research outputs found

    On Codes for the Noisy Substring Channel

    We consider the problem of coding for the substring channel, in which information strings are observed only through their (multisets of) substrings. Interest in this channel has been renewed in recent years because of applications to DNA-based data storage, owing to DNA sequencing techniques. In contrast to the existing literature, we consider a noisy channel model, where information is subject to noise \emph{before} its substrings are sampled, motivated by in-vivo storage. We study two separate noise models: substitutions and deletions. In both cases, we examine families of codes which may be utilized for error correction and present combinatorial bounds. Through a generalization of the concept of repeat-free strings, we show that the added redundancy required by this imperfect-observation assumption is sublinear, either when the fraction of errors in the observed substring length is sufficiently small, or when that length is sufficiently long. This suggests that no asymptotic cost in rate is incurred by this channel model in these cases. Comment: ISIT 2021 version (including all proofs
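As a rough illustration of the observation model in the abstract above, the sketch below computes the multiset of fixed-length substrings of a string and checks a repeat-free condition. The function names, the example string, and the reading of "repeat-free" as "no length-`length` substring occurs twice" are assumptions made for illustration; they are not taken from the paper, and the noise step is not modeled.

```python
from collections import Counter


def substring_multiset(s: str, length: int) -> Counter:
    """Multiset of all contiguous substrings of `s` with the given length."""
    return Counter(s[i:i + length] for i in range(len(s) - length + 1))


def is_repeat_free(s: str, length: int) -> bool:
    """True if no substring of the given length occurs more than once in `s`.

    This is an assumed, simplified reading of "repeat-free".
    """
    return all(count == 1 for count in substring_multiset(s, length).values())


if __name__ == "__main__":
    word = "ACGTAC"  # hypothetical example string
    print(substring_multiset(word, 2))
    print(is_repeat_free(word, 2), is_repeat_free(word, 3))
```

In this toy example the length-2 substring "AC" occurs twice, so the string is not repeat-free at length 2, while all of its length-3 substrings are distinct.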

    Snake-in-the-Box Codes for Rank Modulation

    Motivated by the rank-modulation scheme with applications to flash memory, we consider Gray codes capable of detecting a single error, also known as snake-in-the-box codes. We study two error metrics: Kendall's $\tau$-metric, which applies to charge-constrained errors, and the $\ell_\infty$-metric, which is useful in the case of limited-magnitude errors. In both cases we construct snake-in-the-box codes with rate asymptotically tending to 1. We also provide efficient successor-calculation functions, as well as ranking and unranking functions. Finally, we also study bounds on the parameters of such codes
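For readers unfamiliar with the two metrics named in the abstract above, the following sketch computes them for permutations given as lists. It uses the standard definitions (Kendall's $\tau$ as the number of pairwise order disagreements, $\ell_\infty$ as the largest coordinate-wise difference); the function names and the list representation are assumptions for illustration, and nothing here reproduces the paper's code constructions.

```python
from itertools import combinations


def kendall_tau_distance(p, q):
    """Number of element pairs that appear in opposite relative order in p and q."""
    position_in_q = {value: index for index, value in enumerate(q)}
    mapped = [position_in_q[value] for value in p]
    return sum(
        1
        for i, j in combinations(range(len(mapped)), 2)
        if mapped[i] > mapped[j]
    )


def l_inf_distance(p, q):
    """Largest absolute difference between corresponding entries of p and q."""
    return max(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    identity = [1, 2, 3]
    print(kendall_tau_distance([2, 1, 3], identity))
    print(l_inf_distance([3, 2, 1], identity))
```

The quadratic pair count is chosen for readability; a merge-sort-based inversion count would be the usual choice for long permutations.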

    Generalized Unique Reconstruction from Substrings

    This paper introduces a new family of reconstruction codes which is motivated by applications in DNA data storage and sequencing. In such applications, DNA strands are sequenced by reading some subset of their substrings. While previous works considered two extreme cases in which all substrings of pre-defined lengths are read or substrings are read with no overlap for the single string case, this work studies two extensions of this paradigm. The first extension considers the setup in which consecutive substrings are read with some given minimum overlap. First, an upper bound is provided on the attainable rates of codes that guarantee unique reconstruction. Then, efficient constructions of codes that asymptotically meet that upper bound are presented. In the second extension, we study the setup where multiple strings are reconstructed together. Given the number of strings and their length, we first derive a lower bound on the read substrings' length $\ell$ that is necessary for the existence of multi-strand reconstruction codes with non-vanishing rates. We then present two constructions of such codes and show that their rates approach 1 for values of $\ell$ that asymptotically behave like the lower bound. Comment: arXiv admin note: text overlap with arXiv:2205.0393

    Adversarial Torn-paper Codes

    We study the adversarial torn-paper channel. This problem is motivated by applications in DNA data storage where the DNA strands that carry information may break into smaller pieces which are received out of order. Our model extends the previously researched probabilistic setting to the worst case. We develop code constructions for any parameters of the channel for which a non-vanishing asymptotic rate is possible, and show that our constructions achieve asymptotically optimal rate while allowing for efficient encoding and decoding. Finally, we extend our results to related settings including multi-strand storage, presence of substitution errors, or incomplete coverage. Comment: Journal submissio

    Single-Error Detection and Correction for Duplication and Substitution Channels
