
    Codes for DNA Storage Channels

    We consider the problem of assembling a sequence based on a collection of its substrings observed through a noisy channel. The mathematical basis of the problem is the construction and design of sequences that may be discriminated based on a collection of their substrings observed through a noisy channel. We explain the connection between the sequence reconstruction problem and the problem of DNA synthesis and sequencing, and introduce the notion of a DNA storage channel. We analyze the number of sequence equivalence classes under the channel mapping and propose new asymmetric coding techniques to combat the effects of synthesis and sequencing noise. In our analysis, we make use of restricted de Bruijn graphs and Ehrhart theory for rational polytopes. Comment: 32 pages, 5 figures
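
As a rough illustration of why sequences can fall into the same equivalence class, the sketch below assumes an idealized, noiseless channel that reports only the multiset of length-k substrings of a sequence. The function name `substring_profile` and the two example sequences are hypothetical and are not taken from the paper; the paper's channel additionally models synthesis and sequencing noise.

```python
from collections import Counter


def substring_profile(seq: str, k: int) -> Counter:
    """Return the multiset of all length-k substrings of seq.

    Assumption: a noiseless idealization of the channel, in which
    every length-k substring is observed exactly once per occurrence.
    """
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Two distinct sequences (hypothetical examples) that traverse the same
# edges of the order-2 de Bruijn graph in a different order.
a = "ATCTGTA"
b = "ATGTCTA"

# With k = 2 the profiles coincide, so the sequences are indistinguishable.
same_at_k2 = substring_profile(a, 2) == substring_profile(b, 2)

# With k = 3 the profiles differ, so longer substrings separate them.
same_at_k3 = substring_profile(a, 3) == substring_profile(b, 3)

print(same_at_k2, same_at_k3)
```

In this toy setting, a code for the channel would keep at most one sequence from each group of sequences that share a profile.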

    On Optimal Family of Codes for Archival DNA Storage

    DNA-based storage systems have received attention from many researchers. These include archival and rewritable random-access DNA-based storage systems. In this work, we have developed an efficient technique to encode data into DNA sequences using non-linear families of ternary codes. In particular, we propose an algorithm to encode data into DNA with high information storage density and better error correction using a subcode of the Golay code. Theoretically, 115 exabytes (EB) of data can be stored in one gram of DNA by our method. Comment: The supplementary file and the software DNA Cloud 2.0 are available at http://www.guptalab.org/dnacloud. This is the preliminary version of the paper that appeared in Proceedings of IWSDA 2015, pp. 143--14