Reconstruction Codes for DNA Sequences with Uniform Tandem-Duplication Errors
DNA as a data storage medium has several advantages, including far greater data density than electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction problem, which is applicable whenever multiple noisy reads of the data are available. This strategy is uniquely suited to the medium, which inherently replicates the stored data into multiple distinct copies, each altered by its own mutations. We consider noise introduced solely by uniform tandem-duplication, and utilize the relation to constant-weight integer codes in the Manhattan metric. By bounding the intersection of the cross-polytope with hyperplanes, we prove the existence of reconstruction codes with greater capacity than known error-correcting codes, which we can determine analytically for any set of parameters.

Comment: 11 pages, 2 figures, LaTeX; version accepted for publication
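A uniform tandem-duplication error repeats a fixed-length substring immediately after itself. The following minimal sketch (not from the paper; function name and example are illustrative) shows how a single such error acts on a DNA string:

```python
def tandem_duplicate(s, i, k):
    """Insert a copy of the length-k substring starting at position i
    directly after it, e.g. ACGT with i=1, k=2 becomes ACGCGT."""
    assert 0 <= i and i + k <= len(s)
    return s[:i + k] + s[i:i + k] + s[i + k:]

# In the uniform model, every duplication has the same fixed length k.
print(tandem_duplicate("ACGT", 1, 2))  # ACGCGT
```

Each noisy read is the result of some sequence of such duplications, which is what makes multiple reads informative for reconstruction.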
Systematic Codes for Correcting Deletion/Insertion of One Zero in Each and Every Bucket of Zeros
In this thesis, we propose a systematic code for correcting t = 1 insertion/deletion errors of the character "0" that can occur between any two consecutive 1's in a binary string. The code requires balanced input strings, where each word of length n contains ⌈n/2⌉ 0's and ⌊n/2⌋ 1's. This error model is shown to be related to zero-error capacity-achieving codes for a limited-magnitude error channel. We prove that the inputs can be partitioned into different subsets and that the words in the same subset can be assigned a unique check for this error model. We deduce that the number of checks required is upper-bounded by 2w, where w is the weight of the input. Efficient encoding and decoding algorithms are provided. Our algorithms return variable-length checks and may require up to r = 3w check bits. While the optimal rate for this error model is not known, we establish that our rate lies between 0.4 and 0.666 and demonstrate potential avenues for improvement.
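The "bucket" view of this error model can be made concrete: a binary word is described by the run-lengths of 0's delimited by its 1's, and inserting or deleting one 0 per bucket changes each entry of that vector by at most 1, i.e. a limited-magnitude channel. A small sketch (illustrative helper, not the thesis's encoder):

```python
def zero_buckets(word):
    """Run-lengths of 0s between consecutive 1s (including the ends):
    '0100110' -> [1, 2, 0, 1]."""
    buckets, run = [], 0
    for b in word:
        if b == '1':
            buckets.append(run)
            run = 0
        else:
            run += 1
    buckets.append(run)  # trailing bucket after the last 1
    return buckets

# One 0 inserted/deleted per bucket shifts each entry by at most +/-1.
print(zero_buckets("0100110"))  # [1, 2, 0, 1]
```

Correcting the channel then amounts to recovering the original bucket vector from a vector whose entries are each off by at most one.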
On Conflict Free DNA Codes
DNA storage has emerged as an important area of research. The reliability of a DNA storage system depends on designing DNA strings (called DNA codes) that are sufficiently dissimilar. In this work, we introduce DNA codes that satisfy a special constraint: in each codeword, no two consecutive sub-strings are identical (a generalization of the homopolymer constraint). This is in addition to the usual constraints such as Hamming, reverse, reverse-complement and GC-content. We believe that the new constraint will further help reduce errors while reading and writing data into synthetic DNA strings. We also present a construction (based on a variant of a stochastic local search algorithm) to calculate the size of DNA codes with all the above constraints, which improves the lower bounds from the existing literature for some specific cases. Moreover, a recursive isometric map between binary vectors and DNA strings is proposed. Using this map and well-known binary codes, we obtain a few classes of DNA codes satisfying all the constraints, including the property that the constructed DNA codewords are free from hairpin-like secondary structures.

Comment: 12 pages, Draft (Table VI and Table VII are updated)
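The conflict-free constraint — no sub-string immediately followed by an identical copy of itself — can be checked by brute force. A minimal sketch (not the paper's construction; a quadratic-time illustration of the property itself):

```python
def has_adjacent_repeat(s):
    """True if some substring of s is immediately followed by an
    identical copy. Length-1 repeats are exactly homopolymer runs,
    so this generalizes the homopolymer constraint."""
    n = len(s)
    for k in range(1, n // 2 + 1):
        for i in range(n - 2 * k + 1):
            if s[i:i + k] == s[i + k:i + 2 * k]:
                return True
    return False

print(has_adjacent_repeat("ACGA"))    # False: conflict-free
print(has_adjacent_repeat("ACGCGT"))  # True: 'CG' appears twice in a row
```

A codeword is admissible under the new constraint exactly when this predicate is False (in combinatorics on words, such strings are called square-free).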
Codes for Correcting Asymmetric Adjacent Transpositions and Deletions
Codes in the Damerau--Levenshtein metric have been extensively studied recently owing to their applications in DNA-based data storage. In particular, Gabrys, Yaakobi, and Milenkovic (2017) designed a length- code correcting a single deletion and adjacent transpositions with at most bits of redundancy. In this work, we consider a new setting where both asymmetric adjacent transpositions (also known as right-shifts or left-shifts) and deletions may occur. We present several constructions of codes correcting these errors in various cases. In particular, we design a code correcting a single deletion, right-shift, and left-shift errors with at most bits of redundancy, where . In addition, we investigate codes correcting -deletions, right-shifts, and left-shifts with both unique-decoding and list-decoding algorithms. Our main contribution here is the construction of a list-decodable code with list size and with at most bits of redundancy, where . Finally, we construct both non-systematic and systematic codes for correcting blocks of -deletions with -limited-magnitude and adjacent transpositions.
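The two asymmetric error types are easy to state concretely: a right-shift turns a "01" pattern into "10" (moving a 1 to the right is equivalent to moving that 0 left), a left-shift does the reverse, and a deletion drops one symbol. A minimal sketch of the channel (helper names are illustrative, not from the paper):

```python
def right_shift(s, i):
    """Asymmetric adjacent transposition: '01' at position i becomes '10'."""
    assert s[i:i + 2] == '01'
    return s[:i] + '10' + s[i + 2:]

def left_shift(s, i):
    """The opposite transposition: '10' at position i becomes '01'."""
    assert s[i:i + 2] == '10'
    return s[:i] + '01' + s[i + 2:]

def delete(s, i):
    """Deletion error: remove the symbol at position i."""
    return s[:i] + s[i + 1:]

# A received word is the result of some mix of these errors.
print(right_shift("0011", 1))  # 0101
print(delete("0101", 2))       # 011
```

Because right-shifts and left-shifts are counted separately here, the codes can trade redundancy against the number of each error type, unlike the symmetric Damerau--Levenshtein setting.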