7 research outputs found
Low-redundancy codes for correcting multiple short-duplication and edit errors
Due to its higher data density, longevity, energy efficiency, and ease of
generating copies, DNA is considered a promising storage technology for
satisfying future needs. However, a diverse set of errors including deletions,
insertions, duplications, and substitutions may arise in DNA at different
stages of data storage and retrieval. The current paper constructs
error-correcting codes for simultaneously correcting short (tandem)
duplications and at most edits, where a short duplication generates a copy
of a substring with length and inserts the copy following the original
substring, and an edit is a substitution, deletion, or insertion. Compared to
the state-of-the-art codes for duplications only, the proposed codes correct up
to edits (in addition to duplications) at the additional cost of roughly
symbols of redundancy, thus achieving the same
asymptotic rate, where is the alphabet size and is a constant.
Furthermore, the time complexities of both the encoding and decoding processes
are polynomial when is a constant with respect to the code length.Comment: 21 pages. The paper has been submitted to IEEE Transaction on
Information Theory. Furthermore, the paper was presented in part at the
ISIT2021 and ISIT202