Low-redundancy codes for correcting multiple short-duplication and edit errors

Farnoud, Farzad; Gabrys, Ryan; Lou, Hao; Tang, Yuanyuan; Wang, Shuche

Authors: Farzad Farnoud, Ryan Gabrys, Hao Lou, Yuanyuan Tang, Shuche Wang
Publication date: 3 August 2022
Abstract

Due to its higher data density, longevity, energy efficiency, and ease of generating copies, DNA is considered a promising storage technology for satisfying future needs. However, a diverse set of errors including deletions, insertions, duplications, and substitutions may arise in DNA at different stages of data storage and retrieval. The current paper constructs error-correcting codes for simultaneously correcting short (tandem) duplications and at most $p$ edits, where a short duplication generates a copy of a substring with length $\leq 3$ and inserts the copy following the original substring, and an edit is a substitution, deletion, or insertion. Compared to the state-of-the-art codes for duplications only, the proposed codes correct up to $p$ edits (in addition to duplications) at the additional cost of roughly $8p(\log_q n)(1+o(1))$ symbols of redundancy, thus achieving the same asymptotic rate, where $q \ge 4$ is the alphabet size and $p$ is a constant. Furthermore, the time complexities of both the encoding and decoding processes are polynomial when $p$ is a constant with respect to the code length.

Comment: 21 pages. The paper has been submitted to IEEE Transactions on Information Theory. Furthermore, the paper was presented in part at the ISIT2021 and ISIT202
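The error model described in the abstract can be illustrated with a minimal Python sketch. It only simulates the channel operations (a short tandem duplication of length at most 3, and the three edit types) and evaluates the leading term of the stated redundancy cost; it is not the paper's code construction, encoder, or decoder. All function names here are hypothetical, and the `(1+o(1))` factor is ignored in `extra_redundancy`.

```python
import math


def tandem_duplicate(s: str, start: int, length: int) -> str:
    """Copy s[start:start+length] and insert the copy right after the original."""
    if not 1 <= length <= 3:
        raise ValueError("a short duplication copies a substring of length 1 to 3")
    end = start + length
    if start < 0 or end > len(s):
        raise ValueError("substring out of range")
    return s[:end] + s[start:end] + s[end:]


def substitute(s: str, pos: int, symbol: str) -> str:
    """Edit: replace the symbol at position pos."""
    if not 0 <= pos < len(s):
        raise ValueError("position out of range")
    return s[:pos] + symbol + s[pos + 1:]


def delete(s: str, pos: int) -> str:
    """Edit: remove the symbol at position pos."""
    if not 0 <= pos < len(s):
        raise ValueError("position out of range")
    return s[:pos] + s[pos + 1:]


def insert(s: str, pos: int, symbol: str) -> str:
    """Edit: insert a symbol before position pos (pos == len(s) appends)."""
    if not 0 <= pos <= len(s):
        raise ValueError("position out of range")
    return s[:pos] + symbol + s[pos:]


def extra_redundancy(p: int, q: int, n: int) -> float:
    """Leading term 8 * p * log_q(n) of the additional redundancy, in symbols.

    Assumption: the (1 + o(1)) factor from the abstract is dropped.
    """
    return 8 * p * math.log(n) / math.log(q)


if __name__ == "__main__":
    word = "ACGT"
    print(tandem_duplicate(word, 1, 2))  # duplicates the substring "CG"
    print(substitute(word, 0, "T"), delete(word, 3), insert(word, 4, "A"))
    # The extra cost grows like log n, so its share of the length n shrinks.
    for n in (10**3, 10**6, 10**9):
        print(n, extra_redundancy(p=2, q=4, n=n) / n)
```

Because the additional cost grows logarithmically in $n$ for constant $p$, its ratio to the code length tends to zero, which is consistent with the abstract's claim that the asymptotic rate is unchanged.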

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2208.02330

Last time updated on 06/10/2022