Anchor-Based Correction of Substitutions in Indexed Sets
Motivated by DNA-based data storage, we investigate a system where digital
information is stored in an unordered set of several vectors over a finite
alphabet. Each vector begins with a unique index that represents its position
in the whole data set and does not contain data. This paper deals with the
design of error-correcting codes for such indexed sets in the presence of
substitution errors. We propose a construction that efficiently deals with the
challenges that arise when designing codes for unordered sets. Using a novel
mechanism, called anchoring, we show that it is possible to combat the ordering
loss of sequences with only a small amount of redundancy, which allows the use of
standard coding techniques, such as tensor-product codes, to correct errors
within the sequences. We finally derive upper and lower bounds on the
achievable redundancy of codes within the considered channel model and verify
that our construction yields a redundancy close to the best achievable.
Surprisingly, our results indicate that correcting errors in the indices
requires less redundancy than correcting errors in the data part of the vectors.
Comment: 5 page
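The storage model in this abstract can be illustrated with a minimal sketch. It is not the paper's anchoring construction; it only shows plain index prefixes over an assumed binary alphabet, and why a single substitution in an index is enough to lose the ordering. The names `encode`, `decode`, and `substitute` are hypothetical and chosen for illustration.

```python
def encode(data_blocks, index_len):
    """Prefix each data block with its position as a fixed-width binary index.

    The result is a set, so the order of the vectors is deliberately lost.
    """
    return {format(i, f"0{index_len}b") + block
            for i, block in enumerate(data_blocks)}


def decode(stored, index_len):
    """Recover the original order by sorting on the index prefix.

    This only works in the error-free case (assumption of this sketch).
    """
    ordered = sorted(stored, key=lambda v: int(v[:index_len], 2))
    return [v[index_len:] for v in ordered]


def substitute(vector, pos):
    """Model one substitution error by flipping the binary symbol at pos."""
    flipped = "1" if vector[pos] == "0" else "0"
    return vector[:pos] + flipped + vector[pos + 1:]


blocks = ["1100", "0011", "1010", "0101"]
stored = encode(blocks, index_len=2)
recovered = decode(stored, index_len=2)

# Corrupt the second index symbol of the vector stored at position 0.
corrupted = set(stored)
corrupted.remove("001100")
corrupted.add(substitute("001100", 1))

# Two vectors now claim index "01" and no vector claims index "00".
indices = sorted(v[:2] for v in corrupted)
```

Without errors, `recovered` equals `blocks`. After the substitution, `indices` contains a duplicate, so sorting by index no longer determines the original order; this is the ordering loss that the abstract's construction is designed to combat.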
Robust Indexing for the Sliced Channel: Almost Optimal Codes for Substitutions and Deletions
Encoding data as a set of unordered strings is receiving great attention as
it captures one of the basic features of DNA storage systems. However, the
challenge of constructing optimal-redundancy codes for this channel has remained
elusive. In this paper, we address this problem and present an order-wise
optimal construction of codes that are capable of correcting multiple
substitution, deletion, and insertion errors for this channel model. The key
ingredient in the code construction is a technique we call robust indexing:
simultaneously assigning indices to unordered strings (hence, creating order)
and also embedding information in these indices. The encoded indices are
resilient to substitution, deletion, and insertion errors, and therefore, so is
the entire code.