Protecting the Future of Information: LOCO Coding With Error Detection for DNA Data Storage
DNA strands serve as a storage medium for 4-ary data over the alphabet
{A, T, G, C}. DNA data storage promises formidable information density,
long-term durability, and ease of replicability. However, information in this
intriguing storage technology might be corrupted. Experiments have revealed
that DNA sequences with long homopolymers and/or with low GC-content are
notably more subject to errors upon storage.
This paper investigates the utilization of the recently-introduced method for
designing lexicographically-ordered constrained (LOCO) codes in DNA data
storage. This paper introduces DNA LOCO (D-LOCO) codes, over the alphabet
{A, T, G, C} with limited runs of identical symbols. These codes come with an
encoding-decoding rule we derive, which provides affordable encoding-decoding
algorithms. In terms of storage overhead, the proposed encoding-decoding
algorithms outperform those in the existing literature. Our algorithms are
readily reconfigurable. D-LOCO codes are intrinsically balanced, which allows
us to achieve balancing over the entire DNA strand with minimal rate penalty.
Moreover, we propose four schemes to bridge consecutive codewords, three of
which guarantee single substitution error detection per codeword. We examine
the probability of undetected errors. We also show that D-LOCO codes are
capacity-achieving and that they offer remarkably high rates at moderate
lengths.
Comment: 14 pages (double column), 3 figures, submitted to the IEEE
Transactions on Molecular, Biological and Multi-scale Communications (TMBMC)
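The two constraints named in the abstract above (limited runs of identical symbols and adequate GC-content) can be illustrated with a small checker. This is a minimal sketch and not the D-LOCO encoding-decoding rule itself; the function names and the thresholds `max_run`, `gc_low`, and `gc_high` are hypothetical choices made for illustration only.

```python
from itertools import groupby


def max_run_length(strand: str) -> int:
    """Length of the longest run of identical symbols (homopolymer)."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def gc_content(strand: str) -> float:
    """Fraction of symbols that are G or C; 0.0 for an empty strand."""
    if not strand:
        return 0.0
    return sum(symbol in "GC" for symbol in strand) / len(strand)


def satisfies_constraints(strand: str, max_run: int = 3,
                          gc_low: float = 0.4, gc_high: float = 0.6) -> bool:
    """Check a strand against illustrative (assumed) thresholds."""
    return (max_run_length(strand) <= max_run
            and gc_low <= gc_content(strand) <= gc_high)
```

For example, `satisfies_constraints("ATGCATGC")` accepts the strand, while `satisfies_constraints("AAAATGC")` rejects it because of the run of four A symbols.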
Eliminating Media Noise While Preserving Storage Capacity: Reconfigurable Constrained Codes for Two-Dimensional Magnetic Recording
Magnetic recording devices are still competitive in the storage density race
with solid-state devices thanks to new technologies such as two-dimensional
magnetic recording (TDMR). Advanced data processing schemes are needed to
guarantee reliability in TDMR. Data patterns where a bit is surrounded by
complementary bits at the four positions with Manhattan distance 1 on the
TDMR grid are called plus isolation (PIS) patterns, and they are error-prone.
Recently, we introduced lexicographically-ordered constrained (LOCO) codes,
namely optimal plus LOCO (OP-LOCO) codes, that prevent these patterns from
being written in a TDMR device. However, in the high-density regime or the
low-energy regime, additional error-prone patterns emerge, specifically data
patterns where a bit is surrounded by complementary bits at only three
positions with Manhattan distance 1, and we call them incomplete plus
isolation (IPIS) patterns. In this paper, we present capacity-achieving codes
that forbid both PIS and IPIS patterns in TDMR systems with wide read heads. We
collectively call the PIS and IPIS patterns rotated T isolation (RTIS)
patterns, and we call the new codes optimal T LOCO (OT-LOCO) codes. We analyze
OT-LOCO codes and present their simple encoding-decoding rule that allows
reconfigurability. We also present a novel bridging idea for these codes to
further increase the rate. Our simulation results demonstrate that OT-LOCO
codes are capable of eliminating media noise effects entirely at practical TD
densities with high rates. To further preserve the storage capacity, we suggest
using OP-LOCO codes early in the device lifetime, then employing the
reconfiguration property to switch to OT-LOCO codes later. While the point of
reconfiguration on the density/energy axis is decided manually at the moment,
the next step is to use machine learning to make that decision based on the
TDMR device status.
Comment: 15 pages (double column), 11 figures, submitted to the IEEE
Transactions on Magnetics (TMAG). arXiv admin note: text overlap with
arXiv:2010.1068
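The PIS and IPIS definitions in the abstract above can be made concrete with a small pattern detector. This is a sketch under stated assumptions, not the OT-LOCO construction: the grid is a list of rows of 0/1 bits, only interior cells (those with all four neighbours present) are examined, and the function names are hypothetical.

```python
def count_complementary_neighbours(grid, r, c):
    """Number of the four Manhattan-distance-1 neighbours that differ from grid[r][c]."""
    center = grid[r][c]
    neighbours = (grid[r - 1][c], grid[r + 1][c], grid[r][c - 1], grid[r][c + 1])
    return sum(bit != center for bit in neighbours)


def find_isolation_patterns(grid):
    """Return (pis, ipis): interior positions with 4 or exactly 3 complementary neighbours."""
    pis, ipis = [], []
    for r in range(1, len(grid) - 1):
        for c in range(1, len(grid[0]) - 1):
            n = count_complementary_neighbours(grid, r, c)
            if n == 4:
                pis.append((r, c))
            elif n == 3:
                ipis.append((r, c))
    return pis, ipis
```

A grid is free of RTIS patterns in this simplified sense when both returned lists are empty. How boundary cells are treated in an actual TDMR system is not specified here and is left out of the sketch.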
CHANNEL CODING TECHNIQUES FOR A MULTIPLE TRACK DIGITAL MAGNETIC RECORDING SYSTEM
In magnetic recording, greater areal bit packing densities are achieved through increasing
track density by reducing space between and width of the recording tracks, and/or
reducing the wavelength of the recorded information. This leads to the requirement of
higher precision tape transport mechanisms and dedicated coding circuitry.
A TMS32010 digital signal processor is applied to a standard low-cost, low-precision,
multiple-track, compact cassette tape recording system. Advanced signal processing and
coding techniques are employed to maximise recording density and to compensate for
the mechanical deficiencies of this system. Parallel software encoding/decoding
algorithms have been developed for several Run-Length Limited modulation codes. The
results for a peak detection system show that Bi-Phase L code can be reliably employed
up to a data rate of 5 kbits/second/track. Development of a second system employing a
TMS32025 and sampling detection permitted the utilisation of adaptive equalisation to
slim the readback pulse. Application of conventional read equalisation techniques, that
oppose inter-symbol interference, resulted in a 30% increase in performance.
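As an illustration of the Bi-Phase L code mentioned above, the following minimal sketch encodes each data bit as two half-bit levels with a guaranteed mid-bit transition. The polarity convention (1 as high-then-low, 0 as low-then-high) is an assumption, since conventions differ between standards, and the function names are hypothetical rather than taken from the thesis software.

```python
def biphase_l_encode(bits):
    """Map each data bit to two half-bit levels with a mid-bit transition."""
    halves = []
    for bit in bits:
        # Assumed convention: 1 -> (1, 0), 0 -> (0, 1).
        halves.extend((1, 0) if bit else (0, 1))
    return halves


def biphase_l_decode(halves):
    """Recover data bits; reject input that lacks a mid-bit transition."""
    if len(halves) % 2:
        raise ValueError("expected an even number of half-bit levels")
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(first)
    return bits
```

Because every bit cell contains a transition, the code is self-clocking at the cost of a rate of 1/2, which is one reason it suits a low-precision transport mechanism.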
Further investigation shows that greater linear recording densities can be achieved by
employing Partial Response signalling and Maximum Likelihood Detection. Partial
response signalling schemes use controlled inter-symbol interference to increase
recording density at the expense of a multi-level read back waveform which results in an
increased noise penalty. Maximum Likelihood Sequence detection employs soft
decisions on the readback waveform to recover this loss. The associated modulation
coding techniques required for optimised operation of such a system are discussed.
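The multi-level readback waveform produced by controlled inter-symbol interference can be sketched as a convolution of the written symbols with a partial response target. The targets below are the standard class IV polynomial 1 - D^2 and the extended class IV polynomial (1 - D)(1 + D)^2; the use of 0/1 input symbols, a zero initial channel state, and the function name are assumptions made for illustration.

```python
PR4 = (1, 0, -1)        # 1 - D^2
EPR4 = (1, 1, -1, -1)   # (1 - D)(1 + D)^2 = 1 + D - D^2 - D^3


def partial_response(symbols, taps):
    """Noise-free channel output: input symbols convolved with the target taps."""
    output = []
    for n in range(len(symbols)):
        # Terms with n - k < 0 are dropped, i.e. the channel starts in the zero state.
        output.append(sum(tap * symbols[n - k]
                          for k, tap in enumerate(taps) if n - k >= 0))
    return output
```

With binary inputs, the PR4 output takes values in {-1, 0, 1} and the EPR4 output in {-2, ..., 2}. The additional levels are what give rise to the noise penalty that sequence detection is meant to recover.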
Two-dimensional run-length-limited (d, ky) modulation codes provide a further means of
increasing storage capacity in multi-track recording systems. For example, the code rate
of a single-track run-length-limited code with constraints (1, 3), such as Miller code, can
be increased by over 25% when using a 4-track two-dimensional code with the same d
constraint and with the k constraint satisfied across a number of parallel channels. The k
constraint along an individual track, kx, can be increased without loss of clock
synchronisation since the clocking information derived by frequent signal transitions
can be sub-divided across a number, y, of parallel tracks in terms of a ky constraint. This
permits more code words to be generated for a given (d, k) constraint in two dimensions
than is possible in one dimension. This coding technique is furthered by development of
a reverse enumeration scheme based on the trellis description of the (d, ky) constraints.
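One reading of the (d, ky) constraint described above is that d is enforced on each track individually while k is enforced jointly: no more than k consecutive time positions may pass without a transition on at least one track. The sketch below checks that reading. It assumes a 1 marks a transition, all tracks have equal length, and the function names are hypothetical; it is not the reverse enumeration scheme itself.

```python
def satisfies_d(track, d):
    """At least d zeros must separate consecutive ones on a single track."""
    last_one = None
    for i, bit in enumerate(track):
        if bit == 1:
            if last_one is not None and i - last_one - 1 < d:
                return False
            last_one = i
    return True


def satisfies_joint_k(tracks, k):
    """At most k consecutive time positions may be zero on every track at once."""
    run = 0
    for column in zip(*tracks):
        if any(column):
            run = 0
        else:
            run += 1
            if run > k:
                return False
    return True


def satisfies_d_ky(tracks, d, k):
    """Per-track d constraint combined with the joint k constraint."""
    return all(satisfies_d(t, d) for t in tracks) and satisfies_joint_k(tracks, k)
```

A track with five zeros between ones fails a single-track (1, 3) check but can be admissible in a four-track block if the other tracks supply transitions in between, which is why more codewords become available. For scale, Miller code has rate 1/2, so an increase of over 25% corresponds to a rate above 0.625.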
The application of a two-dimensional code to a high linear density system employing
extended class IV partial response signalling and maximum likelihood detection is
proposed. Finally, additional coding constraints to improve spectral response and error
performance are discussed.
Hewlett Packard, Computer Peripherals Division (Bristol