4 research outputs found

    Protecting the Future of Information: LOCO Coding With Error Detection for DNA Data Storage

    DNA strands serve as a storage medium for 4-ary data over the alphabet {A,T,G,C}. DNA data storage promises formidable information density, long-term durability, and ease of replicability. However, information in this intriguing storage technology might be corrupted. Experiments have revealed that DNA sequences with long homopolymers and/or with low GC-content are notably more subject to errors upon storage. This paper investigates the utilization of the recently-introduced method for designing lexicographically-ordered constrained (LOCO) codes in DNA data storage. This paper introduces DNA LOCO (D-LOCO) codes, over the alphabet {A,T,G,C} with limited runs of identical symbols. These codes come with an encoding-decoding rule we derive, which provides affordable encoding-decoding algorithms. In terms of storage overhead, the proposed encoding-decoding algorithms outperform those in the existing literature. Our algorithms are readily reconfigurable. D-LOCO codes are intrinsically balanced, which allows us to achieve balancing over the entire DNA strand with minimal rate penalty. Moreover, we propose four schemes to bridge consecutive codewords, three of which guarantee single substitution error detection per codeword. We examine the probability of undetected errors. We also show that D-LOCO codes are capacity-achieving and that they offer remarkably high rates at moderate lengths.
    Comment: 14 pages (double column), 3 figures, submitted to the IEEE Transactions on Molecular, Biological and Multi-scale Communications (TMBMC)
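The two properties the abstract singles out, long homopolymer runs and GC-content, can be checked directly on a strand. The following is a minimal sketch of such a check, not the D-LOCO encoding-decoding rule itself. The function names and the thresholds (`max_run=3`, a GC window of 0.4 to 0.6) are illustrative assumptions and do not come from the paper.

```python
def max_run_length(strand: str) -> int:
    """Length of the longest run of identical symbols (homopolymer) in the strand."""
    longest = 0
    current = 0
    previous = None
    for symbol in strand:
        current = current + 1 if symbol == previous else 1
        longest = max(longest, current)
        previous = symbol
    return longest


def gc_content(strand: str) -> float:
    """Fraction of symbols in the strand that are G or C (0.0 for an empty strand)."""
    if not strand:
        return 0.0
    return sum(symbol in "GC" for symbol in strand) / len(strand)


def satisfies_constraints(strand: str, max_run: int = 3,
                          gc_low: float = 0.4, gc_high: float = 0.6) -> bool:
    """Hypothetical acceptance test; the thresholds are assumptions, not the paper's."""
    return (max_run_length(strand) <= max_run
            and gc_low <= gc_content(strand) <= gc_high)


if __name__ == "__main__":
    for strand in ("ATGCATGC", "AAAAATGC", "ATATATAT"):
        print(strand, max_run_length(strand), gc_content(strand),
              satisfies_constraints(strand))
```

A constrained code such as the one described would guarantee the run-length property by construction, whereas this sketch only tests it after the fact.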

    Eliminating Media Noise While Preserving Storage Capacity: Reconfigurable Constrained Codes for Two-Dimensional Magnetic Recording

    Magnetic recording devices are still competitive in the storage density race with solid-state devices thanks to new technologies such as two-dimensional magnetic recording (TDMR). Advanced data processing schemes are needed to guarantee reliability in TDMR. Data patterns where a bit is surrounded by complementary bits at the four positions with Manhattan distance 1 on the TDMR grid are called plus isolation (PIS) patterns, and they are error-prone. Recently, we introduced lexicographically-ordered constrained (LOCO) codes, namely optimal plus LOCO (OP-LOCO) codes, that prevent these patterns from being written in a TDMR device. However, in the high-density regime or the low-energy regime, additional error-prone patterns emerge, specifically data patterns where a bit is surrounded by complementary bits at only three positions with Manhattan distance 1, and we call them incomplete plus isolation (IPIS) patterns. In this paper, we present capacity-achieving codes that forbid both PIS and IPIS patterns in TDMR systems with wide read heads. We collectively call the PIS and IPIS patterns rotated T isolation (RTIS) patterns, and we call the new codes optimal T LOCO (OT-LOCO) codes. We analyze OT-LOCO codes and present their simple encoding-decoding rule that allows reconfigurability. We also present a novel bridging idea for these codes to further increase the rate. Our simulation results demonstrate that OT-LOCO codes are capable of eliminating media noise effects entirely at practical TD densities with high rates. To further preserve the storage capacity, we suggest using OP-LOCO codes early in the device lifetime, then employing the reconfiguration property to switch to OT-LOCO codes later. While the point of reconfiguration on the density/energy axis is decided manually at the moment, the next step is to use machine learning to take that decision based on the TDMR device status.
    Comment: 15 pages (double column), 11 figures, submitted to the IEEE Transactions on Magnetics (TMAG). arXiv admin note: text overlap with arXiv:2010.1068
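The PIS and IPIS definitions above translate into a simple neighbor count on a binary grid. The sketch below scans a grid and labels each interior cell accordingly. It is only a detector that follows the wording of the abstract, not the OT-LOCO code construction. The function names are hypothetical, and restricting the scan to interior cells (those with all four neighbors present) is an assumption made here, since the abstract does not say how grid boundaries are treated.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]


def classify_isolation(grid: Grid, row: int, col: int) -> Optional[str]:
    """Label an interior cell as 'PIS', 'IPIS', or None.

    PIS: all four cells at Manhattan distance 1 hold the complementary bit.
    IPIS: exactly three of those four cells hold the complementary bit.
    """
    center = grid[row][col]
    neighbors = (grid[row - 1][col], grid[row + 1][col],
                 grid[row][col - 1], grid[row][col + 1])
    complementary = sum(bit != center for bit in neighbors)
    if complementary == 4:
        return "PIS"
    if complementary == 3:
        return "IPIS"
    return None


def find_rtis(grid: Grid) -> List[Tuple[int, int, str]]:
    """Return (row, col, label) for every interior cell that is a PIS or IPIS center.

    Assumption: boundary cells are skipped because they lack four neighbors.
    """
    hits = []
    for row in range(1, len(grid) - 1):
        for col in range(1, len(grid[0]) - 1):
            label = classify_isolation(grid, row, col)
            if label is not None:
                hits.append((row, col, label))
    return hits


if __name__ == "__main__":
    plus = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
    print(find_rtis(plus))
```

A code that forbids RTIS patterns would ensure that a scan like this returns an empty list for every written block.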

    CHANNEL CODING TECHNIQUES FOR A MULTIPLE TRACK DIGITAL MAGNETIC RECORDING SYSTEM

    In magnetic recording, greater areal bit packing densities are achieved through increasing track density by reducing the space between, and the width of, the recording tracks, and/or reducing the wavelength of the recorded information. This leads to the requirement of higher precision tape transport mechanisms and dedicated coding circuitry. A TMS32010 digital signal processor is applied to a standard low-cost, low-precision, multiple-track, compact cassette tape recording system. Advanced signal processing and coding techniques are employed to maximise recording density and to compensate for the mechanical deficiencies of this system. Parallel software encoding/decoding algorithms have been developed for several Run-Length Limited modulation codes. The results for a peak detection system show that Bi-Phase L code can be reliably employed up to a data rate of 5 kbits/second/track. Development of a second system employing a TMS32025 and sampling detection permitted the utilisation of adaptive equalisation to slim the readback pulse. Application of conventional read equalisation techniques, which oppose inter-symbol interference, resulted in a 30% increase in performance. Further investigation shows that greater linear recording densities can be achieved by employing Partial Response signalling and Maximum Likelihood Detection. Partial response signalling schemes use controlled inter-symbol interference to increase recording density at the expense of a multi-level readback waveform, which results in an increased noise penalty. Maximum Likelihood Sequence detection employs soft decisions on the readback waveform to recover this loss. The associated modulation coding techniques required for optimised operation of such a system are discussed. Two-dimensional run-length-limited (d, ky) modulation codes provide a further means of increasing storage capacity in multi-track recording systems. For example, the code rate of a single-track run-length-limited code with constraints (1, 3), such as Miller code, can be increased by over 25% when using a 4-track two-dimensional code with the same d constraint and with the k constraint satisfied across a number of parallel channels. The k constraint along an individual track, kx, can be increased without loss of clock synchronisation since the clocking information derived from frequent signal transitions can be sub-divided across a number, y, of parallel tracks in terms of a ky constraint. This permits more code words to be generated for a given (d, k) constraint in two dimensions than is possible in one dimension. This coding technique is furthered by development of a reverse enumeration scheme based on the trellis description of the (d, ky) constraints. The application of a two-dimensional code to a high linear density system employing extended class IV partial response signalling and maximum likelihood detection is proposed. Finally, additional coding constraints to improve spectral response and error performance are discussed.
    Hewlett Packard, Computer Peripherals Division (Bristol)
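As a baseline for the single-track (1, 3) example above, the sketch below estimates the capacity of a one-dimensional (d, k) run-length-limited constraint, which is the highest rate any single-track code obeying that constraint can approach. It uses the standard state-graph model, where the state is the number of zeros written since the last one, and applies power iteration to find the growth rate of the number of valid sequences. It does not implement the thesis's two-dimensional (d, ky) codes or its reverse enumeration scheme. The function name and the iteration count are assumptions made for illustration.

```python
import math


def rll_capacity(d: int, k: int, iterations: int = 200) -> float:
    """Estimate the capacity (bits per symbol) of a 1-D (d, k) RLL constraint.

    State i means i zeros have been written since the last one.
    A zero may follow while i < k; a one may follow once i >= d.
    Capacity is log2 of the largest eigenvalue of this state graph,
    approximated here by power iteration.
    """
    num_states = k + 1
    weights = [1.0] * num_states
    growth = 1.0
    for _ in range(iterations):
        updated = [0.0] * num_states
        for state, weight in enumerate(weights):
            if state < k:       # write a zero: run of zeros grows by one
                updated[state + 1] += weight
            if state >= d:      # write a one: run of zeros resets
                updated[0] += weight
        total = sum(updated)
        growth = total / sum(weights)
        weights = [value / total for value in updated]  # renormalise
    return math.log2(growth)


if __name__ == "__main__":
    # Roughly 0.5515 for (1, 3); Miller code operates at rate 1/2.
    print(round(rll_capacity(1, 3), 4))
```

Comparing this single-track bound with the rate of a multi-track code gives a sense of how much is gained by satisfying the k constraint across tracks instead of along each one.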