5 research outputs found

    Improved constructions of permutation and multi-permutation codes correcting a burst of stable deletions

    Full text link
    Permutation codes and multi-permutation codes have been widely considered due to their various applications, especially in flash memory. In this paper, we consider permutation codes and multi-permutation codes against a burst of stable deletions. In particular, we propose a construction of permutation codes correcting a burst stable deletion of length ss, with redundancy logā”n+2logā”logā”n+O(1)\log n+ 2\log \log n+O(1). Compared to the previous known results, our improvement relies on a different strategy to retrieve the missing symbol on the first row of the array representation of a permutation. We also generalize our constructions for multi-permutations and the variable length burst model. Furthermore, we propose a linear-time encoder with optimal redundancy for single stable deletion correcting permutation codes.Comment: Accepted for publication in IEEE Trans. Inf. Theor

    Codes Correcting Erasures and Deletions for Rank Modulation

    No full text
    Error-correcting codes for permutations have received considerable attention in the past few years, especially in applications of the rank modulation scheme for flash memories. While codes over several metrics have been studied, such as the Kendall Ļ„, Ulam, and Hamming distances, no recent research has been carried out for erasures and deletions over permutations. In rank modulation, flash memory cells represent a permutation, which is induced by their relative charge levels. We explore problems that arise when some of the cells are either erased or deleted. In each case, we study how these erasures and deletions affect the information carried by the remaining cells. In particular, we study models that are symbol-invariant, where unaffected elements do not change their corresponding values from those in the original permutation, or permutation-invariant, where the remaining symbols are modified to form a new permutation with fewer elements. Our main approach in tackling these problems is to build upon the existing works of error-correcting codes and leverage them in order to construct codes in each model of deletions and erasures. The codes we develop are in certain cases asymptotically optimal, while in other cases, such as for codes in the Ulam distance, improve upon the state of the art results

    Codes correcting erasures and deletions for rank modulation

    No full text

    DNAā€“based data storage system

    Get PDF
    Despite the many advances in traditional data recording techniques, the surge of Big Data platforms and energy conservation issues has imposed new challenges to the storage community in terms of identifying extremely high volume, non-volatile and durable recording media. The potential for using macromolecules for ultra-dense storage was recognized as early as 1959 when Richard Feynman outlined his vision for nanotechnology in a lecture, ā€œThere is plenty of room at the bottomā€. Among known macromolecules, DNA is unique insofar as it lends itself to implementations of non-volatile recording media of outstanding integrity and extremely high storage capacity. The basic system implementation steps for DNA-based data storage systems include synthesizing DNA strings that contain user information and subsequently retrieving them via high-throughput sequencing technologies. Existing architectures enable reading and writing but do not offer random-access and error-free data recovery from low-cost, portable devices, which is crucial for making the storage technology competitive with classical recorders. In this work we advance the field of macromolecular data storage in three directions. First, we introduce the notion of weakly mutually uncorrelated (WMU) sequences. WMU sequences are characterized by the property that no sufficiently long suffix of one sequence is the prefix of the same or another sequence. For this purpose, WMU sequences used for primer design in DNAbased data storage systems are also required to be at large mutual Hamming distance from each other, have balanced compositions of symbols, and avoid primer-dimer byproducts. We derive bounds on the size of WMU and various constrained WMU codes and present a number of constructions for balanced, error-correcting, primer-dimer free WMU codes using Dyck paths, prefixsynchronized and cyclic codes. Second, we describe the first DNA-based storage architecture that enables random access to data blocks and rewriting of information stored at arbitrary locations within the blocks. The newly developed architecture overcomes drawbacks of existing read-only methods that require decoding the whole file in order to read one data fragment. Our system is based on the newly developed WMU coding techniques and accompanying DNA editing methods that ensure data reliability, specificity and sensitivity of access, and at the same time provide exceptionally high data storage capacity. As a proof of concept, we encoded parts of the Wikipedia pages of six universities in the USA, and selected and edited parts of the text written in DNA corresponding to three of these schools. The results suggest that DNA is a versatile media suitable for both ultrahigh density archival and rewritable storage applications. Third, we demonstrate for the first time that a portable, random-access platform may be implemented in practice using nanopore sequencers. Every solution for DNA-based data storage systems so far has exclusively focused on Illumina sequencing devices, but such sequencers are expensive and designed for laboratory use only. Instead, we propose using a new technology, MinIONā€“Oxford Nanoporeā€™s handheld sequencer. Nanopore sequencing is fast and cheap, but it results in reads with high error rates. To deal with this issue, we designed an integrated processing pipeline that encodes data to avoid costly synthesis and sequencing errors, enables random access through addressing, and leverages efficient portable sequencing via new iterative alignment and deletion error-correcting codes. As a proof of concept, we stored and sequenced around 3.6 kB of binary data that includes two compressed images (a Citizen Kane poster and a smiley face emoji), using a portable data storage system, and obtained error-free read-outs
    corecore