
    A construction for balancing non-binary sequences based on gray code prefixes

    Abstract: We introduce a new construction for the balancing of non-binary sequences that makes use of Gray codes for prefix coding. Our construction provides full encoding and decoding of sequences, including the prefix. It is based on a generalization of Knuth's parallel balancing approach, which can handle very long information sequences. However, the overall sequence, composed of the information sequence together with the prefix, must be balanced; this is reminiscent of Knuth's serial algorithm. The encoding does not make use of lookup tables, and the decoding process is simple and can be done in parallel.
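
    The binary starting point named here, Knuth's parallel balancing approach, rests on the fact that for any even-length binary word there is at least one index k such that complementing the first k bits yields a balanced word; the prefix only has to communicate k. A minimal sketch of that binary idea (the construction in the abstract generalizes it to non-binary alphabets; function names are illustrative, not from the paper):

```python
def knuth_balance(bits):
    """Find an index k such that complementing the first k bits of an
    even-length binary word equalizes the number of 1s and 0s.
    Returns (k, balanced_word); such a k always exists (Knuth, 1986)."""
    n = len(bits)
    assert n % 2 == 0, "the argument assumes an even-length word"
    for k in range(n + 1):
        candidate = [1 - b for b in bits[:k]] + list(bits[k:])
        if sum(candidate) == n // 2:
            return k, candidate
    raise ValueError("unreachable for even-length binary input")

def knuth_restore(k, balanced):
    """Invert the encoding by complementing the first k bits again."""
    return [1 - b for b in balanced[:k]] + list(balanced[k:])

word = [1, 1, 1, 0, 1, 1, 0, 1]          # weight 6 out of 8
k, bal = knuth_balance(word)
assert sum(bal) == len(bal) // 2 and knuth_restore(k, bal) == word
```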

    Improving the redundancy of Knuth's balancing scheme for packet transmission systems

    A simple scheme was proposed by Knuth to generate binary balanced codewords from any information word. However, this method is limited in the sense that its redundancy is twice that of the full sets of balanced codes. The gap between the redundancy of Knuth's algorithm and that of the full sets of balanced codes is considerable, and this paper attempts to reduce it. Furthermore, many constructions assume that full balancing can be performed without showing the steps; full balancing refers to the overall balancing of the encoded information together with the prefix. We propose an efficient way to perform a full balancing scheme that does not make use of lookup tables or enumerative coding.
    Comment: 11 pages, 4 figures, journal article submitted to the Turkish Journal of Electrical and Computer Science
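
    The "twice the redundancy" statement can be checked numerically: the minimal redundancy of length-n balanced binary codes is n - log2 C(n, n/2), roughly (1/2) log2 n bits, while Knuth's prefix describing the balancing index costs about log2(n + 1) bits (ignoring the extra cost of balancing the prefix itself). A rough illustration of that asymptotic gap, not of the paper's improved construction:

```python
import math

def minimal_redundancy(n):
    """Redundancy of the full set of balanced length-n binary words:
    n - log2(number of balanced words)."""
    return n - math.log2(math.comb(n, n // 2))

def knuth_prefix_redundancy(n):
    """Rough cost of Knuth's scheme: a prefix describing the balancing
    index k in {0, ..., n} takes about log2(n + 1) bits."""
    return math.log2(n + 1)

for n in (64, 256, 1024):
    r_min = minimal_redundancy(n)
    r_knuth = knuth_prefix_redundancy(n)
    print(f"n={n:5d}  minimal≈{r_min:5.2f} bits  Knuth≈{r_knuth:5.2f} bits  "
          f"ratio≈{r_knuth / r_min:4.2f}")
```

    For these lengths the ratio sits close to 2, which is the gap the paper targets.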

    Encoding and decoding of balanced q-ary sequences using a Gray code prefix

    Abstract: Balancing sequences over a non-binary alphabet is considered, where the algebraic sum of the components (also known as the weight) is equal to some specific value. Various schemes based on Knuth's simple binary balancing algorithm have been proposed. However, these have mostly assumed that the prefix describing the balancing point in the algorithm can easily be encoded. In this paper we show how non-binary Gray codes can be used to generate these prefixes. Together with a non-binary balancing algorithm, this forms a complete balancing system with straightforward and efficient encoding/decoding.
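
    The Gray-code ingredient mentioned here is a q-ary Gray code: an ordering of all q-ary words of a given length in which consecutive words differ in exactly one digit. A small sketch of one standard modular construction (illustrating only the Gray-code prefix idea, not the paper's full encoder):

```python
def qary_gray(n, q):
    """Generate all q-ary words of length n in a modular Gray-code order:
    consecutive words differ in exactly one digit position."""
    for value in range(q ** n):
        digits = []
        for _ in range(n):                 # base-q digits, least significant first
            digits.append(value % q)
            value //= q
        digits.reverse()                   # most significant digit first
        gray = [digits[0]] + [(digits[i] - digits[i - 1]) % q for i in range(1, n)]
        yield gray

# Sanity check: every word appears once, and neighbours differ in one digit.
words = list(qary_gray(3, 4))
assert len(words) == 4 ** 3 == len({tuple(w) for w in words})
for a, b in zip(words, words[1:]):
    assert sum(x != y for x, y in zip(a, b)) == 1
```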

    Construction of efficient q-ary balanced codes

    Abstract: Knuth proposed a simple scheme for balancing codewords, which was later extended to generate q-ary balanced codewords. The redundancy of existing schemes for balancing q-ary sequences is larger than that of the full balanced set, which is the minimum achievable redundancy. In this article, we present a simple and efficient method to encode the prefix that results in less redundancy for the construction of q-ary balanced codewords.
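
    For q-ary sequences of length n, "balanced" is commonly taken to mean that the symbol sum equals n(q-1)/2, and the minimum achievable redundancy referred to above is n - log_q(size of the full balanced set), measured in q-ary symbols. A small counting sketch under that common definition (illustrative only; it does not reproduce the paper's encoder):

```python
import math

def count_balanced(n, q):
    """Number of q-ary words of length n whose symbol sum is n*(q-1)/2,
    computed with a simple dynamic program over partial sums."""
    target, rem = divmod(n * (q - 1), 2)
    if rem:                      # no exactly balanced words if n*(q-1) is odd
        return 0
    counts = [1] + [0] * target  # counts[s] = words built so far with sum s
    for _ in range(n):
        new = [0] * (target + 1)
        for s, c in enumerate(counts):
            if c:
                for symbol in range(q):
                    if s + symbol <= target:
                        new[s + symbol] += c
        counts = new
    return counts[target]

n, q = 10, 4
full_set = count_balanced(n, q)
min_redundancy = n - math.log(full_set, q)   # in q-ary symbols
print(f"balanced {q}-ary words of length {n}: {full_set}, "
      f"minimal redundancy ≈ {min_redundancy:.2f} symbols")
```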

    Binary balanced codes approaching capacity

    Abstract: In this paper, the construction of binary balanced codes is revisited. Binary balanced codes refer to sets of bipolar codewords in which the number of “1”s in each codeword equals the number of “0”s. The first algorithm for balancing codes was proposed by Knuth in 1986; however, its redundancy is almost twice that of the full set of balanced codewords. We present an efficient and simple construction with a redundancy approaching the minimum achievable one.

    On the improvement of Knuth's redundancy algorithm for balancing codes

    Abstract: A simple scheme was proposed by Knuth to generate balanced codewords from a random binary information sequence. However, this method has a redundancy that is twice that of the full sets of balanced codewords, which is the minimum achievable redundancy. The gap between the redundancy generated by Knuth's algorithm and the minimum one is considerable and can be reduced. This paper attempts to achieve this goal through a method based on information sequence candidates.

    DNA-based data storage system

    Despite the many advances in traditional data recording techniques, the surge of Big Data platforms and energy conservation issues have imposed new challenges on the storage community in terms of identifying extremely high-volume, non-volatile and durable recording media. The potential for using macromolecules for ultra-dense storage was recognized as early as 1959, when Richard Feynman outlined his vision for nanotechnology in the lecture "There's Plenty of Room at the Bottom". Among known macromolecules, DNA is unique insofar as it lends itself to implementations of non-volatile recording media of outstanding integrity and extremely high storage capacity. The basic implementation steps for DNA-based data storage systems include synthesizing DNA strings that contain user information and subsequently retrieving them via high-throughput sequencing technologies. Existing architectures enable reading and writing but do not offer random access or error-free data recovery from low-cost, portable devices, which is crucial for making the storage technology competitive with classical recorders. In this work we advance the field of macromolecular data storage in three directions.

    First, we introduce the notion of weakly mutually uncorrelated (WMU) sequences. WMU sequences are characterized by the property that no sufficiently long suffix of one sequence is the prefix of the same or another sequence. In addition, WMU sequences used for primer design in DNA-based data storage systems are required to be at large mutual Hamming distance from each other, have balanced compositions of symbols, and avoid primer-dimer byproducts. We derive bounds on the size of WMU and various constrained WMU codes and present a number of constructions for balanced, error-correcting, primer-dimer-free WMU codes using Dyck paths, prefix-synchronized codes and cyclic codes.

    Second, we describe the first DNA-based storage architecture that enables random access to data blocks and rewriting of information stored at arbitrary locations within the blocks. The newly developed architecture overcomes drawbacks of existing read-only methods that require decoding the whole file in order to read one data fragment. Our system is based on the newly developed WMU coding techniques and accompanying DNA editing methods that ensure data reliability, specificity and sensitivity of access, while at the same time providing exceptionally high data storage capacity. As a proof of concept, we encoded parts of the Wikipedia pages of six universities in the USA, and selected and edited parts of the text written in DNA corresponding to three of these schools. The results suggest that DNA is a versatile medium suitable for both ultra-high-density archival and rewritable storage applications.

    Third, we demonstrate for the first time that a portable, random-access platform may be implemented in practice using nanopore sequencers. Every solution for DNA-based data storage proposed so far has focused exclusively on Illumina sequencing devices, but such sequencers are expensive and designed for laboratory use only. Instead, we propose using a new technology, MinION, Oxford Nanopore's handheld sequencer. Nanopore sequencing is fast and cheap, but it results in reads with high error rates. To deal with this issue, we designed an integrated processing pipeline that encodes data to avoid costly synthesis and sequencing errors, enables random access through addressing, and leverages efficient portable sequencing via new iterative alignment and deletion error-correcting codes. As a proof of concept, we stored and sequenced around 3.6 kB of binary data, including two compressed images (a Citizen Kane poster and a smiley face emoji), using a portable data storage system, and obtained error-free read-outs.
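
    The defining WMU property stated above is easy to check directly: for a chosen length threshold k, no proper suffix of length at least k of any sequence may appear as a prefix of the same or another sequence. A small checker over DNA strings (the threshold convention and parameter names are illustrative; the thesis's exact definitions and constructions differ):

```python
def is_weakly_mutually_uncorrelated(seqs, k):
    """Return True if no proper suffix of length >= k of any sequence in
    `seqs` is a prefix of the same or another sequence (the WMU property)."""
    for s in seqs:
        for length in range(k, len(s)):      # proper suffixes only
            suffix = s[-length:]
            for t in seqs:
                if t.startswith(suffix):
                    return False
    return True

# Toy example over the DNA alphabet (not a codebook from the thesis).
codebook = ["ACGTGC", "TTAGCA", "CCATGG"]
print(is_weakly_mutually_uncorrelated(codebook, k=3))   # True for this toy set
```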