
    Linear constructions for DNA codes

    Abstract: In this paper we translate in terms of coding theory constraints that are used in designing DNA codes for use in DNA computing or as bar-codes in chemical libraries. We propose new constructions for DNA codes satisfying either a reverse-complement constraint, a GC-content constraint, or both, that are derived from additive and linear codes over four-letter alphabets. We focus in particular on codes over GF(4), and we construct new DNA codes that are in many cases better (sometimes far better) than previously known codes. We provide updated tables up to length 20 that include these codes as well as new codes constructed using a combination of lexicographic techniques and stochastic search.

    Linear and nonlinear constructions of DNA codes with Hamming distance d, constant GC-content and a reverse-complement constraint

    Abstract: In a previous paper, the authors used cyclic and extended cyclic constructions to obtain codes over an alphabet {A,C,G,T} satisfying a Hamming distance constraint and a GC-content constraint. These codes are applicable to the design of synthetic DNA strands used in DNA microarrays, as DNA tags in chemical libraries and in DNA computing. The GC-content constraint specifies that a fixed number of positions are G or C in each codeword, which ensures uniform melting temperatures. The Hamming distance constraint is a step towards avoiding unwanted hybridizations. This approach extended the pioneering work of Gaborit and King. In the current paper, another constraint known as a reverse-complement constraint is added to further prevent unwanted hybridizations. Many new best codes are obtained, and are reproducible from the information presented here. The reverse-complement constraint is handled by searching for an involution with 0 or 1 fixed points, as first done by Gaborit and King. Linear codes and additive codes over GF(4) and their cosets are considered, as well as shortenings of these codes. In the additive case, codes obtained from two different mappings from GF(4) to {A,C,G,T} are considered.

    Mutually Uncorrelated Primers for DNA-Based Data Storage

    We introduce the notion of weakly mutually uncorrelated (WMU) sequences, motivated by applications in DNA-based data storage systems and for synchronization of communication devices. WMU sequences are characterized by the property that no sufficiently long suffix of one sequence is the prefix of the same or another sequence. WMU sequences used for primer design in DNA-based data storage systems are also required to be at large mutual Hamming distance from each other, have balanced compositions of symbols, and avoid primer-dimer byproducts. We derive bounds on the size of WMU and various constrained WMU codes and present a number of constructions for balanced, error-correcting, primer-dimer free WMU codes using Dyck paths, prefix-synchronized and cyclic codes.
    Comment: 14 pages, 3 figures, 1 table. arXiv admin note: text overlap with arXiv:1601.0817
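The suffix/prefix property described in this abstract can be checked by brute force. The following is only a minimal sketch under stated assumptions: the function name `is_wmu` and the threshold parameter `kappa` (the length from which an overlap counts as "sufficiently long") are hypothetical names chosen for illustration, all sequences are assumed to have the same length, and only proper suffixes (shorter than the full sequence) are compared.

```python
def is_wmu(code, kappa):
    """Return True if no proper suffix of length >= kappa of any sequence
    equals a prefix of the same or another sequence.

    `kappa` is an illustrative name for the overlap threshold; sequences
    are assumed to share one common length.
    """
    if kappa < 1:
        raise ValueError("kappa must be at least 1")
    for x in code:
        for y in code:  # y may be x itself ("the same or another sequence")
            # Overlap lengths from kappa up to length - 1 (proper suffixes only).
            for length in range(kappa, min(len(x), len(y))):
                if x[-length:] == y[:length]:
                    return False
    return True


if __name__ == "__main__":
    code = ["AACG", "TTCA"]
    print(is_wmu(code, 1))  # the suffix "A" of TTCA is a prefix of AACG
    print(is_wmu(code, 2))  # no overlap of length 2 or 3 exists
```

The check costs on the order of |code|² · n² character comparisons for sequences of length n, which is acceptable for small primer sets but is not meant as an efficient construction.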

    Asymmetric Lee Distance Codes for DNA-Based Storage

    We consider a new family of codes, termed asymmetric Lee distance codes, that arise in the design and implementation of DNA-based storage systems and systems with parallel string transmission protocols. The codewords are defined over a quaternary alphabet, although the results carry over to other alphabet sizes; furthermore, symbol confusability is dictated by their underlying binary representation. Our contributions are two-fold. First, we demonstrate that the new distance represents a linear combination of the Lee and Hamming distance and derive upper bounds on the size of the codes under this metric based on linear programming techniques. Second, we propose a number of code constructions which imply lower bounds.

    Bounds for DNA codes with constant GC-content

    We derive theoretical upper and lower bounds on the maximum size of DNA codes of length n with constant GC-content w and minimum Hamming distance d, both with and without the additional constraint that the minimum Hamming distance between any codeword and the reverse-complement of any codeword be at least d. We also explicitly construct codes that are larger than the best previously-published codes for many choices of the parameters n, d and w.
    Comment: 13 pages, no figures; a few references added and typos corrected
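The three constraints named in this abstract (constant GC-content w, minimum Hamming distance d, and distance at least d between any codeword and the reverse-complement of any codeword) can be verified directly for a candidate code. The sketch below is an illustration only: the helper names are hypothetical, and the reverse-complement check is assumed to include each codeword against its own reverse-complement.

```python
from itertools import combinations

# Watson-Crick complement pairs.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def hamming(x, y):
    """Number of positions where two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, y))


def reverse_complement(x):
    """Reverse the strand, then complement every base."""
    return "".join(COMPLEMENT[base] for base in reversed(x))


def gc_content(x):
    """Number of positions holding G or C."""
    return sum(base in "GC" for base in x)


def satisfies_constraints(code, d, w):
    """Check GC-content w, pairwise distance >= d, and the
    reverse-complement distance >= d (including x against rc(x))."""
    if any(gc_content(x) != w for x in code):
        return False
    if any(hamming(x, y) < d for x, y in combinations(code, 2)):
        return False
    return all(hamming(x, reverse_complement(y)) >= d
               for x in code for y in code)


if __name__ == "__main__":
    print(satisfies_constraints(["AACC", "ACAC"], d=2, w=2))
    # "ACGT" equals its own reverse-complement, so the third check fails.
    print(satisfies_constraints(["ACGT"], d=1, w=2))
```

This is a verifier, not a construction: it confirms that a given set of words is a valid code for the parameters n, d and w, but says nothing about how large such a code can be.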

    Improved Lower Bounds for Constant GC-Content DNA Codes

    The design of large libraries of oligonucleotides having constant GC-content and satisfying Hamming distance constraints between oligonucleotides and their Watson-Crick complements is important in reducing hybridization errors in DNA computing, DNA microarray technologies, and molecular bar coding. Various techniques have been studied for the construction of such oligonucleotide libraries, ranging from algorithmic constructions via stochastic local search to theoretical constructions via coding theory. We introduce a new stochastic local search method which yields improvements up to more than one third of the benchmark lower bounds of Gaborit and King (2005) for n-mer oligonucleotide libraries when n <= 14. We also found several optimal libraries by computing maximum cliques on certain graphs.
    Comment: 4 pages

    Efficient Two-Stage Group Testing Algorithms for Genetic Screening

    Efficient two-stage group testing algorithms that are particularly suited for rapid and less-expensive DNA library screening and other large scale biological group testing efforts are investigated in this paper. The main focus is on novel combinatorial constructions in order to minimize the number of individual tests at the second stage of a two-stage disjunctive testing procedure. Building on recent work by Levenshtein (2003) and Tonchev (2008), several new infinite classes of such combinatorial designs are presented.
    Comment: 14 pages; to appear in "Algorithmica". Part of this work has been presented at the ICALP 2011 Group Testing Workshop; arXiv:1106.368