Linear constructions for DNA codes

Authors: Gaborit Philippe; King Oliver D.
Publication venue: Elsevier B.V.
Publication date: 01/01/2005

Abstract: In this paper we translate in terms of coding theory constraints that are used in designing DNA codes for use in DNA computing or as bar-codes in chemical libraries. We propose new constructions for DNA codes satisfying either a reverse-complement constraint, a GC-content constraint, or both, that are derived from additive and linear codes over four-letter alphabets. We focus in particular on codes over GF(4), and we construct new DNA codes that are in many cases better (sometimes far better) than previously known codes. We provide updated tables up to length 20 that include these codes as well as new codes constructed using a combination of lexicographic techniques and stochastic search.

Elsevier - Publisher Connector

HAL-UNILIM

Linear and nonlinear constructions of DNA codes with Hamming distance d, constant GC-content and a reverse-complement constraint

Authors: Aboluion Niema; Perkins Stephanie; Smith Derek H.
Publication venue: Elsevier B.V.

Abstract: In a previous paper, the authors used cyclic and extended cyclic constructions to obtain codes over an alphabet {A,C,G,T} satisfying a Hamming distance constraint and a GC-content constraint. These codes are applicable to the design of synthetic DNA strands used in DNA microarrays, as DNA tags in chemical libraries and in DNA computing. The GC-content constraint specifies that a fixed number of positions are G or C in each codeword, which ensures uniform melting temperatures. The Hamming distance constraint is a step towards avoiding unwanted hybridizations. This approach extended the pioneering work of Gaborit and King. In the current paper, another constraint known as a reverse-complement constraint is added to further prevent unwanted hybridizations. Many new best codes are obtained, and are reproducible from the information presented here. The reverse-complement constraint is handled by searching for an involution with 0 or 1 fixed points, as first done by Gaborit and King. Linear codes and additive codes over GF(4) and their cosets are considered, as well as shortenings of these codes. In the additive case, codes obtained from two different mappings from GF(4) to {A,C,G,T} are considered.

Elsevier - Publisher Connector
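
The three constraints named in the abstract above (minimum Hamming distance d, constant GC-content w, and the reverse-complement constraint) can be made concrete with a small checker. The Python below is a minimal sketch, not the authors' construction: the function names are illustrative assumptions, and the reverse-complement condition is taken in the common form that every codeword lies at Hamming distance at least d from the reverse-complement of every codeword, itself included.

```python
from itertools import combinations

# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def hamming(x, y):
    """Number of positions at which two equal-length words differ."""
    if len(x) != len(y):
        raise ValueError("words must have equal length")
    return sum(a != b for a, b in zip(x, y))


def reverse_complement(x):
    """Reverse the word, then complement every base."""
    return "".join(COMPLEMENT[base] for base in reversed(x))


def gc_content(x):
    """Number of positions holding G or C."""
    return sum(base in "GC" for base in x)


def check_dna_code(code, d, w):
    """Report which of the three constraints a candidate code satisfies.

    `code` is a list of distinct words over {A, C, G, T} (hypothetical input).
    """
    return {
        "hamming": all(hamming(x, y) >= d for x, y in combinations(code, 2)),
        "gc_content": all(gc_content(x) == w for x in code),
        "reverse_complement": all(
            hamming(x, reverse_complement(y)) >= d for x in code for y in code
        ),
    }
```

For instance, `check_dna_code(["AACC", "CCAA"], d=4, w=2)` reports all three constraints as satisfied, whereas the single word `ACGT` equals its own reverse-complement and so fails the reverse-complement check for any d of at least 1.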

Mutually Uncorrelated Primers for DNA-Based Data Storage

Authors: Gabrys Ryan; Kiah Han Mao; Milenkovic Olgica; Yazdi S. M. Hossein Tabatabaei
Publication date: 13/09/2017

We introduce the notion of weakly mutually uncorrelated (WMU) sequences, motivated by applications in DNA-based data storage systems and for synchronization of communication devices. WMU sequences are characterized by the property that no sufficiently long suffix of one sequence is the prefix of the same or another sequence. WMU sequences used for primer design in DNA-based data storage systems are also required to be at large mutual Hamming distance from each other, have balanced compositions of symbols, and avoid primer-dimer byproducts. We derive bounds on the size of WMU and various constrained WMU codes and present a number of constructions for balanced, error-correcting, primer-dimer free WMU codes using Dyck paths, prefix-synchronized and cyclic codes. Comment: 14 pages, 3 figures, 1 table. arXiv admin note: text overlap with arXiv:1601.0817

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)
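
The WMU property described in the abstract above can be checked by brute force. The following Python is a minimal sketch under stated assumptions: the threshold name `kappa` stands in for "sufficiently long", only proper overlaps (shorter than the full word) are compared so that the self-comparison is meaningful, and `is_wmu` is a hypothetical helper rather than anything taken from the paper.

```python
def is_wmu(code, kappa):
    """Return True if no suffix of length >= kappa (and shorter than the
    full word) of any word in `code` equals a prefix of the same or
    another word in `code`."""
    if kappa < 1:
        raise ValueError("kappa must be at least 1")
    for a in code:
        for b in code:  # includes the case a == b
            for length in range(kappa, min(len(a), len(b))):
                if a[-length:] == b[:length]:
                    return False
    return True
```

With this definition, `is_wmu(["AAC", "ACC"], 1)` is false because the suffix `AC` of the first word is a prefix of the second, while `is_wmu(["AAC"], 1)` is true. The Hamming-distance, balance, and primer-dimer requirements mentioned in the abstract are not covered by this sketch.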

Asymmetric Lee Distance Codes for DNA-Based Storage

Authors: Gabrys Ryan; Kiah Han Mao; Milenkovic Olgica
Publication date: 14/12/2016

We consider a new family of codes, termed asymmetric Lee distance codes, that arise in the design and implementation of DNA-based storage systems and systems with parallel string transmission protocols. The codewords are defined over a quaternary alphabet, although the results carry over to other alphabet sizes; furthermore, symbol confusability is dictated by their underlying binary representation. Our contributions are two-fold. First, we demonstrate that the new distance represents a linear combination of the Lee and Hamming distance and derive upper bounds on the size of the codes under this metric based on linear programming techniques. Second, we propose a number of code constructions which imply lower bounds.

arXiv.org e-Print Archive

CiteSeerX

Bounds for DNA codes with constant GC-content

Author: King Oliver D.
Publication date: 01/01/2003

We derive theoretical upper and lower bounds on the maximum size of DNA codes of length n with constant GC-content w and minimum Hamming distance d, both with and without the additional constraint that the minimum Hamming distance between any codeword and the reverse-complement of any codeword be at least d. We also explicitly construct codes that are larger than the best previously-published codes for many choices of the parameters n, d and w. Comment: 13 pages, no figures; a few references added and typos corrected

arXiv.org e-Print Archive

CiteSeerX

Improved Lower Bounds for Constant GC-Content DNA Codes

Authors: Chee Yeow Meng; Ling San
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2008

The design of large libraries of oligonucleotides having constant GC-content and satisfying Hamming distance constraints between oligonucleotides and their Watson-Crick complements is important in reducing hybridization errors in DNA computing, DNA microarray technologies, and molecular bar coding. Various techniques have been studied for the construction of such oligonucleotide libraries, ranging from algorithmic constructions via stochastic local search to theoretical constructions via coding theory. We introduce a new stochastic local search method which yields improvements up to more than one third of the benchmark lower bounds of Gaborit and King (2005) for n-mer oligonucleotide libraries when n <= 14. We also found several optimal libraries by computing maximum cliques on certain graphs. Comment: 4 pages

arXiv.org e-Print Archive

CiteSeerX

DR-NTU (Digital Repository of NTU)

Efficient Two-Stage Group Testing Algorithms for Genetic Screening

Author: Huber Michael
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Efficient two-stage group testing algorithms that are particularly suited for rapid and less-expensive DNA library screening and other large scale biological group testing efforts are investigated in this paper. The main focus is on novel combinatorial constructions in order to minimize the number of individual tests at the second stage of a two-stage disjunctive testing procedure. Building on recent work by Levenshtein (2003) and Tonchev (2008), several new infinite classes of such combinatorial designs are presented. Comment: 14 pages; to appear in "Algorithmica". Part of this work has been presented at the ICALP 2011 Group Testing Workshop; arXiv:1106.368

arXiv.org e-Print Archive

Publikationsserver der Universität Tübingen
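
The two-stage disjunctive procedure mentioned in the abstract above can be illustrated with a generic simulation. This Python is a minimal sketch of the standard two-stage scheme only, assuming error-free tests in which a pool is positive exactly when it contains at least one positive item; the pool layout is a hypothetical example and is not one of the combinatorial designs constructed in the paper.

```python
def two_stage_test(pools, positives, n_items):
    """Simulate two-stage disjunctive group testing.

    pools     : list of lists of item indices tested together in stage 1
    positives : set of truly positive item indices (unknown in practice,
                used here only to simulate test outcomes)
    n_items   : total number of items, indexed 0 .. n_items - 1

    Returns (identified positives, number of individual stage-2 tests).
    """
    # Stage 1: every item in a negative pool is cleared.
    cleared = set()
    for pool in pools:
        if not (set(pool) & positives):
            cleared.update(pool)

    # Stage 2: test each remaining candidate individually.
    candidates = [i for i in range(n_items) if i not in cleared]
    identified = {i for i in candidates if i in positives}
    return identified, len(candidates)
```

With six items, pools `[[0, 1, 2], [3, 4, 5], [0, 3], [1, 4], [2, 5]]`, and item 4 as the only positive, stage 1 clears every other item and a single individual test remains. The number of stage-2 tests depends on how the pools are chosen, which is the quantity the abstract says the constructions aim to minimize.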