3 research outputs found

    Genomic Methods for Bacterial Infection Identification

    Hospital-acquired infections (HAIs) have high mortality rates around the world and are a challenge to medical science due to rapid mutation rates in their pathogens. A new methodology is proposed to identify bacterial species causing HAIs based on sets of universal biomarkers for next-generation microarray designs (i.e., nxh chips), rather than a priori selections of biomarkers. This method allows arbitrary organisms to be classified based on readouts of their DNA sequences, including whole genomes. The underlying models are based on the biochemistry of DNA, unlike traditional edit-distance based alignments. Furthermore, the methodology is fairly robust to genetic mutations, which would otherwise be likely to reduce accuracy. Standard machine learning methods (neural networks, self-organizing maps, and random forests) produce results for identifying HAIs on nxh chips that are competitive with, if not superior to, current standards in the field. The potential feasibility of translating these techniques to a clinical test is also discussed.

    Genomic Reconstruction of the Tree of Life

    A new methodology is presented for molecular phylogenetic analysis addressing a fundamental problem in biology, namely the reconstruction of the Tree of Life (TOL). Here, phylogenies are based on patterns of hybridization similarity in the organisms' DNA. Furthermore, phylogenies are based on a set of universal biomarkers (so-called nxh chips) chosen a priori, independently of the target group of organisms. Therefore, this methodology enables analyses of groups with biologically distant organisms, and hence could be scaled to obtain a universal tree of life. Unlike conventional molecular methods, it produces a hypothesis in a single run, without optimizing across numerous hypotheses for consensus. Prototype hypotheses agree with the biological ground truth in over 70% of the relationships. Higher quality nxh chips are likely to produce better hypotheses, but are more difficult to design.

    A geometric approach to Gibbs energy landscapes and optimal DNA codeword design

    Finding a large set of single DNA strands that do not crosshybridize to themselves or to their complements (so-called domains in the language of chemical reaction networks (CRNs)) is an important problem in DNA computing, self-assembly, DNA memories and phylogenetic analyses because of their error correction and prevention properties. In prior work, we have provided a theoretical framework to analyze this problem and showed that Codeword Design is NP-complete using any single reasonable metric that approximates the Gibbs energy, thus practically excluding the possibility of finding any procedure to find maximal sets exactly and efficiently. In this framework, codeword design is reduced to finding large sets of strands maximally separated in DNA spaces and, therefore, the size of such sets depends on the geometry of these spaces. Here, we introduce a new general technique to embed them in Euclidean spaces in such a way that oligos with high/low hybridization affinity are mapped to neighboring/remote points in a geometric lattice, respectively. This embedding materializes long-held metaphors about codeword design in terms of sphere packing and leads to designs that are in some cases known to be provably nearly optimal for some oligo sizes. It also leads to upper and lower bounds on estimates of the size of optimal codes for oligos of up to 32-mers, as well as to infinite families of DNA strand lengths, based on estimates of the kissing (or contact) number for sphere packings in Euclidean spaces. Conversely, we show how solutions to DNA codeword design obtained by experimental or other means can also provide solutions to difficult spherical packing geometric problems via this embedding. Finally, the reduction suggests an analytical tool to arrange the dynamics of strand displacement cascades in CRNs to effect the transformation through bounded Gibbs energy changes, and thus is potentially useful in compilers for wet tube implementation of biomolecular programs.
© 2012 Springer-Verlag
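
    The codeword design problem described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes plain Hamming distance as a crude stand-in for the Gibbs-energy-based metric, and it uses a simple greedy search rather than the Euclidean embedding. The function names (`reverse_complement`, `hamming`, `is_compatible`, `greedy_code`) and the `min_dist` threshold are hypothetical, chosen for illustration only.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the Watson-Crick reverse complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hamming(a, b):
    """Number of mismatched positions between two equal-length strands."""
    return sum(x != y for x, y in zip(a, b))


def is_compatible(candidate, code, min_dist):
    """Check a candidate against the code under the ASSUMED Hamming proxy.

    A small distance to another word's reverse complement is taken here
    as a sign of likely crosshybridization; this is an illustrative
    assumption, not the metric used in the paper.
    """
    # The candidate must not resemble its own reverse complement.
    if hamming(candidate, reverse_complement(candidate)) < min_dist:
        return False
    for word in code:
        # Keep codewords well separated from each other ...
        if hamming(candidate, word) < min_dist:
            return False
        # ... and from each other's reverse complements.
        if hamming(candidate, reverse_complement(word)) < min_dist:
            return False
    return True


def greedy_code(length, min_dist):
    """Greedily collect strands of the given length that stay separated.

    Greedy selection gives a valid set but not necessarily a maximal one,
    consistent with the hardness result mentioned in the abstract.
    """
    code = []
    for bases in product("ACGT", repeat=length):
        candidate = "".join(bases)
        if is_compatible(candidate, code, min_dist):
            code.append(candidate)
    return code


if __name__ == "__main__":
    for word in greedy_code(length=4, min_dist=3):
        print(word)
```

    Exhaustive enumeration is only practical for very short strands, since the number of candidates grows as 4 to the power of the strand length.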