831 research outputs found

    DNA Molecule Classification Using Feature Primitives

    BACKGROUND: We present a novel strategy for classification of DNA molecules using measurements from an alpha-hemolysin channel detector. The proposed approach provides excellent classification performance for five different DNA hairpins that differ in only one base-pair. For multi-class DNA classification problems, practitioners usually adopt approaches that use decision trees consisting of binary classifiers. Finding the best tree topology requires exploring all possible tree topologies and is computationally prohibitive. We propose a computational framework based on feature primitives that eliminates the need for a decision tree of binary classifiers. In the first phase, we generate a pool of weak features from nanopore blockade current measurements by using HMM analysis, principal component analysis and various wavelet filters. In the next phase, feature selection is performed using AdaBoost. AdaBoost provides an ensemble of weak learners of various types learned from feature primitives. RESULTS AND CONCLUSION: We show that our technique, despite its inherent simplicity, provides performance comparable to recent multi-class DNA molecule classification results. Unlike the approach presented by Winters-Hilt et al., where weaker data is dropped to obtain better classification, the proposed approach provides comparable classification accuracy without any need to reject weak data. A weakness of this approach, on the other hand, is the very "hands-on" tuning and feature selection required to obtain good generalization. Simply put, this method obtains a more informed set of features and provides better results for that reason. The strength of this approach appears to be its ability to identify strong features, an area where further results are actively being sought.
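
The second phase described above (AdaBoost acting as a feature selector over a pool of weak features) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a binary (two-class) AdaBoost over single-feature threshold stumps, written in plain Python, and the toy data and the names `best_stump`, `adaboost`, and `feature_votes` are assumptions made for illustration. The abstract's setting is multi-class and uses HMM, PCA, and wavelet features, none of which are reproduced here.

```python
import math

def stump_predict(row, j, thr, pol):
    """Weak learner: vote `pol` if feature j exceeds thr, else -pol."""
    return pol if row[j] > thr else -pol

def best_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds):
    """Binary AdaBoost with labels in {-1, +1}; returns a list of weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, thr, pol))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in ensemble)
    return 1 if score >= 0 else -1

def feature_votes(ensemble):
    """Total vote weight per feature index: the implicit feature selection."""
    votes = {}
    for alpha, j, _, _ in ensemble:
        votes[j] = votes.get(j, 0.0) + alpha
    return votes

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5], [0.2, 1], [0.3, 4], [0.8, 2], [0.9, 5], [0.7, 1]]
y = [-1, -1, -1, 1, 1, 1]
ens = adaboost(X, y, rounds=3)
print(feature_votes(ens))
```

Features that never appear in a chosen stump receive no vote weight, which is the sense in which boosting doubles as feature selection in this sketch.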

    Analysis of Nanopore Detector Measurements using Machine Learning Methods, with Application to Single-Molecule Kinetics

    At its core, a nanopore detector has a nanometer-scale biological membrane across which a voltage is applied. The voltage draws a DNA molecule into an α-hemolysin channel in the membrane. Consequently, a distinctive channel current blockade signal is created as the molecule flexes and interacts with the channel. This flexing of the molecule is characterized by different blockade levels in the channel current signal. Previous experiments have shown that a nanopore detector is sensitive enough that nearly identical DNA molecules can be classified successfully using machine learning techniques such as Hidden Markov Models and Support Vector Machines in a channel-current-based signal analysis platform [4-9]. In this paper, methods for improving feature extraction are presented, both to improve classification and to provide biologists and chemists with a better understanding of the physical properties of a given molecule.

    Analysis of nanopore detector measurements using Machine-Learning methods, with application to single-molecule kinetic analysis

    BACKGROUND: A nanopore detector has a nanometer-scale trans-membrane channel across which a potential difference is established, resulting in an ionic current through the channel in the pA-nA range. A distinctive channel current blockade signal is created as individually "captured" DNA molecules interact with the channel and modulate the channel's ionic current. The nanopore detector is sensitive enough that nearly identical DNA molecules can be classified with very high accuracy using machine learning techniques such as Hidden Markov Models (HMMs) and Support Vector Machines (SVMs). RESULTS: A non-standard implementation of an HMM, emission inversion, is used for improved classification. Additional features are also considered for the feature vector employed by the SVM for classification: the addition of a single feature representing spike density is shown to notably improve classification results. Another, much larger, feature set expansion was studied (2500 additional features instead of 1), derived by including all of the HMM's transition probabilities. The expanded features can introduce redundant, noisy information (as well as diagnostic information) into the current feature set, and thus degrade classification performance. A hybrid Adaptive Boosting approach was used for feature selection to alleviate this problem. CONCLUSION: The methods shown here, for more informed feature extraction, both improve classification and provide biologists and chemists with tools for obtaining a better understanding of the kinetic properties of molecules of interest.

    The NTD Nanoscope: potential applications and implementations

    BACKGROUND: Nanopore transduction detection (NTD) offers prospects for a number of highly sensitive and discriminative applications, including: (i) single nucleotide polymorphism (SNP) detection; (ii) targeted DNA re-sequencing; (iii) protein isoform assaying; and (iv) biosensing via antibody or aptamer coupled molecules. Nanopore event transduction involves single-molecule biophysics, engineered information flows, and nanopore cheminformatics. The NTD Nanoscope has seen limited use in the scientific community, however, due to a lack of information about potential applications and the limited availability of the device itself. Meta Logos Inc. is developing both pre-packaged device platforms and component-level (unassembled) kit platforms (the latter described here). In both cases a lipid bi-layer workstation is first established, then augmentations and operational protocols are provided to obtain a nanopore transduction detector. In this paper we provide an overview of the NTD Nanoscope applications and implementations. The NTD Nanoscope Kit, in particular, is a component-level reproduction of the standard NTD device used in previous research papers. RESULTS: The NTD Nanoscope method is shown to functionalize a single nanopore with a channel current modulator that is designed to transduce events, such as binding to a specific target. To expedite set-up, calibration, and troubleshooting of the kit components and signal processing software in new lab settings, the NTD Nanoscope Kit is designed to include a set of test buffers and control molecules based on experiments described in previous NTD papers (the model systems briefly described in what follows). Server interfacing for advanced signal processing support is also briefly described. CONCLUSIONS: SNP assaying, SNP discovery, DNA sequencing and RNA-seq methods are typically limited by the error rate of the enzymes involved, as in methods that rely on the polymerase chain reaction (PCR) enzyme. The NTD Nanoscope offers a means to obtain higher accuracy, as it is a single-molecule method that does not inherently involve the use of enzymes, using a functionalized nanopore instead.

    Computation of protein geometry and its applications: Packing and function prediction

    This chapter discusses geometric models of biomolecules and geometric constructs, including the union-of-balls model, the weighted Voronoi diagram, the weighted Delaunay triangulation, and alpha shapes. These geometric constructs enable fast and analytical computation of the shapes of biomolecules (including features such as voids and pockets) and of metric properties (such as area and volume). Algorithms for Delaunay triangulation, computation of voids and pockets, and volume/area computation are also described. In addition, applications in packing analysis of protein structures and protein function prediction are discussed. Comment: 32 pages, 9 figures
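
As a small worked illustration of what "analytical computation" of volume means in a union-of-balls model, the sketch below handles only the simplest case: two overlapping balls, using inclusion-exclusion and the closed-form volume of the sphere-sphere intersection (the lens). It is an assumption-laden toy, not the chapter's algorithm, which relies on the weighted Delaunay triangulation and alpha shapes to handle arbitrarily many atoms. The function names are hypothetical.

```python
import math

def ball_volume(r):
    """Volume of a ball of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3

def lens_volume(r1, r2, d):
    """Volume of the intersection of two balls whose centers are d apart."""
    if d >= r1 + r2:          # disjoint (or touching): no overlap
        return 0.0
    if d <= abs(r1 - r2):     # one ball lies entirely inside the other
        return ball_volume(min(r1, r2))
    # Closed form for two partially overlapping spheres.
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2 * d * (r1 + r2) - 3 * (r1 - r2) ** 2) / (12 * d))

def union_volume(r1, r2, d):
    """Inclusion-exclusion: V(A or B) = V(A) + V(B) - V(A and B)."""
    return ball_volume(r1) + ball_volume(r2) - lens_volume(r1, r2, d)

# Two unit balls with centers one unit apart.
print(union_volume(1.0, 1.0, 1.0))
```

With more than two balls, higher-order overlaps appear in the inclusion-exclusion sum, which is where the alpha-shape machinery mentioned in the abstract becomes useful for deciding which overlaps actually contribute.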

    Methods for Interpreting and Understanding Deep Neural Networks

    This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. It introduces some recently proposed techniques of interpretation, along with theory, tricks and recommendations, to make the most efficient use of these techniques on real data. It also discusses a number of practical applications. Comment: 14 pages, 10 figures

    Immune-Mediated Drug Induced Liver Injury: A Multidisciplinary Approach

    This thesis presents an approach to expose relationships between immune-mediated drug-induced liver injury (IMDILI) and the three-dimensional structural features of toxic drug molecules and their metabolites. The series of analyses tests the hypothesis that drugs which produce similar patterns of toxicity interact with targets within common toxicological pathways, and that activation of the underlying mechanisms depends on structural similarity among toxic molecules. Spontaneous adverse drug reaction (ADR) reports were used to identify cases of IMDILI. Network map tools were used to compare the known and predicted protein interactions of each of the probe drugs and to explore the interactions that the drugs have in common. The IMDILI probe set was then used to develop a pharmacophore model, which became the starting point for identifying potential toxicity targets for IMDILI. Pharmacophore screening results demonstrated similarities between the probe IMDILI set of drugs and Toll-Like Receptor 7 (TLR7) agonists, suggesting TLR7 as a potential toxicity target. This thesis highlights the potential for multidisciplinary approaches in the study of complex diseases. Such approaches are particularly helpful for rare diseases where little knowledge is available, and may provide key insights into mechanisms of toxicity that cannot be gleaned from a single-discipline study.