    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    by DNA

    Sequence-dependent gating of an ion channel

    Highly Accurate Classification of Watson-Crick Basepairs on Termini of Single DNA Molecules

    We introduce a computational method for classification of individual DNA molecules measured by an α-hemolysin channel detector. We show classification with better than 99% accuracy for DNA hairpin molecules that differ only in their terminal Watson-Crick basepairs. Signal classification was done in silico to establish performance metrics (i.e., where train and test data were of known type, via single-species data files). It was then performed in solution to assay real mixtures of DNA hairpins. Hidden Markov Models (HMMs) were used with Expectation/Maximization for denoising and for associating a feature vector with the ionic current blockade of the DNA molecule. Support Vector Machines (SVMs) were used as discriminators, and were the focus of off-line training. A multiclass SVM architecture was designed to place less discriminatory load on weaker discriminators, and novel SVM kernels were used to boost discrimination strength. Tuning the HMMs and SVMs enabled biophysical analysis of the captured molecule states and state transitions; structure revealed in the biophysical analysis was used for better feature selection.

    Discrimination among individual Watson–Crick base pairs at the termini of single DNA hairpin molecules

    Nanoscale α-hemolysin pores can be used to analyze individual DNA or RNA molecules. Serial examination of hundreds to thousands of molecules per minute is possible using ionic current impedance as the measured property. In a recent report, we showed that a nanopore device coupled with machine learning algorithms could automatically discriminate among the four combinations of Watson–Crick base pairs and their orientations at the ends of individual DNA hairpin molecules. Here we use kinetic analysis to demonstrate that ionic current signatures caused by these hairpin molecules depend on the number of hydrogen bonds within the terminal base pair, stacking between the terminal base pair and its nearest neighbor, and 5′ versus 3′ orientation of the terminal bases independent of their nearest neighbors. This report constitutes evidence that single Watson–Crick base pairs can be identified within individual unmodified DNA hairpin molecules based on their dynamic behavior in a nanoscale pore.