29 research outputs found

    A microbial detection array (MDA) for viral and bacterial detection

    BACKGROUND: Identifying the bacteria and viruses present in a complex sample is useful in disease diagnostics, product safety, environmental characterization, and research. Array-based methods have proven useful for detecting, in a single assay and at a reasonable cost, any microbe among the thousands that have been sequenced. METHODS: We designed a pan-Microbial Detection Array (MDA) to detect all known viruses (including phages), bacteria and plasmids and developed a novel statistical analysis method to identify mixtures of organisms from complex samples hybridized to the array. The array has broader coverage of bacterial and viral targets and is based on more recent sequence data and more probes per target than other microbial detection/discovery arrays in the literature. Family-specific probes were selected for all sequenced viral and bacterial complete genomes, segments, and plasmids. Probes were designed to tolerate some sequence variation to enable detection of divergent species with homology to sequenced organisms, and to have no significant matches to the human genome sequence. RESULTS: In blinded testing on spiked samples with single or multiple viruses, the MDA was able to correctly identify species or strains. In clinical fecal, serum, and respiratory samples, the MDA was able to detect and characterize multiple viruses, phage, and bacteria in a sample to the family and species level, as confirmed by PCR. CONCLUSIONS: The MDA can be used to identify the suite of viruses and bacteria present in complex samples.

    Draft versus finished sequence data for DNA and protein diagnostic signature development

    Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors or NNs) to sequence. We use SAP to assess whether draft data are sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10^−3 to 10^−5 (∼8× coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ∼1% (3× to 6× coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures.
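
To give a feel for the error rates quoted in this abstract, here is a minimal Python sketch that injects substitution errors into a sequence at a chosen per-base rate. It is an illustration only: the abstract does not describe SAP's actual error model, so the uniform substitution model (no insertions or deletions), the random 100,000-base test sequence, and the function names `add_sequencing_errors` and `count_mismatches` are all assumptions.

```python
import random


def add_sequencing_errors(seq, error_rate, rng):
    """Return a copy of seq in which each base is substituted with
    probability error_rate (assumed model: uniform substitutions only)."""
    bases = "ACGT"
    out = []
    for base in seq:
        if rng.random() < error_rate:
            # Replace with one of the other bases, chosen uniformly.
            out.append(rng.choice([b for b in bases if b != base]))
        else:
            out.append(base)
    return "".join(out)


def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical test sequence; not a real genome.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(100_000))

# Rates spanning the low-quality (~1%) to high-quality (10^-5) draft range.
for rate in (1e-2, 1e-3, 1e-5):
    draft = add_sequencing_errors(genome, rate, rng)
    print(f"rate={rate:g}: {count_mismatches(genome, draft)} mismatches")
```

Under this model the expected number of mismatches is simply the sequence length times the error rate, so a 1% draft of a 100,000-base sequence carries roughly a thousand errors, while a 10^−5 draft carries about one.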

    DNA signatures for detecting genetic engineering in bacteria

    Using newly designed computational tools we show that, despite substantial shared sequences between natural plasmids and artificial vector sequences, a robust set of DNA oligomers can be identified that can differentiate artificial vector sequences from all available background viral and bacterial genomes and natural plasmids. We predict that these tools can achieve very high sensitivity and specificity rates for detecting new unsequenced vectors in microarray-based bioassays. Such DNA signatures could be important in detecting genetically engineered bacteria in environmental samples

    Draft versus finished sequence data for DNA and protein diagnostic signature development

    Abstract: Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors, or NNs) to sequence. We use SAP to assess whether draft data is sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10^-3 to 10^-5 (~8x coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ~1% (3x to 6x coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures.
Introduction: Draft sequencing requires that the order of base pairs in cloned fragments of a genome be determined usually at least 4 times (4x depth of coverage) at each position for a minimum degree of draft accuracy. This information is assembled into contigs, or fragments of the genome that cannot be joined further due to lack of sequence information across gaps between the contigs. To generate high-quality draft, usually about 8x coverage is optimal (1).
Finished sequence, without gaps or ambiguous base calls, usually requires 8x to 10x coverage, along with additional analyses, often manual, to orient the contigs relative to one another and to close the gaps between them in a process called finishing. In fact, it has been stated that "the defining distinction of draft sequencing is the avoidance of significant human intervention" (1), although there are computational tools that may also be capable of automated finishing in some circumstances (2). While some tabulate the cost differential between high-quality draft and finished sequences to be 3- to 4-fold, and the speed differential to be over 10-fold (1), others state that the cost differential is a more modest 1.3- to 1.5-fold (3). In either case, draft sequencing is cheaper and faster. Experts have debated whether finished sequencing is always necessary, considering the higher costs (1,3,4). Thus, here we set out to determine whether draft sequence data is adequate for the computational prediction of DNA and protein diagnostic signatures. By a "signature" we mean a short region of sequence that is sufficient to uniquely identify an organism down to the species level, without false negatives due to strain variation or false positives due to cross reaction with close phylogenetic relatives. In addition, for DNA signatures, we require that the signature be suitable for a TaqMan reaction (e.g., composed of two primers and a probe with the desired Tm values). Limited funds and facilities in which to sequence biothreat pathogens mean that decision makers must choose wisely which and how many organisms to sequence. Money and time saved as a result of draft rather than finished sequencing enable more target organisms, more isolates of the target, and more NNs of the target to be sequenced. However, if draft data does not facilitate the generation of high-quality signatures for detection, the tradeoff of quantity over quality will not be worth it.
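
The coverage depths mentioned above (4x, 8x, 8x to 10x) can be related to expected gaps with a minimal Python sketch. It uses the idealized Lander-Waterman approximation, which is not taken from this text: if reads land uniformly at random, the expected fraction of bases left uncovered at mean coverage c is exp(-c). The 200,000-base genome length is a hypothetical value chosen for illustration, and real projects deviate from this model because of cloning bias and repeats.

```python
import math


def expected_uncovered_fraction(coverage):
    """Expected fraction of bases with zero reads at a given mean coverage,
    under the idealized assumption of uniformly random read placement."""
    return math.exp(-coverage)


def expected_uncovered_bases(genome_length, coverage):
    """Expected number of uncovered bases for a genome of the given length."""
    return genome_length * expected_uncovered_fraction(coverage)


# Hypothetical genome length, for illustration only.
GENOME_LENGTH = 200_000

for depth in (4, 6, 8, 10):
    missing = expected_uncovered_bases(GENOME_LENGTH, depth)
    print(f"{depth}x coverage: ~{missing:.1f} bases expected uncovered")
```

In this simplified picture, doubling coverage from 4x to 8x shrinks the expected uncovered fraction from under 2% to a few hundredths of a percent, which is one way to see why higher coverage yields fewer gaps between contigs.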
We used the Sequencing Analysis Pipeline (SAP) (5,6) to compare the value of finished sequence, real draft sequence, and simulated draft sequence of different qualities for the computational prediction of DNA and protein signatures for pathogen detection/diagnostics. Marburg and variola viruses were used as model organisms for these analyses, due to the availability of multiple genomes for these organisms. We hope that variola may serve as a guide for making predictions about bacteria, in which the genomes are substantially larger, and thus the cost of sequencing is much higher than for viruses. Variola was selected as the best available surrogate for bacteria at the time we began these analyses because: 1) it is double-stranded DNA; 2) it has a relatively low mutation rate, more like bacteria than like the RNA or shorter DNA viruses that have higher mutation rates and thus higher levels of variation; 3) it is very long for a virus, albeit shorter than a bacterial genome

    Detection of ESKAPE bacterial pathogens at the point of care using isothermal DNA-based assays in a portable degas-actuated microfluidic diagnostic assay platform

    An estimated 1.5 billion microbial infections occur globally each year and result in ~4.6 million deaths. A technology gap associated with commercially available diagnostic tests in remote and underdeveloped regions prevents timely pathogen identification for effective antibiotic chemotherapies for infected patients. The result is a trial-and-error approach that is limited in effectiveness, increases risk for patients while contributing to antimicrobial drug resistance, and reduces the lifetime of antibiotics. This paper addresses this important diagnostic technology gap by describing a low-cost, portable, rapid, and easy-to-use microfluidic cartridge-based system for detecting the ESKAPE (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp.) bacterial pathogens that are most commonly associated with antibiotic resistance. The point-of-care molecular diagnostic system consists of a vacuum-degassed microfluidic cartridge preloaded with lyophilized recombinase polymerase amplification (RPA) assays and a small portable battery-powered electronic incubator/reader. The isothermal RPA assays detect the targeted ESKAPE pathogens with high sensitivity (e.g., a limit of detection of ~10 nucleic acid molecules) that is comparable to that of current PCR-based assays, and they offer advantages in power consumption, engineering, and robustness, which are three critical elements required for the point-of-care setting.

    Cynomolgus Macaque as an Animal Model for Severe Acute Respiratory Syndrome

    BACKGROUND: The emergence of severe acute respiratory syndrome (SARS) in 2002 and 2003 affected global health and caused major economic disruption. Adequate animal models are required to study the underlying pathogenesis of SARS-associated coronavirus (SARS-CoV) infection and to develop effective vaccines and therapeutics. We report the first findings of measurable clinical disease in nonhuman primates (NHPs) infected with SARS-CoV. METHODS AND FINDINGS: In order to characterize clinically relevant parameters of SARS-CoV infection in NHPs, we infected cynomolgus macaques with SARS-CoV in three groups: Group I was infected in the nares and bronchus, group II in the nares and conjunctiva, and group III intravenously. Nonhuman primates in groups I and II developed mild to moderate symptomatic illness. All NHPs demonstrated evidence of viral replication and developed neutralizing antibodies. Chest radiographs from several animals in groups I and II revealed unifocal or multifocal pneumonia that peaked between days 8 and 10 postinfection. Clinical laboratory tests were not significantly changed. Overall, inoculation by a mucosal route produced more prominent disease than did intravenous inoculation. Half of the group I animals were infected with a recombinant infectious clone SARS-CoV derived from the SARS-CoV Urbani strain. This infectious clone produced disease indistinguishable from wild-type Urbani strain. CONCLUSIONS: SARS-CoV infection of cynomolgus macaques did not reproduce the severe illness seen in the majority of adult human cases of SARS; however, our results suggest similarities to the milder syndrome of SARS-CoV infection characteristically seen in young children

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
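
As a rough worked example of what the quoted figures imply, the Python sketch below turns the abstract's numbers (2.85 billion nucleotides, 341 gaps, about 1 error per 100,000 bases) into an expected error count and a mean length of uninterrupted sequence. The function names are hypothetical, and the calculation treats the quoted rate as exact and uniform across the genome, which is a simplifying assumption.

```python
def expected_error_count(n_bases, bases_per_error):
    """Expected number of errors if one error occurs per bases_per_error bases."""
    return n_bases / bases_per_error


def mean_segment_length(n_bases, n_gaps):
    """Mean length of the contiguous segments separated by n_gaps gaps
    (n_gaps gaps split the sequence into n_gaps + 1 segments)."""
    return n_bases / (n_gaps + 1)


# Figures quoted in the abstract for Build 35.
BUILD35_BASES = 2_850_000_000
BUILD35_GAPS = 341
BASES_PER_ERROR = 100_000

errors = expected_error_count(BUILD35_BASES, BASES_PER_ERROR)
segment = mean_segment_length(BUILD35_BASES, BUILD35_GAPS)
print(f"Expected errors genome-wide: about {errors:,.0f}")
print(f"Mean uninterrupted segment: about {segment / 1e6:.2f} Mb")
```

Taken at face value, the stated accuracy corresponds to roughly 28,500 expected errors across the whole sequence, with an average of a little over 8 million bases between gaps.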