53 research outputs found

    Simulate_PCR for amplicon prediction and annotation from multiplex, degenerate primers and probes

    BACKGROUND: Pairing up primers to amplify desired targets and avoid undesired cross reactions can be a combinatorial challenge. Effective prediction of specificity and inclusivity from multiplexed primers and TaqMan®/Luminex® probes is a critical step in PCR design. RESULTS: Code is described to identify all primer and probe combinations from a list of unpaired, unordered candidates that should produce a product. It predicts and extracts all amplicon sequences in a large sequence database from a list of primers and probes, allowing degenerate bases and user-specified levels of primer-target mismatch tolerance. Amplicons hit by TaqMan®/Luminex® probes are indicated, and products may be annotated with gene information from NCBI. Fragment length distributions are calculated to predict electrophoretic gel banding patterns. CONCLUSIONS: Simulate_PCR is the only freely available software that can be run from the command line for high-throughput applications and that can calculate all products from large lists of primers and probes compared to a large sequence database such as nt. It requires no prior knowledge of how primers should be paired. Degenerate bases are allowed and entire amplicon sequences are extracted and annotated with gene information. Examples are provided for sets of TaqMan®/Luminex® PCR signatures predicted to amplify all HIV-1 genomes, all Coronaviridae genomes, and a group of antibiotic resistance genes. The software is a command-line Perl script freely available as open source.
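As a rough illustration of the matching step this abstract describes, the Python sketch below scans a template for degenerate primer sites with a mismatch tolerance and tries every ordered pair from an unpaired primer list. It is not Simulate_PCR's implementation, whose internals the abstract does not describe. The function names, the `max_length` cutoff, and the toy sequences are assumptions made for the example, and the template is assumed to contain only A, C, G, and T.

```python
from collections import Counter

# IUPAC nucleotide codes: each primer symbol maps to the bases it may pair as.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement that also handles degenerate symbols."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(primer, template, max_mismatches=0):
    """Start positions where primer matches template within max_mismatches."""
    hits = []
    for start in range(len(template) - len(primer) + 1):
        mismatches = 0
        for p, t in zip(primer, template[start:start + len(primer)]):
            if t not in IUPAC[p]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:  # inner loop finished without exceeding the tolerance
            hits.append(start)
    return hits


def predict_amplicons(primers, template, max_mismatches=0, max_length=1000):
    """Try every ordered pair (forward, reverse) from an unpaired primer list."""
    amplicons = []
    for fwd in primers:
        for rev in primers:
            # The reverse primer anneals to the opposite strand, so on the
            # given strand we look for its reverse complement.
            rev_site = reverse_complement(rev)
            for f_start in find_sites(fwd, template, max_mismatches):
                for r_start in find_sites(rev_site, template, max_mismatches):
                    end = r_start + len(rev)
                    length = end - f_start
                    if len(fwd) + len(rev) <= length <= max_length:
                        amplicons.append((fwd, rev, f_start, template[f_start:end]))
    return amplicons


# Toy data (hypothetical): one forward site, one reverse site.
template = "TTTTACGTACGAGGGGGGGGGGTTCAGCCAAAAA"
primers = ["TGGCTGAA", "ACGTRCGA"]  # unordered, unpaired; R matches A or G
products = predict_amplicons(primers, template)

# Fragment length distribution, the quantity used to anticipate gel bands.
length_distribution = Counter(len(seq) for _, _, _, seq in products)
```

Scanning every position is only practical for short templates. Against a database the size of nt, an indexed search would be needed instead of this brute-force loop.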

    A microbial detection array (MDA) for viral and bacterial detection

    BACKGROUND: Identifying the bacteria and viruses present in a complex sample is useful in disease diagnostics, product safety, environmental characterization, and research. Array-based methods have proven utility to detect in a single assay at a reasonable cost any microbe from the thousands that have been sequenced. METHODS: We designed a pan-Microbial Detection Array (MDA) to detect all known viruses (including phages), bacteria and plasmids and developed a novel statistical analysis method to identify mixtures of organisms from complex samples hybridized to the array. The array has broader coverage of bacterial and viral targets and is based on more recent sequence data and more probes per target than other microbial detection/discovery arrays in the literature. Family-specific probes were selected for all sequenced viral and bacterial complete genomes, segments, and plasmids. Probes were designed to tolerate some sequence variation to enable detection of divergent species with homology to sequenced organisms, and to have no significant matches to the human genome sequence. RESULTS: In blinded testing on spiked samples with single or multiple viruses, the MDA was able to correctly identify species or strains. In clinical fecal, serum, and respiratory samples, the MDA was able to detect and characterize multiple viruses, phage, and bacteria in a sample to the family and species level, as confirmed by PCR. CONCLUSIONS: The MDA can be used to identify the suite of viruses and bacteria present in complex samples

    Draft versus finished sequence data for DNA and protein diagnostic signature development

    Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors or NNs) to sequence. We use SAP to assess whether draft data are sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10^-3 to 10^-5 (∼8× coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ∼1% (3× to 6× coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures.
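To make the link between per-base error rate and lost signature regions concrete, here is a minimal Python sketch of simulated draft data. SAP's actual error model is not described in this abstract, so the substitution-only model, the function names, and the window length `k` are all assumptions. Under independent errors at rate e, a k-base window is error-free with probability (1 - e)^k, and the second function estimates that fraction empirically.

```python
import random


def add_sequencing_errors(seq, error_rate, rng):
    """Substitute each base independently with probability error_rate.

    Assumption: errors are substitutions only (no insertions or deletions).
    """
    out = []
    for base in seq:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def surviving_fraction(seq, noisy, k):
    """Fraction of k-base windows that are identical in seq and noisy."""
    windows = len(seq) - k + 1
    intact = sum(seq[i:i + k] == noisy[i:i + k] for i in range(windows))
    return intact / windows


# Hypothetical 5 kb genome, compared at the error rates quoted in the abstract.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(5000))
for rate in (1e-2, 1e-3, 1e-5):
    draft = add_sequencing_errors(genome, rate, rng)
    print(rate, surviving_fraction(genome, draft, k=25))
```

The printed fractions depend on the random seed, so no expected values are given here.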

    DNA signatures for detecting genetic engineering in bacteria

    Using newly designed computational tools, we show that, despite substantial shared sequences between natural plasmids and artificial vector sequences, a robust set of DNA oligomers can be identified that can differentiate artificial vector sequences from all available background viral and bacterial genomes and natural plasmids. We predict that these tools can achieve very high sensitivity and specificity rates for detecting new unsequenced vectors in microarray-based bioassays. Such DNA signatures could be important in detecting genetically engineered bacteria in environmental samples.

    Detection of ESKAPE bacterial pathogens at the point of care using isothermal DNA-based assays in a portable degas-actuated microfluidic diagnostic assay platform

    An estimated 1.5 billion microbial infections occur globally each year and result in ~4.6 million deaths. A technology gap associated with commercially available diagnostic tests in remote and underdeveloped regions prevents timely pathogen identification for effective antibiotic chemotherapies for infected patients. The result is a trial-and-error approach that is limited in effectiveness, increases risk for patients while contributing to antimicrobial drug resistance, and reduces the lifetime of antibiotics. This paper addresses this important diagnostic technology gap by describing a low-cost, portable, rapid, and easy-to-use microfluidic cartridge-based system for detecting the ESKAPE (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp.) bacterial pathogens that are most commonly associated with antibiotic resistance. The point-of-care molecular diagnostic system consists of a vacuum-degassed microfluidic cartridge preloaded with lyophilized recombinase polymerase amplification (RPA) assays and a small portable battery-powered electronic incubator/reader. The isothermal RPA assays detect the targeted ESKAPE pathogens with high sensitivity (e.g., a limit of detection of ~10 nucleic acid molecules) that is comparable to that of current PCR-based assays, and they offer advantages in power consumption, engineering, and robustness, which are three critical elements required for the point-of-care setting.

    Draft versus finished sequence data for DNA and protein diagnostic signature development

    ABSTRACT: Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors, or NNs) to sequence. We use SAP to assess whether draft data is sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10^-3 to 10^-5 (~8x coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ~1% (3x to 6x coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures.

    INTRODUCTION: Draft sequencing requires that the order of base pairs in cloned fragments of a genome be determined usually at least 4 times (4x depth of coverage) at each position for a minimum degree of draft accuracy. This information is assembled into contigs, or fragments of the genome that cannot be joined further due to lack of sequence information across gaps between the contigs. To generate high-quality draft, usually about 8x coverage is optimal (1).
Finished sequence, without gaps or ambiguous base calls, usually requires 8x to 10x coverage, along with additional analyses, often manual, to orient the contigs relative to one another and to close the gaps between them in a process called finishing. In fact, it has been stated that "the defining distinction of draft sequencing is the avoidance of significant human intervention" (1), although there are computational tools that may also be capable of automated finishing in some circumstances (2). While some tabulate the cost differential between high-quality draft and finished sequences to be 3- to 4-fold, and the speed differential to be over 10-fold (1), others state that the cost differential is a more modest 1.3- to 1.5-fold (3). In either case, draft sequencing is cheaper and faster. Experts have debated whether finished sequencing is always necessary, considering the higher costs (1,3,4). Thus, here we set out to determine whether draft sequence data is adequate for the computational prediction of DNA and protein diagnostic signatures. By a "signature" we mean a short region of sequence that is sufficient to uniquely identify an organism down to the species level, without false negatives due to strain variation or false positives due to cross reaction with close phylogenetic relatives. In addition, for DNA signatures, we require that the signature be suitable for a TaqMan reaction (e.g., composed of two primers and a probe of the desired Tm's). Limited funds and facilities in which to sequence biothreat pathogens mean that decision makers must choose wisely which and how many organisms to sequence. Money and time saved as a result of draft rather than finished sequencing enable more target organisms, more isolates of the target, and more NNs of the target to be sequenced. However, if draft data does not facilitate the generation of high-quality signatures for detection, the tradeoff of quantity over quality will not be worth it.
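The definition of a signature given above (present in every target strain, absent from close relatives) can be sketched with k-mer sets in Python. This is only an illustration of the definition, not the SAP method. The function names, the choice of exact k-mer matching, and the toy sequences are assumptions, and the TaqMan suitability checks (primer and probe Tm) are left out.

```python
def kmers(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_signatures(targets, near_neighbors, k):
    """k-mers conserved in every target and absent from every near neighbor.

    Conservation across targets guards against false negatives from strain
    variation; subtracting near-neighbor k-mers guards against false
    positives from cross reaction. Requires at least one target sequence.
    """
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    background = set().union(*(kmers(n, k) for n in near_neighbors))
    return conserved - background


# Toy sequences (hypothetical), far shorter than real genomes.
targets = ["AAACCCGGGTTT", "TTACCCGGGTAA"]
near_neighbors = ["GGACCCGGA"]
signatures = candidate_signatures(targets, near_neighbors, k=6)
```

Because matching here is exact, a single sequencing error inside a conserved region of one target removes every k-mer that overlaps it from the intersection, which is one way to see why target genome quality matters more than near-neighbor quality.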
We used the Sequencing Analysis Pipeline (SAP) (5,6) to compare the value of finished sequence, real draft sequence, and simulated draft sequence of different qualities for the computational prediction of DNA and protein signatures for pathogen detection/diagnostics. Marburg and variola viruses were used as model organisms for these analyses, due to the availability of multiple genomes for these organisms. We hope that variola may serve as a guide for making predictions about bacteria, in which the genomes are substantially larger, and thus the cost of sequencing is much higher than for viruses. Variola was selected as the best available surrogate for bacteria at the time we began these analyses because: 1) it is double-stranded DNA; 2) it has a relatively low mutation rate, more like bacteria than like the RNA or shorter DNA viruses that have higher mutation rates and thus higher levels of variation; 3) it is very long for a virus, albeit shorter than a bacterial genom

    A Functional Gene Array for Detection of Bacterial Virulence Elements

    Emerging known and unknown pathogens create profound threats to public health. Platforms for rapid detection and characterization of microbial agents are critically needed to prevent and respond to disease outbreaks. Available detection technologies cannot provide broad functional information about known or novel organisms. As a step toward developing such a system, we have produced and tested a series of high-density functional gene arrays to detect elements of virulence and antibiotic resistance mechanisms. Our first-generation array targets genes from Escherichia coli strains K12 and CFT073, Enterococcus faecalis and Staphylococcus aureus. We determined optimal probe design parameters for gene family detection and discrimination. When tested with organisms at varying phylogenetic distances from the four target strains, the array detected orthologs for the majority of targeted gene families present in bacteria belonging to the same taxonomic family. In combination with whole-genome amplification, the array detects femtogram concentrations of purified DNA, either spiked into an aerosol sample background, or in combinations from one or more of the four target organisms. This is the first report of a high-density NimbleGen microarray system targeting microbial antibiotic resistance and virulence mechanisms. By targeting virulence gene families as well as genes unique to specific biothreat agents, these arrays will provide important data about the pathogenic potential and drug resistance profiles of unknown organisms in environmental samples.