3,080 research outputs found

    Efficient oligonucleotide probe selection for pan-genomic tiling arrays

    Background: Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome. Results: This paper presents a new probe selection algorithm (PanArray) that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pan-genome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000-probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage. Conclusion: PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on a single microarray chip. These unique pan-genome tiling arrays provide maximum flexibility for the analysis of both known and uncharacterized strains. https://doi.org/10.1186/1471-2105-10-29
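
The idea of selecting conserved probes first and adding further probes only where strains diverge can be illustrated with a small sketch. This is not PanArray's published implementation: it substitutes a plain greedy set-cover heuristic at onefold coverage, treats every distinct k-mer as a candidate probe, and ignores reverse complements, melting temperature, and cross-hybridization. The function name `select_tiling_probes` and the toy sequences are hypothetical.

```python
from collections import defaultdict


def select_tiling_probes(genomes, k):
    """Greedy set-cover sketch: choose k-mers until every base is covered once.

    Assumption: each genome is a plain string at least k characters long.
    """
    if any(len(seq) < k for seq in genomes):
        raise ValueError("every genome must be at least k bases long")

    # Map each distinct k-mer to every (genome index, start) where it occurs.
    occurrences = defaultdict(list)
    for g, seq in enumerate(genomes):
        for start in range(len(seq) - k + 1):
            occurrences[seq[start:start + k]].append((g, start))

    # Bases still lacking a probe, stored as (genome index, position).
    uncovered = {(g, p) for g, seq in enumerate(genomes) for p in range(len(seq))}

    def newly_covered(kmer):
        # Bases this k-mer would cover that no chosen probe covers yet.
        hit = {(g, p) for g, s in occurrences[kmer] for p in range(s, s + k)}
        return hit & uncovered

    # Sorted once so that ties are broken deterministically (lexicographically).
    candidates = sorted(occurrences)
    probes = []
    while uncovered:
        # K-mers shared by several strains cover the most bases, so they win first.
        best = max(candidates, key=lambda kmer: len(newly_covered(kmer)))
        probes.append(best)
        uncovered -= newly_covered(best)
    return probes


# Two toy "strains" that differ at a single base.
probes = select_tiling_probes(["ACGTACGT", "ACGTTCGT"], k=4)
print(probes)  # ['ACGT', 'TCGT']
```

In the toy run, the shared k-mer `ACGT` is chosen first because it covers bases in both sequences, and a single extra probe spans the polymorphic region of the second sequence. A real design would also need overlapping probes for multi-fold coverage, which this sketch omits.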

    In silico microarray probe design for diagnosis of multiple pathogens

    Background: With multiple strains of various pathogens being sequenced, it is necessary to develop high-throughput methods that can simultaneously process multiple bacterial or viral genomes to find common fingerprints as well as fingerprints that are unique to each individual genome. We present algorithmic enhancements to an existing single-genome pipeline that allow for efficient design of microarray probes common to groups of target genomes. The enhanced pipeline takes advantage of the similarities in the input genomes to narrow the search to short, nonredundant regions of the target genomes and thereby significantly reduces the computation time. The pipeline also computes a three-state hybridization matrix, which gives the expected hybridization of each probe with each target. Results: Design of microarray probes for eight pathogenic Burkholderia genomes shows that the multiple-genome pipeline is nearly four times faster than the single-genome pipeline for this application. The probes designed for these eight genomes were experimentally tested with one non-target and three target genomes. Hybridization experiments show that fewer than 10% of the designed probes cross-hybridize with non-targets. Also, more than 65% of the probes designed to identify all Burkholderia mallei and B. pseudomallei strains successfully hybridize with a B. pseudomallei strain not used for probe design. Conclusion: The savings in runtime suggest that the enhanced pipeline can be used to design fingerprints for tens or even hundreds of related genomes in a single run. Hybridization results with an unsequenced B. pseudomallei strain indicate that the designed probes might be useful in identifying unsequenced strains of B. mallei and B. pseudomallei.

    Whole-genome sequence analysis for pathogen detection and diagnostics

    This dissertation focuses on computational methods for improving the accuracy of commonly used nucleic acid tests for pathogen detection and diagnostics. Three specific biomolecular techniques are addressed: polymerase chain reaction, microarray comparative genomic hybridization, and whole-genome sequencing. These methods are potentially the future of diagnostics, but each requires sophisticated computational design or analysis to operate effectively. This dissertation presents novel computational methods that unlock the potential of these diagnostics by efficiently analyzing whole-genome DNA sequences. Improvements in the accuracy and resolution of each of these diagnostic tests promise more effective diagnosis of illness and rapid detection of pathogens in the environment. For designing real-time detection assays, an efficient data structure and search algorithm are presented to identify the most distinguishing sequences of a pathogen that are absent from all other sequenced genomes. Results are presented that show these "signature" sequences can be used to detect pathogens in complex samples and differentiate them from their non-pathogenic, phylogenetic near neighbors. For microarrays, novel pan-genomic design and analysis methods are presented for the characterization of unknown microbial isolates. To demonstrate the effectiveness of these methods, pan-genomic arrays are applied to the study of multiple strains of the foodborne pathogen Listeria monocytogenes, revealing new insights into the diversity and evolution of the species. Finally, multiple methods are presented for the validation of whole-genome sequence assemblies, which are capable of identifying assembly errors in even finished genomes. These validated assemblies provide the ultimate nucleic acid diagnostic, revealing the entire sequence of a genome.

    Two new rapid SNP-typing methods for classifying Mycobacterium tuberculosis complex into the main phylogenetic lineages

    There is increasing evidence that strain variation in Mycobacterium tuberculosis complex (MTBC) might influence the outcome of tuberculosis infection and disease. To assess genotype-phenotype associations, phylogenetically robust molecular markers and appropriate genotyping tools are required. Most current genotyping methods for MTBC are based on mobile or repetitive DNA elements. Because these elements are prone to convergent evolution, the corresponding genotyping techniques are suboptimal for phylogenetic studies and strain classification. By contrast, single nucleotide polymorphisms (SNPs) are ideal markers for classifying MTBC into phylogenetic lineages, as they exhibit very low degrees of homoplasy. In this study, we developed two complementary SNP-based genotyping methods to classify strains into the six main human-associated lineages of MTBC, the 'Beijing' sublineage, and the clade comprising Mycobacterium bovis and Mycobacterium caprae. Phylogenetically informative SNPs were obtained from 22 MTBC whole-genome sequences. The first assay, referred to as MOL-PCR, is a ligation-dependent PCR with signal detection by fluorescent microspheres and a Luminex flow cytometer, which simultaneously interrogates eight SNPs. The second assay is based on six individual TaqMan real-time PCR assays for singleplex SNP-typing. We compared MOL-PCR and TaqMan results in two panels of clinical MTBC isolates. Both methods agreed fully when assigning 36 well-characterized strains into the main phylogenetic lineages. The sensitivity in allele-calling was 98.6% and 98.8% for MOL-PCR and TaqMan, respectively. Typing of an additional panel of 78 unknown clinical isolates revealed 99.2% and 100% sensitivity in allele-calling, respectively, and 100% agreement in lineage assignment between both methods. While MOL-PCR and TaqMan are both highly sensitive and specific, MOL-PCR is ideal for classification of isolates with no previous information, whereas TaqMan is faster for confirmation. Furthermore, both methods are rapid, flexible and comparably inexpensive.

    Genomic insights into members of the candidate phylum Hyd24-12 common in mesophilic anaerobic digesters

    Members of the candidate phylum Hyd24-12 are globally distributed, but no genomic information or knowledge about their morphology, physiology or ecology is available. In this study, members of the Hyd24-12 lineage were shown to be present and abundant in full-scale mesophilic anaerobic digesters at Danish wastewater treatment facilities. In some samples, a member of the Hyd24-12 lineage was one of the most abundant genus-level bacterial taxa, accounting for up to 8% of the bacterial biomass. Three closely related and near-complete genomes were retrieved using metagenome sequencing of full-scale anaerobic digesters. Genome annotation and metabolic reconstruction showed that they are Gram-negative bacteria likely involved in acidogenesis, producing acetate and hydrogen from fermentation of sugars, and may play a role in the cycling of sulphur in the digesters. Fluorescence in situ hybridization revealed single rod-shaped cells dispersed within the flocs. The genomic information forms a foundation for a more detailed understanding of their role in anaerobic digestion and provides the first insight into a hitherto undescribed branch in the tree of life.

    Model-based probe set optimization for high-performance microarrays

    A major challenge in microarray design is the selection of highly specific oligonucleotide probes for all targeted genes of interest, while maintaining thermodynamic uniformity at the hybridization temperature. We introduce a novel microarray design framework (Thermodynamic Model-based Oligo Design Optimizer, TherMODO) that for the first time incorporates a number of advanced modelling features: (i) A model of position-dependent labelling effects that is quantitatively derived from experiment. (ii) Multi-state thermodynamic hybridization models of probe binding behaviour, including potential cross-hybridization reactions. (iii) A fast calibrated sequence-similarity-based heuristic for cross-hybridization prediction supporting large-scale designs. (iv) A novel compound score formulation for the integrated assessment of multiple probe design objectives. In contrast to a greedy search for probes meeting parameter thresholds, this approach permits an optimization at the probe set level and facilitates the selection of highly specific probe candidates while maintaining probe set uniformity. (v) Lastly, a flexible target grouping structure allows easy adaptation of the pipeline to a variety of microarray application scenarios. The algorithm and features are discussed and demonstrated on actual design runs. Source code is available on request.

    High-resolution spatial and genomic characterization of coral-associated microbial aggregates in the coral Stylophora pistillata

    Bacteria commonly form aggregates in a range of coral species [termed coral-associated microbial aggregates (CAMAs)], although these structures remain poorly characterized despite extensive efforts studying the coral microbiome. Here, we comprehensively characterize CAMAs associated with Stylophora pistillata and quantify their cell abundance. Our analysis reveals that multiple Endozoicomonas phylotypes coexist inside a single CAMA. Nanoscale secondary ion mass spectrometry imaging revealed that the Endozoicomonas cells were enriched with phosphorus, with the elemental compositions of CAMAs different from coral tissues and endosymbiotic Symbiodiniaceae, highlighting a role in sequestering and cycling phosphate between coral holobiont partners. Consensus metagenome-assembled genomes of the two dominant Endozoicomonas phylotypes confirmed their metabolic potential for polyphosphate accumulation along with genomic signatures including type VI secretion systems allowing host association. Our findings provide unprecedented insights into Endozoicomonas-dominated CAMAs and the first direct physiological and genomic linked evidence of their biological role in the coral holobiont.

    HiSpOD: probe design for functional DNA microarrays.

    MOTIVATION: The use of DNA microarrays allows the monitoring of the extreme microbial diversity encountered in complex samples, such as environmental ones, as well as that of their functional capacities. However, no probe design software currently available is adapted to easily design efficient and explorative probes for functional gene arrays. RESULTS: We present a new efficient functional microarray probe design algorithm called HiSpOD (High Specific Oligo Design). It uses individual nucleic sequences or consensus sequences produced by multiple alignments to design highly specific probes. To bypass the crucial problem of cross-hybridization, probe specificity is assessed by similarity search against a large formatted database dedicated to microbial communities containing about 10 million coding sequences (CDS). For experimental validation, a microarray targeting genes encoding enzymes involved in chlorinated solvent biodegradation was built. The results obtained from a contaminated environmental sample proved the specificity and the sensitivity of probes designed with the HiSpOD program. AVAILABILITY: http://fc.isima.fr/~g2im/hispod/

    Development and evaluation of a high-throughput, low-cost genotyping platform based on oligonucleotide microarrays in rice

    Background: We report the development of a microarray platform for rapid and cost-effective genetic mapping, and its evaluation using rice as a model. In contrast to methods employing whole-genome tiling microarrays for genotyping, our method is based on low-cost spotted microarray production, focusing only on known polymorphic features. Results: We have produced a genotyping microarray for rice, comprising 880 single feature polymorphism (SFP) elements derived from insertions/deletions identified by aligning genomic sequences of the japonica cultivar Nipponbare and the indica cultivar 93-11. The SFPs were experimentally verified by hybridization with labeled genomic DNA prepared from the two cultivars. Using the genotyping microarrays, we found high levels of polymorphism across diverse rice accessions, and were able to classify all five subpopulations of rice with high bootstrap support. The microarrays were used for mapping of a gene conferring resistance to Magnaporthe grisea, the causative organism of rice blast disease, by quantitative genotyping of samples from a recombinant inbred line population pooled by phenotype. Conclusion: We anticipate this microarray-based genotyping platform, based on its low cost-per-sample, to be particularly useful in applications requiring whole-genome molecular marker coverage across large numbers of individuals.