46 research outputs found

    Biome representational in silico karyotyping

    Metagenomic characterization of complex biomes remains challenging. Here we describe a modification of digital karyotyping, biome representational in silico karyotyping (BRISK), as a general technique for analyzing a defined representation of all DNA present in a sample. BRISK utilizes a Type IIB DNA restriction enzyme to create a defined representation of 27-mer DNAs in a sample. Massively parallel sequencing of this representation allows for construction of high-resolution karyotypes and identification of multiple species within a biome. Application to normal human tissue demonstrated linear recovery of tags by chromosome. We apply this technique to the biome of the oral mucosa and find that more than 25% of recovered DNA is nonhuman. DNA from 41 microbial species could be identified from the oral mucosa of two subjects. Of recovered nonhuman sequences, fewer than 30% are currently annotated. We characterized seven prevalent unknown sequences by chromosome walking and found that these represent novel microbial sequences, including two likely derived from novel phage genomes. Application of BRISK to archival tissue from a nasopharyngeal carcinoma resulted in identification of Epstein-Barr virus infection. These results suggest that BRISK is a powerful technique for the analysis of complex microbiomes and potentially for pathogen discovery.

    A Year of Infection in the Intensive Care Unit: Prospective Whole Genome Sequencing of Bacterial Clinical Isolates Reveals Cryptic Transmissions and Novel Microbiota

    Bacterial whole genome sequencing holds promise as a disruptive technology in clinical microbiology, but it has not yet been applied systematically or comprehensively within a clinical context. Here, over the course of one year, we performed prospective collection and whole genome sequencing of nearly all bacterial isolates obtained from a tertiary care hospital’s intensive care units (ICUs). This unbiased collection of 1,229 bacterial genomes from 391 patients enables detailed exploration of several features of clinical pathogens. A sizable fraction of isolates identified as clinically relevant corresponded to previously undescribed species: 12% of isolates assigned a species-level classification by conventional methods actually qualified as distinct, novel genomospecies on the basis of genomic similarity. Pan-genome analysis of the most frequently encountered pathogens in the collection revealed substantial variation in pan-genome size (1,420 to 20,432 genes) and the rate of gene discovery (1 to 152 genes per isolate sequenced). Surprisingly, although potential nosocomial transmission of actively surveilled pathogens was rare, 8.7% of isolates belonged to genomically related clonal lineages that were present among multiple patients, usually with overlapping hospital admissions, and were associated with clinically significant infection in 62% of patients from whom they were recovered. Multi-patient clonal lineages were particularly evident in the neonatal care unit, where seven separate Staphylococcus epidermidis clonal lineages were identified, including one lineage associated with bacteremia in 5/9 neonates. Our study highlights key differences in the information made available by conventional microbiological practices versus whole genome sequencing, and motivates the further integration of microbial genome sequencing into routine clinical care.

    Clustering of sequenced isolates by genomic similarity.

    (A) Network diagram of all 1,229 sequenced isolates that could be assigned to one of 78 clusters on the basis of pairwise ANIb. Each node represents an individual isolate, and is colored black if robustly matching a previously reported genome (≥ 95% ANIb) or white if corresponding to a novel genomospecies. Nodes connected by a visible edge indicate pairwise ANIb values ≥ 95%. Edges connecting isolates within the same cluster are colored according to that cluster; edges connecting isolates that match multiple clusters are grey. Clusters are labeled according to the most detailed taxonomic classification given to isolates during conventional identification. (B) Network diagram of 419 isolates corresponding to novel genomospecies, assigned to 53 clusters on the basis of pairwise ANIb. Clusters are labeled as in (A). For both panels, the length of edges between nodes is not informative or proportional to ANIb values, and consequently neither is the placement of specific nodes or groups within the graph. The amount of connectivity among nodes indicates the basis of their inclusion with respect to specific groups.
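
    The thresholded network described in the caption can be illustrated with a minimal sketch. It assumes pairwise ANI values are already available as a dictionary keyed by isolate pairs (a hypothetical input format), draws an edge wherever the value is at least 95%, and groups isolates into connected components. The function names and toy values are invented for illustration, and connected components are only one plausible way to form clusters; the caption does not say the authors defined clusters this way.

```python
ANI_THRESHOLD = 95.0  # percent; the caption draws an edge at ANIb >= 95%


def threshold_edges(ani, threshold=ANI_THRESHOLD):
    """Return sorted isolate pairs whose pairwise ANI meets the threshold.

    `ani` maps frozenset({a, b}) -> percent identity (assumed input format).
    """
    return sorted(
        tuple(sorted(pair)) for pair, value in ani.items() if value >= threshold
    )


def connected_components(nodes, edges):
    """Group nodes into connected components with a simple union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving keeps trees shallow
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy values, invented for illustration only (not data from the study).
isolates = ["iso1", "iso2", "iso3", "iso4"]
ani = {
    frozenset({"iso1", "iso2"}): 98.7,
    frozenset({"iso2", "iso3"}): 96.1,
    frozenset({"iso1", "iso3"}): 94.2,
    frozenset({"iso1", "iso4"}): 80.3,
    frozenset({"iso2", "iso4"}): 79.9,
    frozenset({"iso3", "iso4"}): 81.0,
}

edges = threshold_edges(ani)
clusters = connected_components(isolates, edges)
print(edges)
print(clusters)
```

    In this toy input, iso1 and iso3 fall below the threshold with each other but still land in the same component because both link to iso2. This chaining is a property of the connected-components choice and may differ from how multi-cluster matches (the grey edges in the caption) were handled in the study.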