102 research outputs found

    Interpolative multidimensional scaling techniques for the identification of clusters in very large sequence sets

    Background: Modern pyrosequencing techniques make it possible to study complex bacterial populations, such as 16S rRNA, directly from environmental or clinical samples without the need for laboratory purification. Alignment of sequences across the resultant large data sets (100,000+ sequences) is of particular interest for the purpose of identifying potential gene clusters and families, but such analysis represents a daunting computational task. The aim of this work is the development of an efficient pipeline for the clustering of large sequence read sets. Methods: Pairwise alignment techniques are used here to calculate genetic distances between sequence pairs. These methods are pleasingly parallel and have been shown to reflect genetic distances in highly variable regions of rRNA genes more accurately than traditional multiple sequence alignment (MSA) approaches do. By utilizing Needleman-Wunsch (NW) pairwise alignment in conjunction with novel implementations of interpolative multidimensional scaling (MDS), we have developed an effective method for visualizing massive biosequence data sets and quickly identifying potential gene clusters. Results: This study demonstrates the use of interpolative MDS to obtain clustering results that are qualitatively similar to those obtained through full MDS, but with substantial cost savings. In particular, the wall clock time required to cluster a set of 100,000 sequences has been reduced from seven hours to less than one hour through the use of interpolative MDS. Conclusions: Although work remains to be done in selecting the optimal training set size for interpolative MDS, substantial computational cost savings will allow us to cluster much larger sequence sets in the future.
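
The pairwise-distance step described in the Methods above can be illustrated with a minimal sketch. The scoring values (match=1, mismatch=-1, gap=-2) and the distance definition (one minus the fraction of identical aligned columns) are assumptions chosen for illustration; the abstract does not state which scoring scheme or distance measure the authors used, and the function names are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings.

    Scoring values are illustrative assumptions, not the paper's settings.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def alignment_distance(a, b):
    """Assumed distance: 1 minus the fraction of identical aligned columns."""
    x, y = needleman_wunsch(a, b)
    if not x:  # both sequences empty
        return 0.0
    identical = sum(1 for p, q in zip(x, y) if p == q)
    return 1.0 - identical / len(x)
```

Because each pair of sequences is aligned independently of every other pair, a full distance matrix can be computed by distributing calls to a function like `alignment_distance` across workers, which is what makes the step pleasingly parallel.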

    Groundtruthing next-gen sequencing for microbial ecology-biases and errors in community structure estimates from PCR amplicon pyrosequencing

    Analysis of microbial communities by high-throughput pyrosequencing of SSU rRNA gene PCR amplicons has transformed microbial ecology research and led to the observation that many communities contain a diverse assortment of rare taxa, a phenomenon termed the Rare Biosphere. Multiple studies have investigated the effect of pyrosequencing read quality on operational taxonomic unit (OTU) richness for contrived communities, yet there is limited information on the fidelity of community structure estimates obtained through this approach. Given that PCR biases are widely recognized, and further unknown biases may arise from the sequencing process itself, a priori assumptions about the neutrality of the data generation process are at best unvalidated. Furthermore, post-sequencing quality control algorithms have not been explicitly evaluated for the accuracy of recovered representative sequences and its impact on downstream analyses, reducing useful discussion on pyrosequencing reads to their diversity and abundances. Here we report on community structures and sequences recovered for in vitro-simulated communities consisting of twenty 16S rRNA gene clones tiered at known proportions. PCR amplicon libraries of the V3-V4 and V6 hypervariable regions from the in vitro-simulated communities were sequenced using the Roche 454 GS FLX Titanium platform. Commonly used quality control protocols resulted in the formation of OTUs with >1% abundance composed entirely of erroneous sequences, while over-aggressive clustering approaches obfuscated real, expected OTUs. The pyrosequencing process itself did not appear to impose significant biases on overall community structure estimates, although the detection limit for rare taxa may be affected by PCR amplicon size and the quality control approach employed. Meanwhile, PCR biases associated with the initial amplicon generation may impose greater distortions in the observed community structure.

    Transcending Microbial Source Tracking Techniques Across Geographic Borders: An Examination of Human and Animal Microbiomes and the Integration of Molecular Approaches in Pathogen Surveillance in Brazil and the United States

    Waterborne illnesses, attributed to the ingestion of or contact with contaminated water, present a significant global health concern. Surface water sources can be impacted by a wide array of pollution inputs, but fecal pollution generates the most significant and acute threat to human health. Therefore, the detection of fecal bacteria in surface water sources remains an important public health objective. Current surface water monitoring employs fecal indicator bacteria (FIB), including E. coli and enterococci, as proxies for pathogenic organisms carried in fecal pollution. These traditional indicators, detected by culture-based microbiological methods, do not discriminate fecal sources from one another. New molecular approaches in pathogen surveillance, such as microbial source tracking (MST) and fecal-associated signatures, are culture-independent and are better suited for both the detection and identification of fecal pollution sources. By identifying fecal pollution sources, human health risks can be more accurately assessed and remediation strategies can be effectively implemented. This paper examines a variety of MST markers and the basis for them by integrating host-source microbiome studies. Chapter 2 describes work with Catellicoccus marimammalium, where next generation sequencing demonstrates this marker is a dominant member of the gull microbiome. This work has important implications for reconciling high fecal indicator levels at beaches with health risk. Chapter 3 extends MST work to areas of poor sanitation in Jenipapo, Brazil. The distribution of human-specific indicators in surface water fecal contamination and the prevalence of the waterborne illness schistosomiasis are described. 
Lastly, Chapter 4 explores the microbial community of humans and animals across different geographic regions, Brazil and the United States, to evaluate the applicability of existing MST methods, assess host-specific organisms and fecal-associated bacterial groups, and investigate the potential to develop new and geographically appropriate markers.

    A Metagenomic Approach to Characterization of the Vaginal Microbiome Signature in Pregnancy

    While current major national research efforts (i.e., the NIH Human Microbiome Project) will enable comprehensive metagenomic characterization of the adult human microbiota, how and when these diverse microbial communities take up residence in the host and during reproductive life are unexplored at a population level. Because microbial abundance and diversity might differ in pregnancy, we sought to generate comparative metagenomic signatures across gestational age strata. DNA was isolated from the vagina (introitus, posterior fornix, midvagina) and the V5V3 region of bacterial 16S rRNA genes was sequenced (454FLX Titanium platform). Sixty-eight samples from 24 healthy gravidae (18 to 40 confirmed weeks) were compared with 301 non-pregnant controls (60 subjects). Generated sequence data were quality filtered, taxonomically binned, normalized, and organized by phylogeny and into operational taxonomic units (OTU); principal coordinates analysis (PCoA) of the resultant beta diversity measures was used for visualization and analysis in association with sample clinical metadata. Altogether, 1.4 gigabytes of data containing >2.5 million reads (averaging 6,837 sequences/sample of 493 nt in length) were generated for computational analyses. Although gravidae were not excluded by virtue of a posterior fornix pH >4.5 at the time of screening, a unique vaginal microbiome signature encompassing several specific OTUs and higher-level clades was nevertheless observed and confirmed using a combination of phylogenetic, non-phylogenetic, supervised, and unsupervised approaches. Both overall diversity and richness were reduced in pregnancy, with dominance of Lactobacillus species (L. iners, crispatus, jensenii and johnsonii) and the orders Lactobacillales (and the Lactobacillaceae family), Clostridiales, Bacteroidales, and Actinomycetales. 
This intergroup comparison using rigorous standardized sampling protocols and analytical methodologies provides robust initial evidence that the vaginal microbial 16S rRNA gene catalogue uniquely differs in pregnancy, with variance of taxa across vaginal subsite and gestational age.

    Genomic Signal Processing Techniques for Taxonomy Prediction

    To analyze complex biodiversity in microbial communities, 16S rRNA marker gene sequences are often assigned to operational taxonomic units (OTUs). The abundance of methods used to assign 16S rRNA marker gene sequences to OTUs has prompted debate over which one is better. Some suggest that clustering methods should be stable, meaning that the generated OTU assignments do not change as additional sequences are added to the dataset, whereas other researchers contend that it is more important for a method to properly represent the distances between sequences. We add one more de novo clustering algorithm, Rolling Snowball, to the existing ones, which include single linkage, complete linkage, average linkage, abundance-based greedy clustering, distance-based greedy clustering, and Swarm, as well as the open- and closed-reference methods. We use the GreenGenes, RDP, and SILVA 16S rRNA gene databases to show the success of the method. The highest accuracy is obtained with the SILVA library.

    Metagenomics : tools and insights for analyzing next-generation sequencing data derived from biodiversity studies

    Advances in next-generation sequencing (NGS) have allowed significant breakthroughs in microbial ecology studies. This has led to the rapid expansion of research in the field and the establishment of “metagenomics”, often defined as the analysis of DNA from microbial communities in environmental samples without prior need for culturing. Many metagenomics statistical/computational tools and databases have been developed in order to allow the exploitation of the huge influx of data. In this review article, we provide an overview of the sequencing technologies and how they are uniquely suited to various types of metagenomic studies. We focus on the currently available bioinformatics techniques, tools, and methodologies for performing each individual step of a typical metagenomic dataset analysis. We also outline future trends in the field with respect to tools and technologies currently under development. Moreover, we discuss data management, distribution, and integration tools that are capable of performing comparative metagenomic analyses of multiple datasets using well-established databases, as well as commonly used annotation standards.

    Species-level classification of the vaginal microbiome


    Community and genomic analysis of the human small intestine microbiota

    Our intestinal tract is densely populated by different microbes, collectively called microbiota, of which the majority are bacteria. Research focusing on the intestinal microbiota often uses fecal samples as representative of the bacteria that inhabit the end of the large intestine. These studies revealed that the intestinal bacteria contribute to our health, which has stimulated interest in understanding their dynamics and activities. However, bacterial communities in fecal samples differ from microbial communities at other locations in the intestinal tract, such as the small intestine. Although the small intestine is the first region where our food and intestinal microbiota meet, we know little about the bacteria in the small intestine and how they influence our overall well-being. This is mainly attributable to difficulties in obtaining samples, since the small intestine is located between the stomach and the large intestine. Therefore, the work in this thesis aimed to provide a better understanding of the composition and dynamics of the human small intestinal microbiota and to provide insight into the metabolic potential as well as immunomodulatory properties of some of its typical commensal inhabitants. Small intestinal samples used in the work described in this thesis were collected from ileostomy subjects, individuals who had their large intestine surgically removed and the end of the small intestine connected to an abdominal stoma, providing access to luminal content of the small intestine. Considering the importance of molecular techniques in contemporary ecological surveys of microbial communities, first of all, two technologies, barcoded pyrosequencing and phylogenetic microarray analysis, were compared in terms of their capacity to determine the bacterial composition in fecal and small intestinal samples from human individuals. 
As PCR remains a crucial step in sample preparation for both techniques, the use of different primer pairs in the amplification step was assessed in terms of its impact on the outcome of microbial profiling. The analyses revealed that the different primer pairs and the two profiling technologies provide overall similar results for samples of fecal and terminal ileum origin. In contrast, the microbial profiles obtained for small intestinal samples by barcoded pyrosequencing and phylogenetic microarray analyses differed considerably. This is most likely attributable to the constraints that are intrinsic to the use of the microarray to enable the detection of predefined microbiota members only, which is due to the probe design that is largely based on large intestinal microbiota communities. However, the pyrosequencing technology also allows for identification of bacteria that were not known in advance to inhabit our intestinal tract. The pyrosequencing technology was used as the method of choice to study the total and active small intestinal communities in ileostoma effluent samples from four different subjects through sequencing the 16S ribosomal RNA gene (rDNA) and ribosomal RNA (rRNA) content, combined with metatranscriptome analysis by Illumina sequencing of cDNA derived from enriched mRNA of the same sample set to investigate the activities of the small intestinal bacteria. The composition of the small intestinal bacterial communities as assessed from rDNA, rRNA, and mRNA patterns appeared to be similar, indicating that the dominant bacteria in the small intestine are also highly active in this ecosystem. Streptococcus spp. were among the bacterial species that were detected in each ileostoma effluent sample, although their abundance varied greatly between samples from the same subject as well as samples from different subjects. Veillonella spp. 
frequently co-occurred with Streptococcus spp., indicating that the Streptococcus and Veillonella populations play a prominent role in the human small intestine ecosystem, and their co-occurrence suggests a metabolic relation between these genera. Therefore, cultivation and molecular typing methodologies were employed to zoom in on these groups, which revealed that the richness of the small intestinal streptococci strongly exceeded the diversity that could be estimated on the basis of 16S rRNA analyses, and could be extended to the genomic lineage level (anticipated to resemble strain-level). From ileostoma samples, 3 different Streptococcus species were recovered belonging to the S. mitis group, S. bovis group, and S. salivarius group, which could be further divided into 7 genomic lineages. Notably, the Streptococcus lineages that were isolated displayed distinct carbohydrate utilization capacities, which may imply that their growth and relative community composition may respond quite strongly to differences in the dietary intake of simple carbohydrates over time. This notion is in good agreement with the observation that the Streptococcus lineage populations fluctuated in time, with only one Streptococcus lineage being cultivated from both ileostoma samples collected in a one-year time frame. Conversely, the cultivated Veillonella isolates from samples during that same time interval consistently encompassed a single lineage. Furthermore, this Veillonella lineage could be isolated from both the oral cavity and the ileostoma effluent. Analogously, three Streptococcus lineages that belong to a single phylotype also appeared to be present in bacterial communities from the oral cavity as well as the small intestine. These observations suggest that the representatives of the Veillonella and Streptococcus genera that are encountered in the oral and small intestinal microbial ecosystems are closely related and indicate that the oral microbiota may serve as an inoculum for the upper GI tract. 
The metabolic capacity of 6 small intestinal Streptococcus lineages that were obtained from a single ileostoma effluent sample was further investigated through the determination of genomic sequences of these lineages. The small-intestinal Streptococcus genomes were found to encode different carbohydrate transporters and the necessary enzymes to metabolize different sugars, which was in excellent agreement with the carbohydrates that could be used by representative strains of the Streptococcus lineages. To further our understanding of how the different streptococci, as representatives of the dominant small intestinal bacterial populations, may influence our immune system, human dendritic cells were stimulated with strains of the different Streptococcus lineages to study their immunomodulatory properties. The Streptococcus lineages differed significantly in their capacity to modulate cytokine responses of blood-monocyte derived immature dendritic cells. As Streptococcus and Veillonella frequently co-occur in the small intestinal ecosystem, pair-wise combinations of strains of these two species were also tested for their combined immunomodulatory properties. This resulted in cytokine responses considerably different from those that could be predicted from the stimulations with either Streptococcus or Veillonella, indicating that it is not trivial to predict gut mucosal associated immune responses and that the composition of the intestinal microbiota as a whole may have a distinct influence on an individual’s immune status. In conclusion, the work described in this thesis provides an expansion to the accumulating knowledge on the human intestine microbiota. Whereas most studies focus on the microbiota present in the distal regions of the intestinal tract, this study targeted the microbiota of the poorly studied proximal regions of the intestine and also addressed its capacity to interact with the local mucosal tissue. 
The data presented here can be exploited to guide the design of future studies that aim to elucidate the interplay between diet, microbiota and the mucosal tissues in the human small intestinal tract.