12 research outputs found

    MTR: taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks

    Motivation: Metagenomics is a recent field of biology that studies microbial communities by analyzing their genomic content directly sequenced from the environment. A metagenomic dataset consists of many short DNA or RNA fragments called reads. One interesting problem in metagenomic data analysis is the discovery of the taxonomic composition of a given dataset. A simple method for this task, called the Lowest Common Ancestor (LCA), is employed in state-of-the-art computational tools for metagenomic data analysis of very short reads (about 100 bp). However, LCA has two main drawbacks: it possibly assigns many reads to high taxonomic ranks and it discards a high number of reads
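    The LCA idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the MTR method and not the code of any particular tool; the toy taxonomy `PARENT` and the function names are hypothetical, chosen only to show how a read whose hits span distant taxa ends up at a high rank, and how a read with no hits is discarded.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); the root has no parent.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Firmicutes": "Bacteria",
    "Bacteroidetes": "Bacteria",
    "Clostridia": "Firmicutes",
    "Bacilli": "Firmicutes",
}


def lineage(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the root down to `taxon` (root first)."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lowest_common_ancestor(
    hit_taxa: List[str], parent: Dict[str, Optional[str]]
) -> Optional[str]:
    """Assign a read to the deepest taxon shared by all of its hits.

    Returns None when the read has no hits, i.e. the read is discarded.
    """
    if not hit_taxa:
        return None
    lineages = [lineage(t, parent) for t in hit_taxa]
    lca: Optional[str] = None
    # Walk all lineages in parallel from the root until they diverge.
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            lca = ranks[0]
        else:
            break
    return lca


if __name__ == "__main__":
    # Hits in two sibling classes collapse to their shared phylum.
    print(lowest_common_ancestor(["Clostridia", "Bacilli"], PARENT))
    # Hits in different phyla collapse to a much higher rank.
    print(lowest_common_ancestor(["Clostridia", "Bacteroidetes"], PARENT))
```

    In this sketch the second call shows the first drawback named in the abstract (assignment to a high rank), and an empty hit list shows the second (the read is dropped).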

    Comparative fecal metagenomics unveils unique functional capacity of the swine gut

    Background: Uncovering the taxonomic composition and functional capacity within the swine gut microbial consortia is of great importance to animal physiology and health as well as to food and water safety due to the presence of human pathogens in pig feces. Nonetheless, limited information on the functional diversity of the swine gut microbiome is available. Results: Analysis of 637,722 pyrosequencing reads (130 megabases) generated from Yorkshire pig fecal DNA extracts was performed to help better understand the microbial diversity and largely unknown functional capacity of the swine gut microbiome. Swine fecal metagenomic sequences were annotated using both MG-RAST and JGI IMG/M-ER pipelines. Taxonomic analysis of metagenomic reads indicated that swine fecal microbiomes were dominated by Firmicutes and Bacteroidetes phyla. At a finer phylogenetic resolution, Prevotella spp. dominated the swine fecal metagenome, while some genes associated with Treponema and Anaerovibrio species were found exclusively within the pig fecal metagenomic sequences analyzed. Functional analysis revealed that carbohydrate metabolism was the most abundant SEED subsystem, representing 13% of the swine metagenome. Genes associated with stress, virulence, cell wall and cell capsule were also abundant. Virulence factors associated with antibiotic resistance genes with highest sequence homology to genes in Bacteroidetes, Clostridia, and Methanosarcina were numerous within the gene families unique to the swine fecal metagenomes. Other abundant proteins unique to the distal swine gut shared high sequence homology to putative carbohydrate membrane transporters. Conclusions: The results from this metagenomic survey demonstrated the presence of genes associated with resistance to antibiotics and carbohydrate metabolism, suggesting that the swine gut microbiome may be shaped by husbandry practices.

    Bioinformatics for the human microbiome project

    Microbes inhabit virtually all sites of the human body, yet we know very little about the role they play in our health. In recent years, there has been increasing interest in studying human-associated microbial communities, particularly since microbial dysbioses have now been implicated in a number of human diseases [1]–[3]. Dysbiosis, the disruption of the normal microbial community structure, however, is impossible to define without first establishing what “normal microbial community structure” means within the healthy human microbiome. Recent advances in sequencing technologies have made it feasible to perform large-scale studies of microbial communities, providing the tools necessary to begin to address this question [4], [5]. This led to the implementation of the Human Microbiome Project (HMP) in 2007, an initiative funded by the National Institutes of Health Roadmap for Biomedical Research and constructed as a large, genome-scale community research project [6]. Any such project must plan for data analysis, computational methods development, and the public availability of tools and data; here, we provide an overview of the corresponding bioinformatics organization, history, and results from the HMP (Figure 1). Funding: National Institutes of Health (U.S.) (NIH U54HG004969); National Institutes of Health (U.S.) (grant R01HG004885); National Institutes of Health (U.S.) (grant R01HG005975); National Institutes of Health (U.S.) (grant R01HG005969)

    Bioprospecting metagenomes: glycosyl hydrolases for converting biomass

    Throughout immeasurable time, microorganisms evolved and accumulated remarkable physiological and functional heterogeneity, and now constitute the major reserve for genetic diversity on earth. Using metagenomics, namely genetic material recovered directly from environmental samples, this biogenetic diversification can be accessed without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass, such as lignocellulosic plant cell walls, have become a major resource for bioprospecting; furthermore, this material is a major asset in the search for new biocatalytics (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, despite the contributions from metagenomics technologies consequent upon the discovery of novel enzymes, this relatively new enterprise requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies

    Metabolic Reconstruction for Metagenomic Data and Its Application to the Human Microbiome

    Microbial communities carry out the majority of the biochemical activity on the planet, and they play integral roles in processes including metabolism and immune homeostasis in the human microbiome. Shotgun sequencing of such communities' metagenomes provides information complementary to organismal abundances from taxonomic markers, but the resulting data typically comprise short reads from hundreds of different organisms and are at best challenging to assemble comparably to single-organism genomes. Here, we describe an alternative approach to infer the functional and metabolic potential of a microbial community metagenome. We determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads. We validated this methodology using a collection of synthetic metagenomes, recovering the presence and abundance both of large pathways and of small functional modules with high accuracy. We subsequently applied this method, HUMAnN, to the microbial communities of 649 metagenomes drawn from seven primary body sites on 102 individuals as part of the Human Microbiome Project (HMP). This provided a means to compare functional diversity and organismal ecology in the human microbiome, and we determined a core of 24 ubiquitously present modules. Core pathways were often implemented by different enzyme families within different body sites, and 168 functional modules and 196 metabolic pathways varied in metagenomic abundance specifically to one or more niches within the microbiome. These included glycosaminoglycan degradation in the gut, as well as phosphate and amino acid transport linked to host phenotype (vaginal pH) in the posterior fornix. An implementation of our methodology is available at http://huttenhower.sph.harvard.edu/humann. This provides a means to accurately and efficiently characterize microbial metabolic pathways and functional modules directly from high-throughput sequencing reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies. Funding: National Institutes of Health (U.S.) (U54HG004968)
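    The general idea of deriving gene-family relative abundances and module presence directly from reads can be sketched as follows. This is a deliberately simplified illustration and not HUMAnN's actual algorithm or interface; the function names, the family labels such as `famA`, and the coverage rule (fraction of a module's families observed at least once) are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List


def relative_abundance(read_hits: Iterable[str]) -> Dict[str, float]:
    """Turn per-read gene-family assignments into relative abundances.

    `read_hits` holds one gene-family label per mapped read
    (hypothetical labels; a real pipeline would obtain them by
    searching reads against a reference database).
    """
    counts = Counter(read_hits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: n / total for family, n in counts.items()}


def module_coverage(module_families: List[str],
                    abundances: Dict[str, float]) -> float:
    """Fraction of a module's gene families observed in the sample.

    Assumed rule for illustration only: a family counts as present
    if its relative abundance is greater than zero.
    """
    if not module_families:
        return 0.0
    present = sum(1 for f in module_families if abundances.get(f, 0.0) > 0.0)
    return present / len(module_families)


if __name__ == "__main__":
    hits = ["famA", "famA", "famB", "famC"]
    abundances = relative_abundance(hits)
    print(abundances)
    # Hypothetical module made of three families, one of them unobserved.
    print(module_coverage(["famA", "famB", "famD"], abundances))
```

    A threshold on the coverage value could then be used to call a module present or absent; the choice of threshold is left open here because the abstract does not state one.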

    Metagenomic Systems Biology of the Human Microbiome

    Development and quantitative analyses of a universal rRNA-subtraction protocol for microbial metatranscriptomics

    Metatranscriptomes generated by pyrosequencing hold significant potential for describing functional processes in complex microbial communities. Meeting this potential requires protocols that maximize mRNA recovery by reducing the relative abundance of ribosomal RNA, as well as systematic comparisons to identify methodological artifacts and test for reproducibility across data sets. Here, we implement a protocol for subtractive hybridization of bacterial rRNA (16S and 23S) that uses sample-specific probes and is applicable across diverse environmental samples. To test this method, rRNA-subtracted and unsubtracted transcriptomes were sequenced (454 FLX technology) from bacterioplankton communities at two depths in the oligotrophic open ocean, yielding 10 data sets representing ~350 Mbp. Subtractive hybridization reduced bacterial rRNA transcript abundance by 40–58%, increasing recovery of non-rRNA sequences up to fourfold (from 12–20% of total sequences to 40–49%). In testing this method, we established criteria for detecting sequences replicated artificially via pyrosequencing errors and identified such replicates as a significant component (6–39%) of total pyrosequencing reads. Following replicate removal, statistical comparisons of reference genes (identified via BLASTX to NCBI-nr) between technical replicates and between rRNA-subtracted and unsubtracted samples showed low levels of differential transcript abundance (<0.2% of reference genes). However, gene overlap between data sets was remarkably low, with no two data sets (including duplicate runs from the same pyrosequencing library template) sharing greater than 17% of unique reference genes. These results indicate that pyrosequencing captures a small subset of total mRNA diversity and underscore the importance of reliable rRNA subtraction procedures to enhance sequencing coverage across the functional transcript pool. Funding: Agouron Institute; Gordon and Betty Moore Foundation; United States. Dept. of Energy. Office of Science; National Science Foundation (U.S.) (NSF Science and Technology Center Award EF0424599)
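    The replicate-removal step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the authors' criteria, so the rule used here (reads with an identical sequence are treated as replicates of the first occurrence) is an assumption made purely for illustration, and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def replicate_fraction(reads: List[str]) -> float:
    """Fraction of reads that are extra copies of an earlier read.

    Assumed criterion: exact sequence identity. The published
    criteria may differ and are not given in the abstract.
    """
    if not reads:
        return 0.0
    counts = Counter(reads)
    extra_copies = sum(n - 1 for n in counts.values())
    return extra_copies / len(reads)


def remove_replicates(reads: List[str]) -> List[str]:
    """Keep the first occurrence of each sequence, preserving order."""
    seen = set()
    unique: List[str] = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


if __name__ == "__main__":
    toy_reads = ["ACGT", "ACGT", "TTGA", "ACGT", "GGCC"]
    print(replicate_fraction(toy_reads))
    print(remove_replicates(toy_reads))
```

    Downstream comparisons of transcript abundance would then be run on the de-replicated list rather than on the raw reads.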

    DOE Joint Genome Institute 2008 Progress Report

    A metagenomic analysis of the epiphytic bacterial community from the green macroalga Ulva australis

    In the marine environment, the surfaces of macroalgae are colonised by complex microbial communities, which are known to interact with their hosts in a variety of ways. Despite the importance of macroalgae to coastal ecosystems, comprehensive assessments of algal associated bacterial communities are rare. This thesis describes a metagenomic analysis of the epiphytic microbial community of the green macroalga, Ulva australis. A DNA extraction method was developed, which was selective for and representative of the bacterial community from the algal surface. Samples of U. australis, and for comparison, seawater, were collected with spatial and temporal replication, and analysed by the creation of large 16S rRNA gene clone libraries, metagenomic sequencing, and the creation and functional screening of large insert fosmid libraries. 16S rRNA gene analysis revealed that the U. australis bacterial community was almost completely distinct from the planktonic seawater community, but also highly variable between algal samples, at the level of species. The analysis of metagenomic sequencing data revealed that the seawater and algal communities were also functionally distinct. In addition, despite the high level of species variability, a core set of functions was identified which was consistently detected in U. australis samples, and is indicative of a host and surface associated lifestyle. The observations of taxonomic distinctness from seawater, species level variability, and the presence of a consistent set of functional genes relevant to the algal surface environment, have been framed in terms of a competitive lottery model for colonisation of the U. australis surface. The remainder of this thesis describes the construction and functional screening of large insert (40kb) fosmid metagenomic libraries, of the algal associated community, for both antibacterial activity, and the induction of LuxR/I type quorum sensing systems. Two antibacterial and four quorum sensing inducing clones were detected, sequenced and partially characterised, using subcloning, transposon mutagenesis, and chemical extraction and analysis. Metagenomic analysis has provided an overview of this complex host associated microbial community, in terms of both community membership and function