53 research outputs found

    Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads

    Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or variants thereof, and often yield biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute Genome Relative Abundance of microbial communities using the Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based) even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.
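
    The abstract above names three ingredients: ambiguous read assignments, a mixture model fitted by maximum likelihood, and a correction for genome size. The following is a minimal illustrative sketch of that general idea, not the GRAMMy implementation. It assumes a hypothetical per-read probability matrix `read_probs` (one row per read, one column per reference genome), fits the mixing proportions with a standard EM loop, and then divides each proportion by genome length before renormalizing. The function name, its parameters and the toy input are all assumptions made for illustration.

```python
def estimate_abundance(read_probs, genome_lengths, n_iter=200, tol=1e-10):
    """Illustrative EM for a read-assignment mixture model (not GRAMMy itself).

    read_probs[i][j]  -- assumed probability of read i given genome j
                         (0.0 when the read does not hit genome j).
    genome_lengths[j] -- length of reference genome j.
    Returns (mixing proportions of reads, length-corrected relative abundances).
    """
    n_genomes = len(genome_lengths)
    pi = [1.0 / n_genomes] * n_genomes  # start from uniform mixing proportions

    for _ in range(n_iter):
        # E-step: split each read fractionally among the genomes it matches.
        counts = [0.0] * n_genomes
        for probs in read_probs:
            weights = [p * q for p, q in zip(pi, probs)]
            total = sum(weights)
            if total == 0.0:
                continue  # read matches no reference genome; ignore it
            for j, w in enumerate(weights):
                counts[j] += w / total

        assigned = sum(counts)
        if assigned == 0.0:
            raise ValueError("no read could be assigned to any genome")

        # M-step: new mixing proportions are the normalized expected counts.
        new_pi = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(pi, new_pi))
        pi = new_pi
        if delta < tol:
            break

    # Genome-size correction: longer genomes yield more reads per cell,
    # so divide by length and renormalize.
    per_length = [p / length for p, length in zip(pi, genome_lengths)]
    norm = sum(per_length)
    return pi, [x / norm for x in per_length]


# Toy input: three reads unique to genome 0, one read unique to genome 1,
# with genome 0 three times as long as genome 1.
reads = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
pi, abundance = estimate_abundance(reads, [3.0, 1.0])
print(pi)         # [0.75, 0.25]
print(abundance)  # [0.5, 0.5]
```

    In the toy input, genome 0 receives three quarters of the reads only because it is three times longer, so the length-corrected abundances come out equal. This is the kind of size bias the abstract says the framework models explicitly.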

    The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing.

    Microbial ecology is plagued by problems of an abstract nature. Cell sizes are so small and population sizes so large that both are virtually incomprehensible. Niches are so far from our everyday experience as to make their very definition elusive. Organisms that may be abundant and critical to our survival are little understood, seldom described and/or cultured, and sometimes yet to be even seen. One way to confront these problems is to use data of an even more abstract nature: molecular sequence data. Massive environmental nucleic acid sequencing, such as metagenomics or metatranscriptomics, promises functional analysis of microbial communities as a whole, without prior knowledge of which organisms are in the environment or exactly how they are interacting. But sequence-based ecological studies nearly always use a comparative approach, and that requires relevant reference sequences, which are an extremely limited resource when it comes to microbial eukaryotes. In practice, this means sequence databases need to be populated with enormous quantities of data for which we have some certainties about the source. Most important is the taxonomic identity of the organism from which a sequence is derived and as much functional identification of the encoded proteins as possible. In an ideal world, such information would be available as a large set of complete, well curated, and annotated genomes for all the major organisms from the environment in question. Reality substantially diverges from this ideal, but at least for bacterial molecular ecology, there is a database consisting of thousands of complete genomes from a wide range of taxa, supplemented by a phylogeny-driven approach to diversifying genomics [2]. For eukaryotes, the number of available genomes is far, far fewer, and we have relied much more heavily on random growth of sequence databases, raising the question as to whether this is fit for purpose

    The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing

    Current sampling of genomic sequence data from eukaryotes is relatively poor, biased, and inadequate to address important questions about their biology, evolution, and ecology; this Community Page describes a resource of 700 transcriptomes from marine microbial eukaryotes to help understand their role in the world's oceans.

    Phylogeography of Ostreopsis along West Pacific Coast, with Special Reference to a Novel Clade from Japan

    BACKGROUND: The dinoflagellate genus Ostreopsis is known as a potential producer of palytoxin derivatives. Palytoxin is the most potent non-proteinaceous compound reported so far. There has been a growing number of reports on palytoxin-like poisonings in southern areas of Japan; however, the distribution of Ostreopsis has not been investigated so far. Morphological plasticity of Ostreopsis makes reliable microscopic identification difficult, so the employment of molecular tools was desirable. METHODS/PRINCIPAL FINDINGS: In total 223 clones were examined from samples mainly collected from southern areas of Japan. The D8-D10 region of the nuclear large subunit rDNA (D8-D10) was selected as a genetic marker and phylogenetic analyses were conducted. Although most of the clones could not be identified, potentially 8 putative species were established during this study. Among them, Ostreopsis sp. 1-5 did not belong to any known clade, and each of them formed its own clade. The dominant species was Ostreopsis sp. 1, which accounted for more than half of the clones and which was highly toxic and only distributed along the Japanese coast. Comparisons between the D8-D10 and the Internal Transcribed Spacer (ITS) region of the nuclear rDNA, which has widely been used for phylogenetic/phylogeographic studies in Ostreopsis, revealed that the D8-D10 was less variable than the ITS, making consistent and reliable phylogenetic reconstruction possible. CONCLUSIONS/SIGNIFICANCE: This study unveiled a surprisingly diverse and widespread distribution of Japanese Ostreopsis. Further study will be required to better understand the phylogeography of the genus. Our results highlight the urgent need to develop early detection/warning systems for Ostreopsis, particularly for the widely distributed and strongly toxic Ostreopsis sp. 1. The D8-D10 marker will be suitable for these purposes.

    The biogeographic differentiation of algal microbiomes in the upper ocean from pole to pole

    Eukaryotic phytoplankton are responsible for at least 20% of annual global carbon fixation. Their diversity and activity are shaped by interactions with prokaryotes as part of complex microbiomes. Although differences in their local species diversity have been estimated, we still have a limited understanding of environmental conditions responsible for compositional differences between local species communities on a large scale from pole to pole. Here, we show, based on pole-to-pole phytoplankton metatranscriptomes and microbial rDNA sequencing, that environmental differences between polar and non-polar upper oceans most strongly impact the large-scale spatial pattern of biodiversity and gene activity in algal microbiomes. The geographic differentiation of co-occurring microbes in algal microbiomes can be well explained by the latitudinal temperature gradient and associated break points in their beta diversity, with an average breakpoint at 14 °C ± 4.3, separating cold and warm upper oceans. As global warming impacts upper ocean temperatures, we project that break points of beta diversity move markedly pole-wards. Hence, abrupt regime shifts in algal microbiomes could be caused by anthropogenic climate change