Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads
Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or variants thereof, and often result in biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based), even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.
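The mixture-model idea in this abstract can be illustrated with a small expectation-maximization (EM) sketch. This is not the GRAMMy implementation: the function names, the 0/1 hit matrix, and the assumption that the probability of a read given a genome is proportional to hit / genome length (reads spread uniformly along each genome) are all hypothetical choices made for illustration. The sketch estimates the share of reads attributable to each genome, then divides by genome length to correct for genome size bias. A second helper applies the "at least 0.5% abundant in at least 50% of the datasets" criterion quoted above.

```python
import numpy as np


def estimate_abundance(hits, genome_lengths, n_iter=500, tol=1e-12):
    """Illustrative EM for a read-assignment mixture model (assumed setup).

    hits[i, j] = 1 if read i was assigned to genome j (ambiguous reads have
    several 1s in a row). Returns (read_share, relative_abundance).
    """
    hits = np.asarray(hits, dtype=float)
    lengths = np.asarray(genome_lengths, dtype=float)
    hits = hits[hits.sum(axis=1) > 0]      # drop reads with no assignment
    lik = hits / lengths                   # assumed p(read | genome)
    n_genomes = lik.shape[1]
    pi = np.full(n_genomes, 1.0 / n_genomes)  # mixing proportions (read shares)
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from each genome
        weighted = lik * pi
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new read share is the mean responsibility per genome
        new_pi = resp.mean(axis=0)
        converged = np.abs(new_pi - pi).max() < tol
        pi = new_pi
        if converged:
            break
    # Genome size correction: longer genomes yield more reads per cell
    abundance = pi / lengths
    return pi, abundance / abundance.sum()


def frequent_species(abundances, min_abundance=0.005, min_fraction=0.5):
    """Boolean mask of species that reach min_abundance in at least
    min_fraction of the samples (rows = samples, columns = species)."""
    abundances = np.asarray(abundances, dtype=float)
    return (abundances >= min_abundance).mean(axis=0) >= min_fraction


if __name__ == "__main__":
    # Toy data: 10 reads unique to a 1 kb genome, 20 unique to a 2 kb genome
    toy_hits = np.array([[1, 0]] * 10 + [[0, 1]] * 20)
    share, abundance = estimate_abundance(toy_hits, [1000, 2000])
    print("read share:", share)
    print("relative abundance:", abundance)
```

In the toy data the longer genome receives twice as many reads, yet after the length correction both genomes come out equally abundant, which is the kind of genome size bias the abstract refers to.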
The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing.
Microbial ecology is plagued by problems of an abstract nature. Cell sizes are so small and population sizes so large that both are virtually incomprehensible. Niches are so far from our everyday experience as to make their very definition elusive. Organisms that may be abundant and critical to our survival are little understood, seldom described and/or cultured, and sometimes yet to be even seen. One way to confront these problems is to use data of an even more abstract nature: molecular sequence data. Massive environmental nucleic acid sequencing, such as metagenomics or metatranscriptomics, promises functional analysis of microbial communities as a whole, without prior knowledge of which organisms are in the environment or exactly how they are interacting. But sequence-based ecological studies nearly always use a comparative approach, and that requires relevant reference sequences, which are an extremely limited resource when it comes to microbial eukaryotes.
In practice, this means sequence databases need to be populated with enormous quantities of data for which we have some certainties about the source. Most important is the taxonomic identity of the organism from which a sequence is derived and as much functional identification of the encoded proteins as possible. In an ideal world, such information would be available as a large set of complete, well-curated, and annotated genomes for all the major organisms from the environment in question. Reality substantially diverges from this ideal, but at least for bacterial molecular ecology, there is a database consisting of thousands of complete genomes from a wide range of taxa, supplemented by a phylogeny-driven approach to diversifying genomics [2]. For eukaryotes, the number of available genomes is far, far fewer, and we have relied much more heavily on random growth of sequence databases, raising the question as to whether this is fit for purpose.
The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing
Current sampling of genomic sequence data from eukaryotes is relatively poor, biased, and inadequate to address important questions about their biology, evolution, and ecology; this Community Page describes a resource of 700 transcriptomes from marine microbial eukaryotes to help understand their role in the world's oceans.
Phylogeography of Ostreopsis along West Pacific Coast, with Special Reference to a Novel Clade from Japan
BACKGROUND: The dinoflagellate genus Ostreopsis is known as a potential producer of palytoxin derivatives. Palytoxin is the most potent non-proteinaceous compound reported so far. There has been a growing number of reports of palytoxin-like poisonings in southern areas of Japan; however, the distribution of Ostreopsis has not been investigated so far. The morphological plasticity of Ostreopsis makes reliable microscopic identification difficult, so the use of molecular tools was desirable. METHODS/PRINCIPAL FINDING: In total, 223 clones were examined from samples mainly collected from southern areas of Japan. The D8-D10 region of the nuclear large subunit rDNA (D8-D10) was selected as a genetic marker and phylogenetic analyses were conducted. Although most of the clones could not be identified, eight putative species were potentially established during this study. Among them, Ostreopsis sp. 1-5 did not belong to any known clade, and each of them formed its own clade. The dominant species was Ostreopsis sp. 1, which accounted for more than half of the clones, was highly toxic, and was distributed only along the Japanese coast. Comparisons between the D8-D10 and the Internal Transcribed Spacer (ITS) region of the nuclear rDNA, which has been widely used for phylogenetic/phylogeographic studies in Ostreopsis, revealed that the D8-D10 was less variable than the ITS, making consistent and reliable phylogenetic reconstruction possible. CONCLUSIONS/SIGNIFICANCE: This study unveiled a surprisingly diverse and widespread distribution of Japanese Ostreopsis. Further study will be required to better understand the phylogeography of the genus. Our results point to an urgent need for the development of early detection/warning systems for Ostreopsis, particularly for the widely distributed and strongly toxic Ostreopsis sp. 1. The D8-D10 marker will be suitable for these purposes.
The biogeographic differentiation of algal microbiomes in the upper ocean from pole to pole
Eukaryotic phytoplankton are responsible for at least 20% of annual global carbon fixation. Their diversity and activity are shaped by interactions with prokaryotes as part of complex microbiomes. Although differences in their local species diversity have been estimated, we still have a limited understanding of the environmental conditions responsible for compositional differences between local species communities on a large scale from pole to pole. Here, we show, based on pole-to-pole phytoplankton metatranscriptomes and microbial rDNA sequencing, that environmental differences between polar and non-polar upper oceans most strongly impact the large-scale spatial pattern of biodiversity and gene activity in algal microbiomes. The geographic differentiation of co-occurring microbes in algal microbiomes can be well explained by the latitudinal temperature gradient and associated break points in their beta diversity, with an average break point at 14 ± 4.3 °C, separating cold and warm upper oceans. As global warming impacts upper ocean temperatures, we project that break points of beta diversity will move markedly pole-wards. Hence, abrupt regime shifts in algal microbiomes could be caused by anthropogenic climate change.