Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads
Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or their variants, and often result in biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute Genome Relative Abundance of microbial communities using the Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based) even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes
The Ectocarpus genome and the independent evolution of multicellularity in brown algae
Brown algae (Phaeophyceae) are complex photosynthetic organisms with a very different evolutionary history to green plants, to which they are only distantly related [1]. These seaweeds are the dominant species in rocky coastal ecosystems and they exhibit many interesting adaptations to these, often harsh, environments. Brown algae are also one of only a small number of eukaryotic lineages that have evolved complex multicellularity (Fig. 1). We report the 214 million base pair (Mbp) genome sequence of the filamentous seaweed Ectocarpus siliculosus (Dillwyn) Lyngbye, a model organism for brown algae, closely related to the kelps (Fig. 1). Genome features such as the presence of an extended set of light-harvesting and pigment biosynthesis genes and new metabolic processes such as halide metabolism help explain the ability of this organism to cope with the highly variable tidal environment. The evolution of multicellularity in this lineage is correlated with the presence of a rich array of signal transduction genes. Of particular interest is the presence of a family of receptor kinases, as the independent evolution of related molecules has been linked with the emergence of multicellularity in both the animal and green plant lineages. The Ectocarpus genome sequence represents an important step towards developing this organism as a model species, providing the possibility to combine genomic and genetic [2] approaches to explore these and other aspects of brown algal biology further
The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing.
Microbial ecology is plagued by problems of an abstract nature. Cell sizes are so small and population sizes so large that both are virtually incomprehensible. Niches are so far from our everyday experience as to make their very definition elusive. Organisms that may be abundant and critical to our survival are little understood, seldom described and/or cultured, and sometimes yet to be even seen. One way to confront these problems is to use data of an even more abstract nature: molecular sequence data. Massive environmental nucleic acid sequencing, such as metagenomics or metatranscriptomics, promises functional analysis of microbial communities as a whole, without prior knowledge of which organisms are in the environment or exactly how they are interacting. But sequence-based ecological studies nearly always use a comparative approach, and that requires relevant reference sequences, which are an extremely limited resource when it comes to microbial eukaryotes.
In practice, this means sequence databases need to be populated with enormous quantities of data for which we have some certainties about the source. Most important is the taxonomic identity of the organism from which a sequence is derived and as much functional identification of the encoded proteins as possible. In an ideal world, such information would be available as a large set of complete, well curated, and annotated genomes for all the major organisms from the environment in question. Reality substantially diverges from this ideal, but at least for bacterial molecular ecology, there is a database consisting of thousands of complete genomes from a wide range of taxa, supplemented by a phylogeny-driven approach to diversifying genomics [2]. For eukaryotes, the number of available genomes is far, far fewer, and we have relied much more heavily on random growth of sequence databases, raising the question as to whether this is fit for purpose
The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing
Current sampling of genomic sequence data from eukaryotes is relatively poor, biased, and inadequate to address important questions about their biology, evolution, and ecology; this Community Page describes a resource of 700 transcriptomes from marine microbial eukaryotes to help understand their role in the world's oceans
Phylogeography of Ostreopsis along West Pacific Coast, with Special Reference to a Novel Clade from Japan
BACKGROUND: The dinoflagellate genus Ostreopsis is known as a potential producer of palytoxin derivatives. Palytoxin is the most potent non-proteinaceous compound reported so far. There has been a growing number of reports on palytoxin-like poisonings in southern areas of Japan; however, the distribution of Ostreopsis has not been investigated so far. Morphological plasticity of Ostreopsis makes reliable microscopic identification difficult, so the employment of molecular tools was desirable. METHODS/PRINCIPAL FINDINGS: In total, 223 clones were examined from samples mainly collected from southern areas of Japan. The D8-D10 region of the nuclear large subunit rDNA (D8-D10) was selected as a genetic marker and phylogenetic analyses were conducted. Although most of the clones could not be identified, potentially 8 putative species were established during this study. Among them, Ostreopsis sp. 1-5 did not belong to any known clade, and each of them formed its own clade. The dominant species was Ostreopsis sp. 1, which accounted for more than half of the clones and which was highly toxic and only distributed along the Japanese coast. Comparisons between the D8-D10 and the Internal Transcribed Spacer (ITS) region of the nuclear rDNA, which has widely been used for phylogenetic/phylogeographic studies in Ostreopsis, revealed that the D8-D10 was less variable than the ITS, making consistent and reliable phylogenetic reconstruction possible. CONCLUSIONS/SIGNIFICANCE: This study unveiled a surprisingly diverse and widespread distribution of Japanese Ostreopsis. Further study will be required to better understand the phylogeography of the genus. Our results highlight the urgent need for the development of early detection/warning systems for Ostreopsis, particularly for the widely distributed and strongly toxic Ostreopsis sp. 1. The D8-D10 marker will be suitable for these purposes
The biogeographic differentiation of algal microbiomes in the upper ocean from pole to pole
Eukaryotic phytoplankton are responsible for at least 20% of annual global carbon fixation. Their diversity and activity are shaped by interactions with prokaryotes as part of complex microbiomes. Although differences in their local species diversity have been estimated, we still have a limited understanding of environmental conditions responsible for compositional differences between local species communities on a large scale from pole to pole. Here, we show, based on pole-to-pole phytoplankton metatranscriptomes and microbial rDNA sequencing, that environmental differences between polar and non-polar upper oceans most strongly impact the large-scale spatial pattern of biodiversity and gene activity in algal microbiomes. The geographic differentiation of co-occurring microbes in algal microbiomes can be well explained by the latitudinal temperature gradient and associated break points in their beta diversity, with an average breakpoint at 14 °C ± 4.3, separating cold and warm upper oceans. As global warming impacts upper ocean temperatures, we project that break points of beta diversity move markedly pole-wards. Hence, abrupt regime shifts in algal microbiomes could be caused by anthropogenic climate change