
    tRNA signatures reveal polyphyletic origins of streamlined SAR11 genomes among the alphaproteobacteria

    Phylogenomic analyses are subject to bias from compositional convergence and noise from horizontal gene transfer (HGT). Compositional convergence is a likely cause of controversy regarding the phylogeny of the SAR11 group of Alphaproteobacteria, which have extremely streamlined, A+T-biased genomes. While careful modeling can reduce artifacts caused by convergence, the most consistent and robust phylogenetic signal in genomes may lie distributed among encoded functional features that govern macromolecular interactions. Here we develop a novel phyloclassification method based on signatures derived from bioinformatically defined tRNA Class-Informative Features (CIFs). tRNA CIFs are enriched for features that underlie tRNA-protein interactions. Using a simple tRNA-CIF-based phyloclassifier, we obtained results consistent with those of bias-corrected whole proteome phylogenomic studies, rejecting monophyly of SAR11 and affiliating most strains with Rhizobiales with strong statistical support. Yet SAR11 and Rickettsiales tRNA genes share distinct patterns of A+T-richness, as expected from their elevated genomic A+T compositions. Using conventional supermatrix methods on total tRNA sequence data, we could recover the artifactual result of a monophyletic SAR11 grouping with Rickettsiales. Thus tRNA CIF-based phyloclassification is more robust to base content convergence than supermatrix phylogenomics on whole tRNA sequences. Also, given the notoriously promiscuous HGT of aminoacyl-tRNA synthetases, tRNA CIF-based phyloclassification may be relatively robust to HGT of network components. We describe how unique features of tRNA-protein interaction networks facilitate the mining of traits governing macromolecular interactions from genomic data, and discuss why interaction-governing traits may be especially useful to solve difficult problems in microbial classification and phylogeny.

    tRNA functional signatures classify plastids as late-branching cyanobacteria.

    Background: Eukaryotes acquired the trait of oxygenic photosynthesis through endosymbiosis of the cyanobacterial progenitor of plastid organelles. Despite recent advances in the phylogenomics of Cyanobacteria, the phylogenetic root of plastids remains controversial. Although a single origin of plastids by endosymbiosis is broadly supported, recent phylogenomic studies are contradictory on whether plastids branch early or late within Cyanobacteria. One underlying cause may be poor fit of evolutionary models to complex phylogenomic data. Results: Using Posterior Predictive Analysis, we show that recently applied evolutionary models poorly fit three phylogenomic datasets curated from cyanobacteria and plastid genomes because of heterogeneities in both substitution processes across sites and compositions across lineages. To circumvent these sources of bias, we developed CYANO-MLP, a machine learning algorithm that consistently and accurately phylogenetically classifies ("phyloclassifies") cyanobacterial genomes to their clade of origin based on bioinformatically predicted function-informative features in tRNA gene complements. Classification of cyanobacterial genomes with CYANO-MLP is accurate and robust to deletion of clades, unbalanced sampling, and compositional heterogeneity in input tRNA data. CYANO-MLP consistently classifies plastid genomes into a late-branching cyanobacterial sub-clade containing single-cell, starch-producing, nitrogen-fixing ecotypes, consistent with metabolic and gene transfer data. Conclusions: Phylogenomic data of cyanobacteria and plastids exhibit both site-process heterogeneities and compositional heterogeneities across lineages. These aspects of the data require careful modeling to avoid bias in phylogenomic estimation. Furthermore, we show that amino acid recoding strategies may be insufficient to mitigate bias from compositional heterogeneities. However, the combination of our novel tRNA-specific strategy with machine learning in CYANO-MLP appears robust to these sources of bias with high accuracy in phyloclassification of cyanobacterial genomes. CYANO-MLP consistently classifies plastids as late-branching Cyanobacteria, consistent with independent evidence from signature-based approaches and some previous phylogenetic studies.

    Initiator tRNA genes template the 3' CCA end at high frequencies in bacteria.

    Background: While the CCA sequence at the mature 3' end of tRNAs is conserved and critical for translational function, a genetic template for this sequence is not always contained in tRNA genes. In eukaryotes and Archaea, the CCA ends of tRNAs are synthesized post-transcriptionally by CCA-adding enzymes. In Bacteria, tRNA genes template CCA sporadically. Results: In order to understand the variation in how prokaryotic tRNA genes template CCA, we re-annotated tRNA genes in tRNAdb-CE database version 0.8. Among 132,129 prokaryotic tRNA genes, initiator tRNA genes template CCA at the highest average frequency (74.1%) over all functional classes except selenocysteine and pyrrolysine tRNA genes (88.1% and 100% respectively). Across bacterial phyla and a wide range of genome sizes, many lineages exist in which predominantly initiator tRNA genes template CCA. Convergent and parallel retention of CCA templating in initiator tRNA genes evolved in independent histories of reductive genome evolution in Bacteria. Also, in a majority of cyanobacterial and actinobacterial genera, predominantly initiator tRNA genes template CCA. We also found that a surprising fraction of archaeal tRNA genes template CCA. Conclusions: We suggest that cotranscriptional synthesis of initiator tRNA CCA 3' ends can complement inefficient processing of initiator tRNA precursors, "bootstrap" rapid initiation of protein synthesis from a non-growing state, or contribute to an increase in cellular growth rates by reducing overheads of mass and energy to maintain nonfunctional tRNA precursor pools. More generally, CCA templating in structurally non-conforming tRNA genes can afford cells robustness and greater plasticity to respond rapidly to environmental changes and stimuli.

    High-resolution metagenomic reconstruction of the freshwater spring bloom

    Background: The phytoplankton spring bloom in freshwater habitats is a complex, recurring, and dynamic ecological spectacle that unfolds at multiple biological scales. Although enormous taxonomic shifts in microbial assemblages during and after the bloom have been reported, genomic information on the microbial community of the spring bloom remains scarce. Results: We performed a high-resolution spatio-temporal sampling of the spring bloom in a freshwater reservoir and describe a multitude of previously unknown taxa using metagenome-assembled genomes of eukaryotes, prokaryotes, and viruses in combination with a broad array of methodologies. The recovered genomes reveal multiple distributional dynamics for several bacterial groups with progressively increasing stratification. Analyses of abundances of metagenome-assembled genomes in concert with CARD-FISH revealed remarkably similar in situ doubling time estimates for dominant genome-streamlined microbial lineages. Discordance between quantitations of cryptophytes arising from sequence data and microscopic identification suggested the presence of hidden, yet extremely abundant aplastidic cryptophytes that were confirmed by CARD-FISH analyses. Aplastidic cryptophytes are prevalent throughout the water column but have never been considered in prior models of plankton dynamics. We also recovered the first metagenome-assembled genomes of freshwater protists (a diatom and a haptophyte) along with thousands of giant viral genomic contigs, some of which appeared similar to viruses infecting haptophytes; owing to the lack of known representatives, however, most remained without any indication of their hosts. The contrast between the distribution of giant viruses, which are present in the entire water column, and that of parasitic perkinsids, which reside largely in deeper waters, allows us to propose giant viruses as the biological agents of top-down control and bloom collapse, likely in combination with bottom-up factors such as nutrient limitation. Conclusion: We reconstructed thousands of genomes of microbes and viruses from a freshwater spring bloom and show that such large-scale genome recovery allows tracking of planktonic succession in great detail. However, integration of metagenomic information with other methodologies (e.g., microscopy, CARD-FISH) remains critical to reveal diverse phenomena (e.g., distributional patterns, in situ doubling times) and novel participants (e.g., aplastidic cryptophytes) and to further refine existing ecological models (e.g., factors affecting bloom collapse). This work provides a genomic foundation for future approaches towards a fine-scale characterization of the organisms in relation to the rapidly changing environment during the course of the freshwater spring bloom.

    Tracing Evolution of Gene Transfer Agents Using Comparative Genomics

    Accumulating evidence suggests that viruses and their components can be domesticated by their hosts, equipping them with convenient molecular toolkits for various functions. One such domesticated system is the Gene Transfer Agents (GTAs), which are produced by some bacteria and archaea. GTAs morphologically resemble small phage-like particles and contain random fragments of their host genome. They are produced only by a small fraction of the microbial population and are released through lysis of the host cell. Bioinformatic analyses suggest that GTAs are especially abundant in the taxonomic class of Alphaproteobacteria, where they are vertically inherited and evolve as a part of their host genomes. In this work, we extensively analyze evolutionary patterns of alphaproteobacterial GTAs using comparative genomics, phylogenomics and machine learning methods. We initially develop an algorithm that validates the wide presence of GTA elements in alphaproteobacterial genomes, where they are generally mistaken for prophages due to their homology. Furthermore, we demonstrate that GTAs evolve under selection that reduces the energetic cost of their production, indicating their importance under conditions of nutrient depletion. Genome-wide screens for translational selection and coevolution signatures highlight the significance of GTAs as a stress-response adaptation for horizontal gene transfer, revealing a set of previously unknown genes that could play a role in the GTA cycle. As production of GTAs leads to host death, their maintenance is likely to be under kin- or group-level selection. By combining our findings with the accumulated body of knowledge, this work proposes a conceptual model illustrating the role of GTAs in bacterial populations and their persistence for hundreds of millions of years of evolution.

    Microbial production and consumption of marine dissolved organic matter

    Thesis (Ph. D.)--Joint Program in Oceanography/Applied Ocean Science and Engineering (Massachusetts Institute of Technology, Department of Biology; and the Woods Hole Oceanographic Institution), 2013. Cataloged from PDF version of thesis. Includes bibliographical references. Marine phytoplankton are the principal producers of oceanic dissolved organic matter (DOM), the organic substrate responsible for secondary production by heterotrophic microbes in the sea. Despite the importance of DOM in marine food webs, details regarding how marine microbes cycle DOM are limited, and few definitive connections have been made between specific producers and consumers. Consumption is thought to depend on the source of the DOM as well as the identity of the consumer; however, it remains unclear how phytoplankton diversity and DOM composition are related, and the metabolic pathways involved in the turnover of DOM by different microbial taxa are largely unknown. The motivation for this thesis is to examine the role of microbial diversity in determining the composition, lability, and physiological consumption of marine DOM. The chemical composition of DOM produced by marine phytoplankton was investigated at the molecular level using mass spectrometry. Results demonstrate that individual phytoplankton strains release a unique suite of organic compounds. Connections between DOM composition and the phylogenetic identity of the producing organism were identified on multiple levels, revealing a direct relationship between phytoplankton diversity and DOM composition. Phytoplankton-derived DOM was also employed in growth assays with oligotrophic bacterioplankton strains to examine effects on heterotrophic growth dynamics. Reproducible responses ranged from suppressed to enhanced growth rates and cell yields, and depended both on the identity of the heterotroph and the source of the DOM. Novel relationships between specific bacterioplankton types and DOM from known biological sources were found, and targets for additional studies on reactive DOM components were identified. The physiology of DOM consumption by a marine Oceanospirillales strain was studied using a combined transcriptomic and untargeted metabolomic approach. The transcriptional response of this bacterium to Prochlorococcus-derived DOM revealed an increase in anabolic processes related to metabolism of carboxylic acids and glucosides, increased gene expression related to proteorhodopsin-based phototrophy, and decreased gene expression related to motility. Putative identification of compounds present in Prochlorococcus-derived DOM supported these responses. Collectively, these findings highlight the potential for linking detailed chemical analyses of labile DOM from a known biological source with bacterioplankton diversity and physiology. by Jamie William Becker. Ph.D.

    Polysaccharide utilization loci and associated genes in marine Bacteroidetes - compositional diversity and ecological relevance

    The synthesis of marine organic carbon compounds by photosynthetic macroalgae, microalgae (phytoplankton) and bacteria provides a basis for life in the ocean. In marine surface waters this primary production is largely dominated by microalgae and is especially pronounced during spring phytoplankton blooms. During and after these often diatom-dominated blooms, increased amounts of organic matter are released into the surrounding waters. Here, the organic matter, rich in polysaccharides, can trigger blooms of heterotrophic bacteria. Marine members of the Bacteroidetes are consistently found related to such bloom events. These bacteria are regularly detected as the first responders to thrive after phytoplankton spring blooms in temperate coastal regions and are often equipped with a variety of polysaccharide utilization gene clusters. These gene clusters, termed polysaccharide utilization loci (PULs), encode enzymes for the extracellular hydrolysis of polysaccharides and the subsequent uptake of oligosaccharides into the periplasm, where they are shielded from competing bacteria. This mechanism allows for rapid uptake and substrate hoarding, and thus could be one reason why Bacteroidetes are often seen as the first responders of the bacterioplankton community. The investigation of the so far largely unknown diversity and the ecological relevance of PULs in marine Bacteroidetes was the major goal of the work presented here. We could show that genomes of Bacteroidetes isolates from the North Sea, with free-living to micro- and macro-algae associated lifestyles, harboured a variety of these loci predicted to target in total 18 different substrate classes. Overall PUL repertoires of these isolates showed considerable intra-genus and inter-genus variations, suggesting that Bacteroidetes species harbour distinct glycan niches, independent of their phylogenetic relationships. By investigating the PUL repertoires of uncultured free-living Bacteroidetes during three consecutive years of spring phytoplankton blooms at the North Sea island of Helgoland, I could further reveal that the set of targeted substrates during these bloom events was dominated by only five of the substrate classes targeted by the isolates. These were the diatom storage polysaccharide laminarin, alpha-glucans, alginates, as well as substrates rich in alpha-mannans and sulfated xylans. In addition to this constrained set of substrate classes targeted by the free-living Bacteroidetes community, I could show that the species diversity during these blooms was limited and dominated by only 27 abundant and recurrent species that carried a limited number of abundant PULs. The majority of these PULs were targeting laminarin and alpha-glucan substrates, which were likely targeted during the entire time of the blooms. The less frequent PULs, targeting alpha-mannans and sulfated xylans, were predominantly detected during mid- and late-bloom phases, suggesting a relevance of these two substrate classes in the later phases of phytoplankton blooms. Overall these findings highlight the recurrence of a few specialized Bacteroidetes species and the environmental relevance of specific polysaccharide substrate classes during spring phytoplankton blooms. However, for some of these substrate classes the origin, structural details and their abundance during blooms are as yet largely unknown. To further shed light on the polysaccharide niches of abundant key-players, these findings can serve as a guide for future laboratory studies.

    Free-living Diazotrophs and the Nitrogen Cycle in Natural Grassland Revealed by Culture Dependent and Independent Approaches

    Biological nitrogen fixation contributes to half of the global supply of nitrogen to the biosphere. It is carried out by a diverse group of prokaryotes called diazotrophs via the nitrogenase enzyme. Nitrogen fixation research is focused on the narrow group of symbiotic diazotrophs, and the vast majority of free-living diazotrophs, which contribute significantly to fixed nitrogen, are yet to be explored. The goal of this research was to assess the phylogeny of diazotrophs considering the most up-to-date genomic information and apply that knowledge to understand the diversity of free-living diazotrophs in a natural grassland ecosystem, by both culture-dependent and culture-independent methods. Phylogeny was reconstructed using the concatenated sequences of six core proteins of nitrogenase (NifHDKENB) from 963 prokaryotic genomes. The diversity of free-living diazotrophs in grassland was explored by isolation of putative diazotrophs on a solid nitrogen-free medium (NFM), with diazotrophy confirmed by nifH PCR, acetylene reduction assay and 15N2 assimilation assay. Streptomyces, the most abundant bacterial genus, was further characterized by sequencing the genome of one prominent strain and by differential gene expression in nitrogen-rich vs. nitrogen-deficient medium. For the culture-independent study of nitrogen cycle activity, meta-transcriptomic sequencing of complete mRNA from a grassland soil sample was performed. Phylogeny of nif genes from the complete genomes of cultured isolates revealed that diazotrophs are distributed across Actinobacteria, Aquificae, Bacteroidetes, Chlorobi, Chloroflexi, Cyanobacteria, Deferribacteres, Firmicutes, Fusobacteria, Nitrospira, Proteobacteria, PVC group, and Spirochaetes, as well as the Euryarchaeota, providing a curated database of nif genes. Culturing yielded 474 bacterial isolates which belonged to the phyla Actinobacteria, Proteobacteria, Firmicutes, and Bacteroidetes. However, only 81 (17%) of isolates yielded nifH, and the most dominant genus isolated on NFM, Streptomyces, did not provide biochemical and genomic evidence of diazotrophy. The meta-transcriptomic study revealed that nitrogen fixation and nitrification are the least expressed and nitrate reduction is the most expressed pathway among various nitrogen cycling pathways. In conclusion, although the culture-based approach showed diverse free-living nitrogen-fixing bacteria, diazotrophy should always be confirmed by biochemical and genetic evidence, and limitations to culture-independent study due to primer bias in nifH PCR can be overcome by meta-transcriptomic study.