121 research outputs found
Dark matter in archaeal genomes: a rich source of novel mobile elements, defense systems and secretory complexes
Microbial genomes encompass a sizable fraction of poorly characterized, narrowly spread fast-evolving genes. Using sensitive methods for sequence comparison and protein structure prediction, we performed a detailed comparative analysis of clusters of such genes, which we denote "dark matter islands", in archaeal genomes. The dark matter islands comprise up to 20% of archaeal genomes and show remarkable heterogeneity and diversity. Nevertheless, three classes of entities are common in these genomic loci: (a) integrated viral genomes and other mobile elements; (b) defense systems; and (c) secretory and other membrane-associated systems. The dark matter islands in the genomes of thermophiles and mesophiles show similar general trends of gene content, but thermophiles are substantially enriched in predicted membrane proteins whereas mesophiles have a greater proportion of recognizable mobile elements. Based on this analysis, we predict the existence of several novel groups of viruses and mobile elements, previously unnoticed variants of CRISPR-Cas immune systems, and new secretory systems that might be involved in stress response, intermicrobial conflicts and biogenesis of novel, uncharacterized membrane structures.
eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges
Orthologous relationships form the basis of most comparative genomic and metagenomic studies and are essential for proper phylogenetic and functional analyses. The third version of the eggNOG database (http://eggnog.embl.de) contains non-supervised orthologous groups constructed from 1133 organisms, doubling the number of genes with orthology assignment compared to eggNOG v2. The new release is the result of a number of improvements and expansions: (i) the underlying homology searches are now based on the SIMAP database; (ii) the orthologous groups have been extended to 41 levels of selected taxonomic ranges enabling much more fine-grained orthology assignments; and (iii) the newly designed web page is considerably faster with more functionality. In total, eggNOG v3 contains 721 801 orthologous groups, encompassing a total of 4 396 591 genes. Additionally, we updated 4873 and 4850 original COGs and KOGs, respectively, to include all 1133 organisms. At the universal level, covering all three domains of life, 101 208 orthologous groups are available, while the others are applicable at 40 more limited taxonomic ranges. Each group is amended by multiple sequence alignments and maximum-likelihood trees and broad functional descriptions are provided for 450 904 orthologous groups (62.5%).
eggNOG v4.0: nested orthology inference across 3686 organisms
With the increasing availability of various ‘omics data, high-quality orthology assignment is crucial for evolutionary and functional genomics studies. We here present the fourth version of the eggNOG database (available at http://eggnog.embl.de) that derives nonsupervised orthologous groups (NOGs) from complete genomes, and then applies a comprehensive characterization and analysis pipeline to the resulting gene families. Compared with the previous version, we have more than tripled the underlying species set to cover 3686 organisms, keeping track with genome project completions while prioritizing the inclusion of high-quality genomes to minimize error propagation from incomplete proteome sets. Major technological advances include (i) a robust and scalable procedure for the identification and inclusion of high-quality genomes, (ii) provision of orthologous groups for 107 different taxonomic levels compared with 41 in eggNOGv3, (iii) identification and annotation of particularly closely related orthologous groups, facilitating analysis of related gene families, (iv) improvements of the clustering and functional annotation approach, (v) adoption of a revised tree building procedure based on the multiple alignments generated during the process and (vi) implementation of quality control procedures throughout the entire pipeline. As in previous versions, eggNOGv4 provides multiple sequence alignments and maximum-likelihood trees, as well as broad functional annotation. Users can access the complete database of orthologous groups via a web interface, as well as through bulk download.
Functional reconstruction of a eukaryotic-like E1/E2/(RING) E3 ubiquitylation cascade from an uncultured archaeon.
The covalent modification of protein substrates by ubiquitin regulates a diverse range of critical biological functions. Although it has been established that ubiquitin-like modifiers evolved from prokaryotic sulphur transfer proteins, it is less clear how the complex eukaryotic ubiquitylation system arose and diversified from these prokaryotic antecedents. The discovery of ubiquitin, E1-like, E2-like and small-RING finger (srfp) protein components in the Aigarchaeota and the Asgard archaea superphyla has provided a substantive step toward addressing this evolutionary question. Encoded in operons, these components are likely representative of the progenitor apparatus that founded the modern eukaryotic ubiquitin modification systems. Here we report that these proteins from the archaeon Candidatus 'Caldiarchaeum subterraneum' operate together as a bona fide ubiquitin modification system, mediating a sequential ubiquitylation cascade reminiscent of the eukaryotic process. Our observations support the hypothesis that complex eukaryotic ubiquitylation signalling pathways have developed from compact systems originally inherited from an archaeal ancestor.
Novel insights into the Thaumarchaeota in the deepest oceans: their metabolism and potential adaptation mechanisms
Background: Marine Group I (MGI) Thaumarchaeota, which play key roles in the global biogeochemical cycling of nitrogen and carbon (ammonia oxidizers), thrive in the aphotic deep sea with massive populations. Recent studies have revealed that MGI Thaumarchaeota are present in the deepest part of the oceans - the hadal zone (depth > 6,000 m, consisting almost entirely of trenches), with the predominant phylotype being distinct from that in the “shallower” deep sea. However, little is known about the metabolism and distribution of these ammonia oxidizers in hadal water. Results: In this study, metagenomic data were obtained from 0-10,500 m deep seawater samples from the Mariana Trench. The distribution patterns of Thaumarchaeota derived from metagenomics and 16S rRNA gene sequencing were in line with those reported in previous studies: the abundance of Thaumarchaeota peaked in the bathypelagic zone (depth 1,000 – 4,000 m) and the predominant clade shifted in the hadal zone. Several metagenome-assembled thaumarchaeotal genomes were recovered, including a near-complete one representing the dominant hadal phylotype of MGI. Using comparative genomics, we predict that unexpected genes involved in bioenergetics, including two distinct ATP synthase genes (predicted to be coupled with H+ and Na+ respectively), and genes horizontally transferred from other extremophiles, such as those encoding putative di-myo-inositol-phosphate (DIP) synthases, might significantly contribute to the success of this hadal clade under extreme conditions. We also found that hadal MGI have the genetic potential to import a far wider range of organic compounds than their shallower-water counterparts. Despite this trait, hadal MGI ammonia oxidation and carbon fixation genes are highly transcribed, providing evidence that they are likely autotrophic, contributing to primary production in the aphotic deep sea.
Conclusions: Our study reveals potentially novel adaptation mechanisms of deep-sea thaumarchaeotal clades and suggests key functions of deep-sea Thaumarchaeota in carbon and nitrogen cycling.
Protein phosphorylation and its role in archaeal signal transduction
Reversible protein phosphorylation is the main mechanism of signal transduction that enables cells to rapidly respond to environmental changes by controlling the functional properties of proteins in response to external stimuli. However, whereas signal transduction is well studied in Eukaryotes and Bacteria, the knowledge in Archaea is still rather scarce. Archaea are special with regard to protein phosphorylation because the two best studied phyla, the Euryarchaeota and Crenarchaeota, seem to exhibit fundamental differences in regulatory systems. Euryarchaeota (e.g. halophiles, methanogens, thermophiles), like Bacteria and Eukaryotes, rely on bacterial-type two-component signal transduction systems (phosphorylation on His and Asp), as well as on protein phosphorylation on Ser, Thr and Tyr by Hanks-type protein kinases. In contrast, Crenarchaeota (e.g. acidophiles and (hyper)thermophiles) depend only on Hanks-type protein phosphorylation. In this review, the current knowledge of reversible protein phosphorylation in Archaea is presented. It combines results from identified phosphoproteins, biochemical characterization of protein kinases and protein phosphatases as well as target enzymes, and first insights into archaeal signal transduction by biochemical, genetic and polyomic studies.
Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea
Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems.
Conclusion: The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: ftp://ftp.ncbi.nih.gov/pub/koonin/arCOGs/. Reviewers: This article was reviewed by Peer Bork, Patrick Forterre, and Purificacion Lopez-Garcia.
Diversity and Potential Multifunctionality of Archaeal CetZ Tubulin-like Cytoskeletal Proteins.
Tubulin superfamily (TSF) proteins are widespread, and are known for their multifaceted roles as cytoskeletal proteins underpinning many basic cellular functions, including morphogenesis, division, and motility. In eukaryotes, tubulin assembles into microtubules, a major component of the dynamic cytoskeletal network of fibres, whereas the bacterial homolog FtsZ assembles the division ring at midcell. The functions of the lesser-known archaeal TSF proteins are beginning to be identified and show surprising diversity, including homologs of tubulin and FtsZ as well as a third archaea-specific family, CetZ, implicated in the regulation of cell shape and possibly other unknown functions. In this study, we define sequence and structural characteristics of the CetZ family and CetZ1 and CetZ2 subfamilies, identify CetZ groups and diversity amongst archaea, and identify potential functional relationships through analysis of the genomic neighbourhoods of cetZ genes. We identified at least three subfamilies of orthologous CetZ proteins in the archaeal class Halobacteria, including CetZ1 and CetZ2 as well as a novel uncharacterized subfamily. CetZ1 and CetZ2 were correlated to one another as well as to cell shape and motility phenotypes across diverse Halobacteria. Among other known CetZ clusters in orders Archaeoglobales, Methanomicrobiales, Methanosarcinales, and Thermococcales, an additional uncharacterized group from Archaeoglobales and Methanomicrobiales is affiliated strongly with Halobacteria CetZs, suggesting that they originated via horizontal transfer. Subgroups of Halobacteria CetZ2 and Thermococcales CetZ genes were found adjacent to different type IV pili regulons, suggesting potential utilization of CetZs by type IV systems. More broadly conserved cetZ gene neighbourhoods include nucleotide and cofactor biosynthesis (e.g., F420) and predicted cell surface sugar epimerase genes. 
These findings imply that CetZ subfamilies are involved in multiple functions linked to the cell surface, biosynthesis, and motility.
Discovery of extremely halophilic, methyl-reducing euryarchaea provides insights into the evolutionary origin of methanogenesis
Methanogenic archaea are major players in the global carbon cycle and in the biotechnology of anaerobic digestion. The phylum Euryarchaeota includes diverse groups of methanogens that are interspersed with non-methanogenic lineages. So far, methanogens inhabiting hypersaline environments have been identified only within the order Methanosarcinales. We report the discovery of a deep phylogenetic lineage of extremophilic methanogens in hypersaline lakes and present analysis of two nearly complete genomes from this group. Within the phylum Euryarchaeota, these isolates form a separate, class-level lineage 'Methanonatronarchaeia' that is most closely related to the class Halobacteria. Similar to the Halobacteria, 'Methanonatronarchaeia' are extremely halophilic and do not accumulate organic osmoprotectants. The high intracellular concentration of potassium implies that 'Methanonatronarchaeia' employ the 'salt-in' osmoprotection strategy. These methanogens are heterotrophic methyl-reducers that use C1-methylated compounds as electron acceptors and formate or hydrogen as electron donors. The genomes contain an incomplete and apparently inactivated set of genes encoding the upper branch of methyl group oxidation to CO2 as well as membrane-bound heterodisulfide reductase and cytochromes. These features differentiate 'Methanonatronarchaeia' from all known methyl-reducing methanogens. The discovery of extremely halophilic, methyl-reducing methanogens related to haloarchaea provides insights into the origin of methanogenesis and shows that the strategies employed by methanogens to thrive in salt-saturating conditions are not limited to the classical methylotrophic pathway.
