53 research outputs found

    Phylogenomic analysis of the GIY-YIG nuclease superfamily

    BACKGROUND: The GIY-YIG domain was initially identified in homing endonucleases and later in other selfish mobile genetic elements (including restriction enzymes and non-LTR retrotransposons) and in enzymes involved in DNA repair and recombination. However, to date no systematic search for novel members of the GIY-YIG superfamily or comparative analysis of these enzymes has been reported. RESULTS: We carried out database searches to identify all members of known GIY-YIG nuclease families. Multiple sequence alignments, together with predicted secondary structures of the identified families, were represented as Hidden Markov Models (HMMs) and compared by the HHsearch method to the uncharacterized protein families gathered in the COG, KOG, and PFAM databases. This analysis allowed us to extend the GIY-YIG superfamily to include members of COG3680 and a number of proteins not classified in COGs, and to predict that these proteins may function as nucleases, potentially involved in DNA recombination and/or repair. Finally, all old and new members of the GIY-YIG superfamily were compared and analyzed to infer the phylogenetic tree. CONCLUSION: An evolutionary classification of the GIY-YIG superfamily is presented for the first time, along with the structural annotation of all (sub)families. It provides a comprehensive picture of sequence-structure-function relationships in this superfamily of nucleases, which will help to design experiments to study the mechanism of action of known members (especially the uncharacterized ones) and will facilitate the prediction of function for the newly discovered ones.

    Structural and evolutionary bioinformatics of the SPOUT superfamily of methyltransferases

    BACKGROUND: SPOUT methyltransferases (MTases) are a large class of S-adenosyl-L-methionine-dependent enzymes that exhibit an unusual alpha/beta fold with a very deep topological knot. In 2001, when no crystal structures were available for any of these proteins, Anantharaman, Koonin, and Aravind identified homology between SpoU and TrmD MTases and defined the SPOUT superfamily. Since then, multiple crystal structures of knotted MTases have been solved and numerous new homologous sequences appeared in the databases. However, no comprehensive comparative analysis of these proteins has been carried out to classify them based on structural and evolutionary criteria and to guide functional predictions. RESULTS: We carried out extensive searches of databases of protein structures and sequences to collect all members of previously identified SPOUT MTases, and to identify previously unknown homologs. Based on sequence clustering, characterization of domain architecture, structure predictions and sequence/structure comparisons, we re-defined families within the SPOUT superfamily and predicted putative active sites and biochemical functions for the so far uncharacterized members. We have also delineated the common core of SPOUT MTases and inferred a multiple sequence alignment for the conserved knot region, from which we calculated the phylogenetic tree of the superfamily. We have also studied phylogenetic distribution of different families, and used this information to infer the evolutionary history of the SPOUT superfamily. CONCLUSION: We present the first phylogenetic tree of the SPOUT superfamily since it was defined, together with a new scheme for its classification, and discussion about conservation of sequence and structure in different families, and their functional implications. We identified four protein families as new members of the SPOUT superfamily. Three of these families are functionally uncharacterized (COG1772, COG1901, and COG4080), and one (COG1756 represented by Nep1p) has been already implicated in RNA metabolism, but its biochemical function has been unknown. Based on the inference of orthologous and paralogous relationships between all SPOUT families we propose that the Last Universal Common Ancestor (LUCA) of all extant organisms contained at least three SPOUT members, ancestors of contemporary RNA MTases that carry out m(1)G, m(3)U, and 2'-O-ribose methylation, respectively. In this work we also speculate on the origin of the knot and propose possible 'unknotted' ancestors. The results of our analysis provide a comprehensive 'roadmap' for experimental characterization of SPOUT MTases and interpretation of functional studies in the light of sequence-structure relationships.

    MODOMICS: a database of RNA modification pathways

    MODOMICS is the first comprehensive database resource for systems biology of RNA modification. It integrates information about the chemical structure of modified nucleosides, their localization in RNA sequences, pathways of their biosynthesis and enzymes that carry out the respective reactions. MODOMICS also provides literature information, and links to other databases, including the available protein sequence and structure data. The current list of modifications and pathways is comprehensive, while the dataset of enzymes is limited to Escherichia coli and Saccharomyces cerevisiae and sequence alignments are presented only for tRNAs from these organisms. RNAs and enzymes from other organisms will be included in the near future. MODOMICS can be queried by the type of nucleoside (e.g. A, G, C, U, I, m(1)A, nm(5)s(2)U, etc.), type of RNA, position of a particular nucleoside, type of reaction (e.g. methylation, thiolation, deamination, etc.) and name or sequence of an enzyme of interest. Options for data presentation include graphs of pathways involving the query nucleoside, multiple sequence alignments of RNA sequences and tabular forms with enzyme and literature data. The contents of MODOMICS can be accessed through the World Wide Web at

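    Muuseumiharidusest Eesti Rahva Muuseumi rahvakultuuri koolitus- ja teabekeskuse näitel [On museum education: the example of the folk culture training and information centre of the Estonian National Museum]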

    The alignment used for the calculations of the double-domain sequences tree. (FASTA 53 kb)

    The yfhQ gene of Escherichia coli encodes a tRNA:Cm32/Um32 methyltransferase

    BACKGROUND: Naturally occurring tRNAs contain numerous modified nucleosides. They are formed by enzymatic modification of the primary transcripts during the complex RNA maturation process. In model organisms Escherichia coli and Saccharomyces cerevisiae most enzymes involved in this process have been identified. Interestingly, it was found that tRNA methylation, one of the most common modifications, can be introduced by S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases (MTases) that belong to two structurally and phylogenetically unrelated protein superfamilies: RFM and SPOUT. RESULTS: As a part of a large-scale project aiming at characterization of a complete set of RNA modification enzymes of model organisms, we have studied the Escherichia coli proteins YibK, LasT, YfhQ, and YbeA for their ability to introduce the last unassigned methylations of ribose at positions 32 and 34 of the tRNA anticodon loop. We found that YfhQ catalyzes the AdoMet-dependent formation of Cm32 or Um32 in tRNA(Ser1) and tRNA(Gln2) and that an E. coli strain with a disrupted yfhQ gene lacks the tRNA:Cm32/Um32 methyltransferase activity. Thus, we propose to rename YfhQ as TrMet(Xm32) according to the recently proposed, uniform nomenclature for all RNA modification enzymes, or TrmJ, according to the traditional nomenclature for bacterial tRNA MTases. CONCLUSION: Our results reveal that methylation at position 32 is carried out by completely unrelated TrMet(Xm32) enzymes in eukaryota and prokaryota (RFM superfamily member Trm7 and SPOUT superfamily member TrmJ, respectively), mirroring the scenario observed in the case of the m(1)G37 modification (introduced by the RFM member Trm5 in eukaryota and archaea, and by the SPOUT member TrmD in bacteria).

    A composite double-/single-stranded RNA-binding region in protein Prp3 supports tri-snRNP stability and splicing

    Prp3 is an essential U4/U6 di-snRNP-associated protein whose functions and molecular mechanisms in pre-mRNA splicing are presently poorly understood. We show by structural and biochemical analyses that Prp3 contains a bipartite U4/U6 di-snRNA-binding region comprising an expanded ferredoxin-like fold, which recognizes a 3′-overhang of U6 snRNA, and a preceding peptide, which binds U4/U6 stem II. Phylogenetic analyses revealed that the single-stranded RNA-binding domain is exclusively found in Prp3 orthologs, thus qualifying as a spliceosome-specific RNA interaction module. The composite double-stranded/single-stranded RNA-binding region assembles cooperatively with Snu13 and Prp31 on U4/U6 di-snRNAs and inhibits Brr2-mediated U4/U6 di-snRNA unwinding in vitro. RNP-disrupting mutations in Prp3 lead to U4/U6•U5 tri-snRNP assembly and splicing defects in vivo. Our results reveal how Prp3 acts as an important bridge between U4/U6 and U5 in the tri-snRNP, and comparison with a Prp24-U6 snRNA recycling complex suggests how Prp3 may be involved in U4/U6 reassembly after splicing.

    Functional and bioinformatics analysis of two Campylobacter jejuni homologs of the thiol-disulfide oxidoreductase, DsbA.

    BACKGROUND: Bacterial Dsb enzymes are involved in the oxidative folding of many proteins, through the formation of disulfide bonds between their cysteine residues. The Dsb protein network has been well characterized in cells of the model microorganism Escherichia coli. To gain insight into the functioning of the Dsb system in epsilon-Proteobacteria, where it plays an important role in the colonization process, we studied two homologs of the main Escherichia coli Dsb oxidase (EcDsbA) that are present in the cells of the enteric pathogen Campylobacter jejuni, the most frequently reported bacterial cause of human enteritis in the world. METHODS AND RESULTS: Phylogenetic analysis suggests the horizontal transfer of the epsilon-Proteobacterial DsbAs from a common ancestor to gamma-Proteobacteria, which then gave rise to the DsbL lineage. Phenotype and enzymatic assays suggest that the two C. jejuni DsbAs play different roles in bacterial cells and have divergent substrate spectra. CjDsbA1 is essential for the motility and autoagglutination phenotypes, while CjDsbA2 has no impact on those processes. CjDsbA1 plays a critical role in the oxidative folding that ensures the activity of alkaline phosphatase CjPhoX, whereas CjDsbA2 is crucial for the activity of arylsulfotransferase CjAstA, encoded within the dsbA2-dsbB-astA operon. CONCLUSIONS: Our results show that CjDsbA1 is the primary thiol-oxidoreductase affecting life processes associated with bacterial spread and host colonization, as well as ensuring the oxidative folding of particular protein substrates. In contrast, CjDsbA2 activity does not affect the same processes and so far its oxidative folding activity has been demonstrated for one substrate, arylsulfotransferase CjAstA. The results suggest cooperation between CjDsbA2 and CjDsbB. In the case of CjDsbA1, this cooperation is not exclusive and there is probably another protein, yet to be identified in C. jejuni cells, that acts to re-oxidize CjDsbA1. Altogether, the data presented here provide considerable insight into the epsilon-Proteobacterial Dsb systems, which have been poorly understood so far.

    The RNase H-like superfamily: new members, comparative structural analysis and evolutionary classification.

    The ribonuclease H-like (RNHL) superfamily, also called the retroviral integrase superfamily, groups together numerous enzymes involved in nucleic acid metabolism and implicated in many biological processes, including replication, homologous recombination, DNA repair, transposition and RNA interference. The RNHL superfamily proteins show extensive divergence of sequences and structures. We conducted database searches to identify members of the RNHL superfamily (including those previously unknown), yielding >60 000 unique domain sequences. Our analysis led to the identification of new RNHL superfamily members, such as RRXRR (PF14239), DUF460 (PF04312, COG2433), DUF3010 (PF11215), DUF429 (PF04250 and COG2410, COG4328, COG4923), DUF1092 (PF06485), COG5558, OrfB_IS605 (PF01385, COG0675) and Peptidase_A17 (PF05380). Based on the clustering analysis we grouped all identified RNHL domain sequences into 152 families. Phylogenetic studies revealed relationships between these families, and suggested a possible history of the evolution of the RNHL fold and its active site. Our results revealed a clear division of the RNHL superfamily into exonucleases and endonucleases. Structural analyses of features characteristic of particular groups revealed a correlation between the orientation of the C-terminal helix and both the exonuclease/endonuclease function and the architecture of the active site. Our analysis provides a comprehensive picture of sequence-structure-function relationships in the RNHL superfamily that may guide functional studies of the previously uncharacterized protein families.