3 research outputs found

    Identification and comparative analysis of telomerase RNAs from Candida species reveal conservation of functional elements

    The RNA component of telomerase (telomerase RNA; TER) varies substantially both in sequence composition and size (from ∼150 nucleotides [nt] to >1500 nt) across species. This dramatic divergence has hampered the identification of TER genes and a large-scale comparative analysis of TER sequences and structures among distantly related species. To identify by phylogenetic analysis conserved sequences and structural features of TER that are of general importance, it is essential to obtain TER sequences from evolutionarily distant groups of species, providing enough conservation within each group and enough variation among the groups. To this end, we identified TER genes in several yeast species with relatively large (>20 base pairs) and nonvariant telomeric repeats, mostly from the genus Candida. Interestingly, several of the TERs reported here are longer than all other yeast TERs known to date. Within these TERs, we predicted a pseudoknot containing U-A·U base triples (conserved in vertebrates, budding yeasts, and ciliates) and a three-way junction element (conserved in vertebrates and budding yeasts). In addition, we identified a novel conserved sequence (CS2a) predicted to reside within an internal-loop structure, in all the budding yeast TERs examined. CS2a is located near the Est1p-binding bulge-stem previously identified in Saccharomyces cerevisiae. Mutational analyses in both budding yeasts S. cerevisiae and Kluyveromyces lactis demonstrate that CS2a is essential for in vivo telomerase function. The comparative and mutational analyses of conserved TER elements reported here provide novel insights into the structure and function of the telomerase ribonucleoprotein complex.

    OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes

    Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arbitrary criterion of one coding sequence (CDS) per transcript, leading to a substantial underestimation of the coding potential of eukaryotes. Here, we present OpenProt, the first database fully endorsing a polycistronic model of eukaryotic genomes to date. OpenProt contains all possible ORFs longer than 30 codons across 10 species, and cumulates supporting evidence such as protein conservation, translation and expression. OpenProt annotates all known proteins (RefProts), novel predicted isoforms (Isoforms) and novel predicted proteins from alternative ORFs (AltProts). It incorporates cutting-edge algorithms to evaluate protein orthology and re-interrogate publicly available ribosome profiling and mass spectrometry datasets, supporting the annotation of thousands of predicted ORFs. The constantly growing database currently cumulates evidence from 87 ribosome profiling and 114 mass spectrometry studies from several species, tissues and cell lines. All data is freely available and downloadable from a web platform (www.openprot.org) supporting a genome browser and advanced queries for each species. Thus, OpenProt enables a more comprehensive landscape of eukaryotic genomes' coding potential.