14 research outputs found

    The RNA polymerase III-dependent family of genes in hemiascomycetes: comparative RNomics, decoding strategies, transcription and evolutionary implications

    We present the first comprehensive analysis of RNA polymerase III (Pol III) transcribed genes in ten yeast genomes. This set includes all tRNA genes (tDNA) and genes coding for SNR6 (U6), SNR52, SCR1 and RPR1 RNA in the nine hemiascomycetes Saccharomyces cerevisiae, Saccharomyces castellii, Candida glabrata, Kluyveromyces waltii, Kluyveromyces lactis, Eremothecium gossypii, Debaryomyces hansenii, Candida albicans, Yarrowia lipolytica and the archiascomycete Schizosaccharomyces pombe. We systematically analysed sequence specificities of tRNA genes, polymorphism, variability of introns, gene redundancy and gene clustering. Analysis of decoding strategies showed that yeasts close to S. cerevisiae use bacterial decoding rules to read the Leu CUN and Arg CGN codons, in contrast to all other known eukaryotes. In D. hansenii and C. albicans, we identified a novel tDNA-Leu (AAG), reading the Leu CUU/CUC/CUA codons with an unusual G at position 32. A systematic ‘p-distance tree’ using the 60 variable positions of the tRNA molecule revealed that most tDNAs cluster into amino acid-specific sub-trees, suggesting that, within hemiascomycetes, orthologous tDNAs are more closely related than paralogs. We finally determined the bipartite A- and B-box sequences recognized by TFIIIC. These minimal sequences are nearly conserved throughout hemiascomycetes and were satisfactorily retrieved at appropriate locations in other Pol III genes.

    Dicistronic tRNA-5S rRNA genes in Yarrowia lipolytica: an alternative TFIIIA-independent way for expression of 5S rRNA genes.

    In eukaryotes, genes transcribed by RNA polymerase III (Pol III) carry their own internal promoters and, as such, are transcribed as individual units. Indeed, very few cases of dicistronic Pol III genes are known so far. In contrast to other hemiascomycetes, 5S rRNA genes of Yarrowia lipolytica are not embedded into the tandemly repeated rDNA units, but appear scattered throughout the genome. We report here an unprecedented genomic organization: 48 of the 108 copies of the 5S rRNA genes are located 3' of tRNA genes. We show that these peculiar tRNA-5S rRNA dicistronic genes are expressed in vitro and in vivo as Pol III transcriptional fusions without the need for the 5S rRNA gene-specific factor TFIIIA, the deletion of which yields a viable phenotype. We also report the existence of a novel putative non-coding Pol III RNA of unknown function, about 70 nucleotides long (RUF70), the 13 genes of which are devoid of internal Pol III promoters and located 3' of the 13 copies of the tDNA-Trp (CCA). All genes embedded in the various dicistronic genes, fused 5S rRNA genes, RUF70 genes and their leader tRNA genes appear to be efficiently transcribed and their products correctly processed in vivo.

    The RNA structure alignment ontology

    Multiple sequence alignments are powerful tools for understanding the structures, functions, and evolutionary histories of linear biological macromolecules (DNA, RNA, and proteins), and for finding homologs in sequence databases. We address several ontological issues related to RNA sequence alignments that are informed by structure. Multiple sequence alignments are usually shown as two-dimensional (2D) matrices, with rows representing individual sequences, and columns identifying nucleotides from different sequences that correspond structurally, functionally, and/or evolutionarily. However, the requirement that sequences and structures correspond nucleotide-by-nucleotide is unrealistic and hinders representation of important biological relationships. High-throughput sequencing efforts are also rapidly making 2D alignments unmanageable because of vertical and horizontal expansion as more sequences are added. Solving the shortcomings of traditional RNA sequence alignments requires explicit annotation of the meaning of each relationship within the alignment. We introduce the notion of “correspondence,” which is an equivalence relation between RNA elements in sets of sequences, as the basis of an RNA alignment ontology. The purpose of this ontology is twofold: first, to enable the development of new representations of RNA data and of software tools that resolve the expansion problems with current RNA sequence alignments, and second, to facilitate the integration of sequence data with secondary and three-dimensional structural information, as well as other experimental information, to create simultaneously more accurate and more exploitable RNA alignments.

    Comparative genomics of protoploid Saccharomycetaceae

    Our knowledge of yeast genomes remains largely dominated by the extensive studies on Saccharomyces cerevisiae and the consequences of its ancestral duplication, leaving the evolution of the entire class of hemiascomycetes only partly explored. We concentrate here on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that we call “protoploid” because they diverged from the S. cerevisiae lineage prior to its genome duplication. We determined the complete genome sequences of three of these species: Kluyveromyces (Lachancea) thermotolerans and Saccharomyces (Lachancea) kluyveri (two members of the newly described Lachancea clade), and Zygosaccharomyces rouxii. We included in our comparisons the previously available sequences of Kluyveromyces lactis and Ashbya (Eremothecium) gossypii. Despite their broad evolutionary range and significant individual variations in each lineage, the five protoploid Saccharomycetaceae share a core repertoire of approximately 3300 protein families and a high degree of conserved synteny. Synteny blocks were used to define gene orthology and to infer ancestors. Far from representing minimal genomes without redundancy, the five protoploid yeasts contain numerous copies of paralogous genes, either dispersed or in tandem arrays, that, altogether, constitute a third of each genome. Ancient, conserved paralogs as well as novel, lineage-specific paralogs were identified.