10 research outputs found

    Genome-wide computational prediction of tandem gene arrays: application in yeasts

    Background: This paper describes an efficient in silico method for detecting tandem gene arrays (TGAs) in fully sequenced and compact genomes such as those of prokaryotes or unicellular eukaryotes. The originality of this method lies in the search of protein sequence similarities in the vicinity of each coding sequence, which allows the prediction of tandem duplicated gene copies independently of their functionality. Results: Applied to nine hemiascomycete yeast genomes, this method predicts that 2% of the genes are involved in TGAs and gene relics are present in 11% of TGAs. The frequency of TGAs with degenerated gene copies means that a significant fraction of tandem duplicated genes follows the birth-and-death model of evolution. A comparison of sequence identity distributions between sets of homologous gene pairs shows that the different copies of tandem arrayed paralogs are less divergent than copies of dispersed paralogs in yeast genomes. It suggests that paralogs included in tandem structures are more recent or more subject to the gene conversion mechanism than other paralogs. Conclusion: The method reported here is a useful computational tool to provide a database of TGAs composed of functional or nonfunctional gene copies. Such a database has obvious applications in the fields of structural and comparative genomics. Notably, a detailed study of the TGA catalog will make it possible to tackle the fundamental questions of the origin and evolution of tandem gene clusters.

    Differential evolution of the Saccharomyces cerevisiae DUP240 paralogs and implication of recombination in phylogeny

    Multigene families are observed in all genomes sequenced so far and are the reflection of key evolutionary mechanisms. The DUP240 family, identified in Saccharomyces cerevisiae strain S288C, is composed of 10 paralogs: seven are organized as two tandem repeats and three are solo ORFs. To investigate the evolution of the three solo paralogs, YAR023c, YCR007c and YHL044w, we performed a comparative analysis between 15 S. cerevisiae strains. These three ORFs are present in all strains and the conservation of synteny indicates that they are not frequently involved in chromosomal reshaping, in contrast to the DUP240 ORFs organized in tandem repeats. Our analysis of nucleotide and amino acid variations indicates that YAR023c and YHL044w fix mutations more easily than YCR007c, although they all belong to the same multigene family. This comparative analysis was also conducted with five arbitrarily chosen Ascomycetes-specific genes and five arbitrarily chosen common genes (genes that have a homolog in at least one non-Ascomycetes organism). Ascomycetes-specific genes appear to be diverging faster than common genes in the S. cerevisiae species, a situation that was previously described between different yeast species. Our results point to the strong contribution, during DNA sequence evolution, of allelic recombination besides nucleotide substitution.

    Paleogenomics or the search for remnant duplicated copies of the yeast DUP240 gene family in intergenic areas.

    Duplication, resulting in gene redundancy, is well known to be a driving force of evolutionary change. Gene families are therefore useful targets for approaching genome evolution. To address the gene death process, we examined the fate of the 10-member-large S288C DUP240 family in 15 Saccharomyces cerevisiae strains. Using an original three-step method of analysis reported here, both slightly and highly degenerate DUP240 copies, called pseudo-open reading frames (ORFs) and relics, respectively, were detected in strain S288C. It was concluded that two previously annotated ORFs correspond, in fact, to pseudo-ORFs and three additional relics were identified in intergenic areas. Comparative intraspecies analysis of these degenerate DUP240 loci revealed that the two pseudo-ORFs are present in a nondegenerate state in some other strains. This suggests that within a given gene family different loci are the target of the gene erasure process, which is therefore strain dependent. Besides, the variable positions observed indicate that the relic sequence may diverge faster than the flanking regions. All in all, this study shows that short conserved protein motifs provide a useful tool for detecting and accurately mapping degenerate gene remnants. The present results also highlight the strong contribution of comparative genomics for gene relic detection because the possibility of finding short conserved protein motifs in intergenic regions (IRs) largely depends on the choice of the most closely related paralog or ortholog. By mapping new genetic components in previously annotated IRs, our study constitutes a further refinement step in the crucial stage of genome annotation and provides a strategy for retracing ancient chromosomal reshaping events and, hence, for deciphering genome history. (Historical article; journal article; research support, non-U.S. gov't; 2005 Sep; 2005 05 25)

    An evolutionary scenario for one of the largest yeast gene families.

    The DUP gene family of Saccharomyces cerevisiae comprises 23 members that can be divided into two subfamilies: DUP240 and DUP380. The location of the DUP loci suggests that at least three mechanisms were responsible for their genomic dispersion: nonreciprocal translocation at chromosomal ends, tandem duplication and Ty-associated duplication. The data we present here suggest that these nonessential genes encode proteins that facilitate membrane trafficking processes. Dup240 proteins have three conserved domains (C1, C2 and C3) and two predicted transmembrane segments (H1 and H2). A direct repetition of the C1-H1-H2-C2 module is observed in Dup380p sequences. In this article, we propose an evolutionary model to account for the emergence of the two gene subfamilies. (Journal article; research support, non-U.S. gov't; 2006 Jan; 2005 11 02)

    Detection and characterization of megasatellites in orthologous and nonorthologous genes of 21 fungal genomes.

    Megasatellites are large DNA tandem repeats, originally described in Candida glabrata, in protein-coding genes. Most of the genes in which megasatellites are found are of unknown function. In this work, we extended the search for megasatellites to 20 additional completely sequenced fungal genomes and extracted 216 megasatellites in 203 out of 142,121 genes, corresponding to the most exhaustive description of such genetic elements available today. We show that half of the megasatellites detected encode threonine-rich peptides predicted to be intrinsically disordered, suggesting that they may interact with several partners or serve as flexible linkers. Megasatellite motifs were clustered into several families. Their distribution in fungal genes shows that different motifs are found in orthologous genes and similar motifs are found in unrelated genes, suggesting that megasatellite formation or spreading does not necessarily track the evolution of their host genes. Altogether, these results suggest that megasatellites are created and lost during evolution of fungal genomes, probably sharing similar functions, although their primary sequences are not necessarily conserved.

    The complete genome of Blastobotrys (Arxula) adeninivorans LS3 - a yeast of biotechnological interest.

    Background: The industrially important yeast Blastobotrys (Arxula) adeninivorans is an asexual hemiascomycete phylogenetically very distant from Saccharomyces cerevisiae. Its unusual metabolic flexibility allows it to use a wide range of carbon and nitrogen sources, while being thermotolerant, xerotolerant and osmotolerant. Results: The sequencing of strain LS3 revealed that the nuclear genome of A. adeninivorans is 11.8 Mb long and consists of four chromosomes with regional centromeres. Its closest sequenced relative is Yarrowia lipolytica, although mean conservation of orthologs is low. With 914 introns within 6116 genes, A. adeninivorans is one of the most intron-rich hemiascomycetes sequenced to date. Several large species-specific families appear to result from multiple rounds of segmental duplications of tandem gene arrays, a novel mechanism not yet described in yeasts. An analysis of the genome and its transcriptome revealed enzymes with biotechnological potential, such as two extracellular tannases (Atan1p and Atan2p) of the tannic-acid catabolic route, and a new pathway for the assimilation of n-butanol via butyric aldehyde and butyric acid. Conclusions: The high-quality genome of this species that diverged early in Saccharomycotina will allow further fundamental studies on comparative genomics, evolution and phylogenetics. Protein components of different pathways for carbon and nitrogen source utilization were identified, which so far has remained unexplored in yeast, offering clues for further biotechnological developments. In the course of identifying alternative microorganisms for biotechnological interest, A. adeninivorans has already proved its strengthened competitiveness as a promising cell factory for many more applications. This work was supported in part by funding from the Consortium National de Recherche en Génomique (CNRG) to Génoscope, from CNRS (GDR 2354, Génolevures), ANR (ANR-05-BLAN-0331, GENARISE). 
The computing framework was supported by the funding of the University of Bordeaux 1, the Aquitaine Région in the program “Génotypage et Génomique Comparée”, the ACI IMPBIO “Génolevures En Ligne” and INRIA. We thank the System and Network Administration team in LaBRI for excellent help and advice. J.A.C. is supported by the PhD Program in Computational Biology of the Instituto Gulbenkian de Ciência, Portugal (sponsored by Fundação Calouste Gulbenkian, Siemens SA, and Fundação para a Ciência e Tecnologia; SFRH/BD/33528/2008). M.C. research was supported by a grant of the Deutscher Akademischer Austauschdienst (DAAD). T.G. research was partly supported by a grant from the Spanish Ministry of Economy and Competitiveness (BIO2012-37161). B.D. is a member of Institut Universitaire de Franc

    Evolutionary Role of Interspecies Hybridization and Genetic Exchanges in Yeasts
