157 research outputs found

    An Ancient Evolutionary Origin of Genes Associated with Human Genetic Diseases

    Several thousand genes in the human genome have been linked to a heritable genetic disease. The majority of these appear to be nonessential genes (i.e., their inactivation is not embryonically lethal), and one could therefore speculate that they are late additions in the evolutionary lineage toward humans. Contrary to this expectation, we find that they are in fact significantly overrepresented among the genes that emerged during the early evolution of the metazoa. Using a phylostratigraphic approach, we have studied the evolutionary emergence of such genes at 19 phylogenetic levels. The majority of disease genes were already present in the eukaryotic ancestor, and the second largest number arose around the time of the evolution of multicellularity. Conversely, genes specific to the mammalian lineage are highly underrepresented. Hence, genes involved in genetic diseases are not simply a random subset of all genes in the genome but are biased toward ancient genes.

    Evolution of orphan genes in Drosophila

    Orphan genes are protein-coding regions that have no recognizable homologue in distantly related species. A substantial fraction of coding regions in any genome sequenced so far consists of such orphan genes, but their evolutionary and functional significance is not understood. A re-analysis of the Drosophila melanogaster proteome is presented which shows that 26–29% of all proteins still have no significant match with non-insect sequences. Therefore, neither the growth of the database nor the re-annotations have significantly changed the proportion of orphans in the Drosophila genome over time. In addition, it was shown that these orphans are significantly underrepresented in the current genetic analysis. To analyse the evolutionary characteristics of orphan genes in Drosophila directly, 774 sequences were compared between cDNAs retrieved from two D. yakuba libraries (embryo and adult) and their corresponding D. melanogaster orthologues. Analysis of substitution rates shows that the recovered orphans evolve on average more than three times faster than non-orphan genes, although the width of the evolutionary rate distribution is similar for both classes. In particular, some orphan genes show very low substitution rates, comparable to those of otherwise highly conserved genes. A general model for orphan gene evolution is proposed that takes these large rate differences into account and suggests that they are caused by episodic phases of fast and slow divergence. Besides the result that orphans are under-represented among genetically studied genes, additional findings suggest that orphan genes have less obvious phenotypes. For example, in the complete sample of the recovered cDNAs, a higher frequency of genetically studied genes was found among slowly evolving genes, which supports the proposed hypothesis that functionally more important genes with obvious phenotypes have lower evolutionary rates. Interestingly, such a relationship is lacking if only orphans are analysed. Additionally, orphans are over-represented among genes related to olfaction, hormonal activity, puparial adhesion, egg membrane structure, and perception of and response to abiotic stimuli. It is reasonable to expect all of these functions to be involved in specific ecological adaptations that change easily over time, and accordingly to have mutant phenotypes that are difficult to detect. Finally, a comparison between stages shows that the cDNA library from adults yields twice as many orphan genes as the one from embryos. An analysis of only those genes with stage-specific expression reveals a similar figure and, together with the lower evolutionary rate of embryo transcripts, suggests a higher constraint on the use of orphan genes in embryos. Furthermore, expression of embryo orphans is more often spatially restricted than that of a random sample of genes, which shows that they act in a localised rather than ubiquitous manner. Taken together, the general characteristics of orphan genes in Drosophila suggest that they may be involved in the evolution of adaptive traits and that slowly evolving orphan genes may be particularly interesting candidate genes for identifying lineage-specific adaptations.

    Phylostratigraphic profiles reveal a deep evolutionary history of the vertebrate head sensory systems

    Background: The vertebrate head is a highly derived trait with a heavy concentration of sophisticated sensory organs that allow complex behaviour in this lineage. The head sensory structures arise during vertebrate development from cranial placodes and the neural crest. It is generally thought that derivatives of these ectodermal embryonic tissues played a central role in the evolutionary transition at the onset of vertebrates. Despite the obvious importance of head sensory organs for vertebrate biology, their evolutionary history is still uncertain. Results: To give a fresh perspective on the adaptive history of the vertebrate head sensory organs, we applied genomic phylostratigraphy to large-scale in situ expression data of the developing zebrafish Danio rerio. Contrary to traditional predictions, we found that dominant adaptive signals in the analyzed sensory structures largely precede the evolutionary advent of vertebrates. The leading adaptive signals at the bilaterian-chordate transition suggested that the visual system was the first sensory structure to evolve. The olfactory, vestibuloauditory, and lateral line sensory organs displayed a strong link with the urochordate-vertebrate ancestor. The only structures that qualified as genuine vertebrate innovations were the neural crest derivatives, trigeminal ganglion and adenohypophysis. We also found evidence that the cranial placodes evolved before the neural crest despite their proposed embryological relatedness. Conclusions: Taken together, our findings reveal pre-vertebrate roots and a stepwise adaptive history of the vertebrate sensory systems. This study also underscores that large genomic and expression datasets are rich sources of macroevolutionary information that can be recovered by phylostratigraphic mining

    Broker Genes in Human Disease

    Genes that underlie human disease are important subjects of systems biology research. In the present study, we demonstrate that Mendelian and complex disease genes have distinct and consistent protein–protein interaction (PPI) properties. We show that five different network properties can be reduced to two independent metrics when applied to the human PPI network. These two metrics largely coincide with the degree (number of connections) and the clustering coefficient (the number of connections among the neighbors of a particular protein). We demonstrate that disease genes simultaneously have unusually high degree and unusually low clustering coefficient. Such genes can be described as brokers in that they connect many proteins that would not be connected otherwise. We show that these results are robust to the effects of gene age and variation in inspection bias. Notably, genes identified in genome-wide association studies (GWAS) have network patterns that are almost indistinguishable from those of nondisease genes and significantly different from those of complex disease genes identified through non-GWAS means. This suggests either that GWAS have focused on a distinct set of diseases associated with an unusual set of genes or that the mapping of GWAS-identified single nucleotide polymorphisms onto the causally affected neighboring genes is error prone.
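To make the two metrics in this abstract concrete, the following is a minimal sketch that computes the degree and the local clustering coefficient for every node of a small undirected network. It is an illustration only, not the authors' pipeline: the function name `degree_and_clustering`, the toy edge list, and the protein labels are hypothetical, and the clustering coefficient is assumed to be the usual normalised form (links among a node's neighbors divided by the number of possible neighbor pairs).

```python
from itertools import combinations


def degree_and_clustering(edges):
    """Return {node: (degree, local clustering coefficient)} for an undirected graph."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    stats = {}
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            # With fewer than two neighbors no neighbor pair exists; use 0.0 by convention.
            stats[node] = (k, 0.0)
            continue
        # Count the links that exist between pairs of this node's neighbors.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        stats[node] = (k, 2 * links / (k * (k - 1)))
    return stats


# Hypothetical toy network: HUB connects four proteins that do not interact
# with each other, whereas A, B and C form a fully connected triangle.
toy_edges = [
    ("HUB", "P1"), ("HUB", "P2"), ("HUB", "P3"), ("HUB", "P4"),
    ("A", "B"), ("B", "C"), ("A", "C"),
]
stats = degree_and_clustering(toy_edges)
print(stats["HUB"])  # degree 4, clustering 0.0
print(stats["A"])    # degree 2, clustering 1.0
```

In this toy network, HUB has the "broker" profile the abstract describes (many connections, none among its neighbors), while A sits in a tightly clustered group.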

    Relaxed Purifying Selection and Possibly High Rate of Adaptation in Primate Lineage-Specific Genes

    Genes in the same organism vary in the time since their evolutionary origin. Without horizontal gene transfer, young genes are necessarily restricted to a few closely related species, whereas old genes can be broadly distributed across the phylogeny. It has been shown that young genes evolve faster than old genes; however, the evolutionary forces responsible for this pattern remain obscure. Here, we classify human–chimp protein-coding genes into different age classes, according to the breadth of their phylogenetic distribution. We estimate the strength of purifying selection and the rate of adaptive selection for genes in different age classes. We find that older genes carry fewer and less frequent nonsynonymous single-nucleotide polymorphisms than younger genes, suggesting that older genes experience stronger purifying selection at the protein-coding level. We infer the distribution of fitness effects of new deleterious mutations and find that older genes have proportionally more slightly deleterious mutations and fewer nearly neutral mutations than younger genes. To investigate the role of adaptive selection of genes in different age classes, we determine the selection coefficient (γ = 2N_e s) of genes using the MKPRF approach and estimate the ratio of the rate of adaptive nonsynonymous substitution to synonymous substitution (ω_A) using the DoFE method. Although the proportion of positively selected genes (γ > 0) is significantly higher in younger genes, we find no correlation between ω_A and gene age. Collectively, these results provide strong evidence that younger genes are subject to weaker purifying selection and more tenuous evidence that they also undergo adaptive evolution more frequently.

    The life cycle of Drosophila orphan genes

    Orphans are genes restricted to a single phylogenetic lineage and emerge at high rates. While this predicts an accumulation of genes, the gene number has remained remarkably constant through evolution. This paradox has not yet been resolved. Because orphan genes have been mainly analyzed over long evolutionary time scales, orphan loss has remained unexplored. Here we study the patterns of orphan turnover among close relatives in the Drosophila obscura group. We show that orphans are not only emerging at a high rate, but that they are also rapidly lost. Interestingly, recently emerged orphans are more likely to be lost than older ones. Furthermore, highly expressed orphans with a strong male bias are more likely to be retained. Since both lost and retained orphans show similar evolutionary signatures of functional conservation, we propose that orphan loss is not driven by high rates of sequence evolution, but reflects lineage-specific functional requirements. Comment: 47 pages, 19 figures

    Similarly Strong Purifying Selection Acts on Human Disease Genes of All Evolutionary Ages

    A number of studies have shown that recently created genes differ in many respects from genes created in the deep evolutionary past. Here, we determined the age of emergence and propensity for gene loss (PGL) of all human protein-coding genes and compared disease genes with non-disease genes in terms of their evolutionary rate, strength of purifying selection, mRNA expression, and genetic redundancy. Older and less loss-prone non-disease genes have been evolving 1.5- to 3-fold slower between humans and chimps than young non-disease genes, whereas Mendelian disease genes have been evolving very slowly regardless of their age and PGL. Complex disease genes showed an intermediate pattern. Disease genes also have higher mRNA expression heterogeneity across multiple tissues than non-disease genes regardless of age and PGL. Young and middle-aged disease genes have fewer similar paralogs than non-disease genes of the same age. We reasoned that genes were more likely to be involved in human disease if they were under strong functional constraint, expressed heterogeneously across tissues, and lacked genetic redundancy. Young human genes that have been evolving under strong constraint between humans and chimps might also be enriched for genes that encode important primate-specific or even human-specific functions.

    CO-phylum: An Assembly-Free Phylogenomic Approach for Close Related Organisms

    Phylogenomic approaches developed thus far are either too time-consuming or lack a solid evolutionary basis. Moreover, no phylogenomic approach is capable of constructing a tree directly from unassembled raw sequencing data. A new phylogenomic method, CO-phylum, is developed to alleviate these flaws. CO-phylum can generate a high-resolution and highly accurate tree using complete genomes or unassembled sequencing data of closely related organisms. In addition, the CO-phylum distance is almost linear with the p-distance. Comment: 21 pages, 6 figures
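For readers unfamiliar with the p-distance mentioned in this abstract, the sketch below computes it in its standard sense: the proportion of compared sites at which two aligned sequences differ. This is not an implementation of CO-phylum, whose distance is not specified here. The function name `p_distance`, the decision to skip gap columns, and the example sequences are assumptions made for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption;
    other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x != y:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical aligned sequences: 2 mismatches over 8 compared sites.
print(p_distance("ACGTACGT", "ACGTTCGA"))  # 0.25
```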

    Network of Cancer Genes: a web resource to analyze duplicability, orthology and network properties of cancer genes

    The Network of Cancer Genes (NCG) collects and integrates data on 736 human genes that are mutated in various types of cancer. For each gene, NCG provides information on duplicability, orthology, evolutionary appearance and topological properties of the encoded protein in a comprehensive version of the human protein-protein interaction network. NCG also stores information on all primary interactors of cancer proteins, thus providing a complete overview of 5357 proteins that constitute direct and indirect determinants of human cancer. With the constant delivery of results from the mutational screenings of cancer genomes, NCG represents a versatile resource for retrieving detailed information on particular cancer genes, as well as for identifying common properties of precompiled lists of cancer genes. NCG is freely available at: http://bio.ifom-ieo-campus.it/ncg