20 research outputs found

    Estimating the rate of irreversibility in protein evolution

    Whether or not evolutionary change is inherently irreversible remains a controversial topic. Some examples of evolutionary irreversibility are known; however, this question has not been comprehensively addressed at the molecular level. Here, we use data from 221 human genes with known pathogenic mutations to estimate the rate of irreversibility in protein evolution. For these genes, we reconstruct ancestral amino acid sequences along the mammalian phylogeny and identify ancestral amino acid states that match known pathogenic mutations. Such cases represent inherent evolutionary irreversibility because, at the present moment, reversals to these ancestral amino acid states are impossible for the human lineage. We estimate that approximately 10% of all amino acid substitutions along the mammalian phylogeny are irreversible, such that a return to the ancestral amino acid state would lead to a pathogenic phenotype. For a subset of 51 genes with high rates of irreversibility, as much as 40% of all amino acid evolution was estimated to be irreversible. Because pathogenic phenotypes do not resemble ancestral phenotypes, the molecular nature of the high rate of irreversibility in proteins is best explained by evolution with a high prevalence of compensatory, epistatic interactions between amino acid sites. Under such a mode of protein evolution, once an amino acid substitution is fixed, the probability of its reversal declines as the protein sequence accumulates changes that affect the phenotypic manifestation of the ancestral state. The prevalence of epistasis in evolution indicates that the observed high rate of irreversibility in protein evolution is an inherent property of protein structure and function. This work was supported by Plan Nacional grant BFU2009-09271 from the Spanish Ministry of Science and Innovation and by FPU (Formación del Profesorado Universitario) program grant AP2008-01888 from the Spanish Ministry of Education to O.S.

    Rate of sequence divergence under constant selection

    BACKGROUND: Divergence of two independently evolving sequences that originated from a common ancestor can be described by two parameters, the asymptotic level of divergence E and the rate r at which this level of divergence is approached. Constant negative selection impedes allele replacements and, therefore, is routinely assumed to decelerate sequence divergence. However, its impact on E and on r has not been formally investigated. RESULTS: Strong selection that favors only one allele can make E arbitrarily small and r arbitrarily large. In contrast, in the case of 4 possible alleles and equal mutation rates, the lowest value of r, attained when two alleles confer equal fitnesses and the other two are strongly deleterious, is only two times lower than its value under selective neutrality. CONCLUSIONS: Constant selection can strongly constrain the level of sequence divergence, but cannot substantially reduce the rate at which this level is approached. In particular, under any constant selection the divergence of sequences that accumulated one substitution per neutral site since their origin from the common ancestor must already constitute at least one half of the asymptotic divergence at sites under such selection.
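The factor of two quoted in the abstract above can be illustrated with a minimal sketch. It is not the paper's model. It assumes a weak-mutation regime in which a site moves among k interchangeable alleles at an equal substitution rate `u` per ordered pair of alleles, and strongly deleterious alleles never fix. Under these assumptions, full neutrality corresponds to k = 4 and the constrained case to k = 2. The function names and the parameter `u` are hypothetical.

```python
import math


def level_and_rate(k, u):
    """Return (E, r) for a symmetric k-allele substitution model.

    Assumption (not from the source): each lineage is a continuous-time
    Markov chain with rate u between every ordered pair of the k allowed
    alleles. The expected divergence between two lineages that split
    t time units ago is then D(t) = E * (1 - exp(-r * t)).
    """
    level = 1.0 - 1.0 / k   # asymptotic divergence E
    rate = 2.0 * k * u      # approach rate r, summed over both lineages
    return level, rate


def divergence(k, u, t):
    """Expected fraction of differing sites at time t since the split."""
    level, rate = level_and_rate(k, u)
    return level * (1.0 - math.exp(-rate * t))


u = 1.0  # arbitrary time unit
E_neutral, r_neutral = level_and_rate(4, u)  # all four alleles equally fit
E_two, r_two = level_and_rate(2, u)          # two fit alleles, two never fix

print(E_neutral, r_neutral)  # asymptotic level and rate, neutral case
print(E_two, r_two)          # asymptotic level and rate, constrained case
print(r_neutral / r_two)     # ratio of approach rates
```

In this simplified setting, selection lowers the asymptotic level E from 3/4 to 1/2, while the approach rate r only halves.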

    Stop codons in bacteria are not selectively equivalent

    Background: The evolution and genomic frequencies of stop codons have not been rigorously studied, with the exception of the coding of non-canonical amino acids. Here we study the rate of evolution and frequency distribution of stop codons in bacterial genomes. Results: We show that in bacteria stop codons evolve slower than synonymous sites, suggesting the action of weak negative selection. However, the frequency of stop codons relative to genomic nucleotide content indicated that this selection regime is not straightforward. The frequency of TAA and TGA stop codons is GC-content dependent, with TAA decreasing and TGA increasing with GC-content, while TAG frequency is independent of GC-content. Applying a formal, analytical model to these data we found that the relationship between stop codon frequencies and nucleotide content cannot be explained by mutational biases or selection on nucleotide content. However, with weak nucleotide content-dependent selection on TAG, -0.5  16% TGA has a higher fitness than TAG. Conclusions: Our data indicate that the TAG codon is universally suboptimal in the bacterial lineage, such that TAA is likely to be the preferred stop codon for low GC content while TGA is the preferred stop codon for high GC content. The optimization of stop codon usage may therefore be useful in genome engineering or gene expression optimization applications. The work has been supported by a Plan Nacional grant from the Spanish Ministry of Science and Innovation, EMBO Young Investigator and Howard Hughes Medical Institute International Early Career Scientist awards.

    Two metagenomes from Late Pleistocene northeast Siberian permafrost

    The present study reports metagenomic shotgun sequencing of microbial communities of two ancient permafrost horizons of the Russian Arctic. Results demonstrate a significant difference in microbial community structure of the analyzed samples in general and microorganisms of the methane cycle in particular

    The ctenophore genome and the evolutionary origins of neural systems

    The origins of neural systems remain unresolved. In contrast to other basal metazoans, ctenophores (comb jellies) have both complex nervous and mesoderm-derived muscular systems. These holoplanktonic predators also have sophisticated ciliated locomotion, behaviour and distinct development. Here we present the draft genome of Pleurobrachia bachei, Pacific sea gooseberry, together with ten other ctenophore transcriptomes, and show that they are remarkably distinct from other animal genomes in their content of neurogenic, immune and developmental genes. Our integrative analyses place Ctenophora as the earliest lineage within Metazoa. This hypothesis is supported by comparative analysis of multiple gene families, including the apparent absence of HOX genes, canonical microRNA machinery, and reduced immune complement in ctenophores. Although two distinct nervous systems are well recognized in ctenophores, many bilaterian neuron-specific genes and genes of 'classical' neurotransmitter pathways either are absent or, if present, are not expressed in neurons. Our metabolomic and physiological data are consistent with the hypothesis that ctenophore neural systems, and possibly muscle specification, evolved independently from those in other animals. This work was supported by NSF (NSF-0744649 and NSF CNS-0821622 to L.L.M.; NSF CHE-1111705 to J.V.S.), NIH (1R01GM097502, R01MH097062, R21RR025699 and 5R21DA030118 to L.L.M.; P30 DA018310 to J.V.S.; R01 AG029360 and 1S10RR027052 to E.I.R.), NASA/nNNX13AJ31G (to K.M.H., L.L.M. and K.M.K.), NSERC 458115 and 211598 (J.P.R.), University of Florida Opportunity Funds/McKnight Brain Research and Florida Biodiversity Institute (L.L.M.), Rostock Inc./A.V. Chikunov (E.I.R.), grant from Russian Federation Government 14.B25.31.0033 (Resolution No.220) (E.I.R.). F.A.K., I.S.P. and R.D. were supported by HHMI (55007424), EMBO and MINECO (BFU2012-31329 and Sev-2012-0208). Contributions of AU Marine Biology Program 117 and Molette laboratory 22.

    Metagenomic analyses of the late Pleistocene permafrost – additional tools for reconstruction of environmental conditions

    A comparative analysis of the metagenomes from two 30 000-year-old permafrost samples, one of lake-alluvial origin and the other from late Pleistocene Ice Complex sediments, revealed significant differences within microbial communities. The late Pleistocene Ice Complex sediments (which have been characterized by the absence of methane with lower values of redox potential and Fe2+ content) showed a low abundance of methanogenic archaea and enzymes from both the carbon and nitrogen cycles, but a higher abundance of enzymes associated with the sulfur cycle. The metagenomic and geochemical analyses described in the paper provide evidence that the formation of the sampled late Pleistocene Ice Complex sediments likely took place under much more aerobic conditions than lake-alluvial sediments. This work was supported by grants from the Russian Scientific Fund (14-14-01115) to Elizaveta Rivkina; from the National Science Foundation (DEB-1442262) to Tatiana Vishnivetskaya; and from the HHMI International Early Career Scientist Program (55007424), the EMBO Young Investigator Programme, MINECO (BFU2012-31329 and Sev-2012-0208), and the AGAUR program (2014 SGR 0974) to Fyodor Kondrashov. Support from the Russian Scientific Fund (14-14-01115) was allocated for sample collection, gDNA isolation, and analysis of metagenomic data.

    Negative selection in tumor genome evolution acts on essential cellular functions and the immunopeptidome

    Background: Natural selection shapes cancer genomes. Previous studies used signatures of positive selection to identify genes driving malignant transformation. However, the contribution of negative selection against somatic mutations that affect essential tumor functions or specific domains remains a controversial topic. Results: Here, we analyze 7546 individual exomes from 26 tumor types from TCGA data to explore the portion of the cancer exome under negative selection. Although we find most of the genes neutrally evolving in a pan-cancer framework, we identify essential cancer genes and immune-exposed protein regions under significant negative selection. Moreover, our simulations suggest that the amount of negative selection is underestimated. We therefore choose an empirical approach to identify genes, functions, and protein regions under negative selection. We find that expression and mutation status of negatively selected genes is indicative of patient survival. Processes that are most strongly conserved are those that play fundamental cellular roles such as protein synthesis, glucose metabolism, and molecular transport. Intriguingly, we observe strong signals of selection in the immunopeptidome and proteins controlling peptide exposition, highlighting the importance of immune surveillance evasion. Additionally, tumor type-specific immune activity correlates with the strength of negative selection on human epitopes. Conclusions: In summary, our results show that negative selection is a hallmark of cell essentiality and immune response in cancer. The functional domains identified could be exploited therapeutically, ultimately allowing for the development of novel cancer treatments. The research leading to these results received funding from the Spanish Ministry of Economy, Industry and Competitiveness (Plan Nacional BIO2012-39754, BFU2012-31329 and BFU2015-68723-P and to the EMBL partnership), “Centro de Excelencia Severo Ochoa 2013–2017,” SEV-2012–0208, the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement nº. HEALTH-F4-2011–278568 (PRIMES), the European Fund for Regional Development (EFRD), European Union’s Horizon 2020 research and innovation programme under grant agreement Nº 635290 (PanCanRisk), CERCA Programme / Generalitat de Catalunya, the HHMI International Early Career Scientist Program (55007424), Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat’s AGAUR program (2014 SGR 0974), and the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013, ERC grant agreement 335980_EinME). LZ has been supported by the International PhD scholarship program of La Caixa at CRG and MS by the German Research Foundation (SCHA 1933/1-1).