165 research outputs found

    Historical contingency and entrenchment in protein evolution under purifying selection

    The fitness contribution of an allele at one genetic site may depend on alleles at other sites, a phenomenon known as epistasis. Epistasis can profoundly influence the process of evolution in populations under selection, and can shape the course of protein evolution across divergent species. Whereas epistasis between adaptive substitutions has been the subject of extensive study, relatively little is known about epistasis under purifying selection. Here we use mechanistic models of thermodynamic stability in a ligand-binding protein to explore the structure of epistatic interactions between substitutions that fix in protein sequences under purifying selection. We find that the selection coefficients of mutations that are nearly-neutral when they fix are highly contingent on the presence of preceding mutations. Conversely, mutations that are nearly-neutral when they fix are subsequently entrenched due to epistasis with later substitutions. Our evolutionary model includes insertions and deletions, as well as point mutations, and so it allows us to quantify epistasis within each of these classes of mutations, and also to study the evolution of protein length. We find that protein length remains largely constant over time, because indels are more deleterious than point mutations. Our results imply that, even under purifying selection, protein sequence evolution is highly contingent on history and so it cannot be predicted by the phenotypic effects of mutations assayed in the wild-type sequence. Comment: 42 pages, 13 figures

    The inevitability of unconditionally deleterious substitutions during adaptation

    Studies on the genetics of adaptation typically neglect the possibility that a deleterious mutation might fix. Nonetheless, here we show that, in many regimes, the first substitution is most often deleterious, even when fitness is expected to increase in the long term. In particular, we prove that this phenomenon occurs under weak mutation for any house-of-cards model with an equilibrium distribution. We find that the same qualitative results hold under Fisher's geometric model. We also provide a simple intuition for the surprising prevalence of unconditionally deleterious substitutions during early adaptation. Importantly, the phenomenon we describe occurs on fitness landscapes without any local maxima and is therefore distinct from "valley-crossing". Our results imply that the common practice of ignoring deleterious substitutions leads to qualitatively incorrect predictions in many regimes. Our results also have implications for the substitution process at equilibrium and for the response to a sudden decrease in population size. Comment: Corrected typos and minor errors in Supporting Information

    Transcriptional errors and the drift barrier

    Population genetics predicts that the balance between natural selection and genetic drift is determined by the population size. Species with large population sizes are predicted to have properties governed mainly by selective forces, whereas species with small population sizes should exhibit features governed by mutational processes alone. This “drift-barrier hypothesis” has been successful in explaining extensive variation in genome size, mutation rate, transposable element abundance, and other molecular features across diverse taxa (1–3). However, in PNAS Traverse and Ochman (4) report a striking exception to this theory by showing that transcriptional error rates are nearly equal across several bacterial species with very different population sizes.

    Heavy metals contaminating the environment of a progressive supranuclear palsy cluster induce tau accumulation and cell death in cultured neurons

    Progressive supranuclear palsy (PSP) is a neurodegenerative disorder characterized by the presence of intracellular aggregates of tau protein and neuronal loss leading to cognitive and motor impairment. Occurrence is mostly sporadic, but rare family clusters have been described. Although the etiopathology of PSP is unknown, mutations in the MAPT/tau gene and exposure to environmental toxins can increase the risk of PSP. Here, we used cell models to investigate the potential neurotoxic effects of heavy metals enriched in a highly industrialized region in France with a cluster of sporadic PSP cases. We found that iPSC-derived iNeurons from a MAPT mutation carrier tend to be more sensitive to cell death induced by chromium (Cr) and nickel (Ni) exposure than an isogenic control line. We hypothesize that genetic variations may predispose to neurodegeneration induced by those heavy metals. Furthermore, using an SH-SY5Y neuroblastoma cell line, we showed that both heavy metals induce cell death by an apoptotic mechanism. Interestingly, Cr and Ni treatments increased total and phosphorylated tau levels in both cell types, implicating Cr and Ni exposure in tau pathology. Overall, this study suggests that chromium and nickel could contribute to the pathophysiology of tauopathies such as PSP by promoting tau accumulation and neuronal cell death.

    Epistasis not needed to explain low dN/dS

    An important question in molecular evolution is whether an amino acid that occurs at a given position makes an independent contribution to fitness, or whether its effect depends on the state of other loci in the organism's genome, a phenomenon known as epistasis. In a recent letter to Nature, Breen et al. (2012) argued that epistasis must be "pervasive throughout protein evolution" because the observed ratio between the per-site rates of non-synonymous and synonymous substitutions (dN/dS) is much lower than would be expected in the absence of epistasis. However, when calculating the expected dN/dS ratio in the absence of epistasis, Breen et al. assumed that all amino acids observed in a protein alignment at any particular position have equal fitness. Here, we relax this unrealistic assumption and show that any dN/dS value can in principle be achieved at a site, without epistasis. Furthermore, for all nuclear and chloroplast genes in the Breen et al. dataset, we show that the observed dN/dS values and the observed patterns of amino acid diversity at each site are jointly consistent with a non-epistatic model of protein evolution. Comment: This manuscript is in response to "Epistasis as the primary factor in molecular evolution" by Breen et al. Nature 490, 535-538 (2012)

    MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect

    Multiplex assays of variant effect (MAVEs) are a family of methods that includes deep mutational scanning experiments on proteins and massively parallel reporter assays on gene regulatory sequences. Despite their increasing popularity, a general strategy for inferring quantitative models of genotype-phenotype maps from MAVE data is lacking. Here we introduce MAVE-NN, a neural-network-based Python package that implements a broadly applicable information-theoretic framework for learning genotype-phenotype maps, including biophysically interpretable models, from MAVE datasets. We demonstrate MAVE-NN in multiple biological contexts, and highlight the ability of our approach to deconvolve mutational effects from otherwise confounding experimental nonlinearities and noise.

    Higher-order epistasis and phenotypic prediction

    Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype-phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype-phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.
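
    As a rough illustration of the idea that phenotypic correlations change with mutational distance, the sketch below computes that correlation on a small, fully enumerated binary landscape. It is a toy example under stated assumptions, not the authors' implementation: the function name `distance_correlation`, the two-allele encoding, the sequence length, and the random additive effects are all hypothetical, and the method described in the abstract works from partially observed data rather than a complete landscape.

```python
from itertools import combinations, product

import numpy as np


def distance_correlation(phenotype, L):
    """Correlation between phenotypes of genotype pairs at each Hamming distance.

    `phenotype` maps every binary genotype (tuple of 0/1, length L) to a number.
    Returns a list rho where rho[d] is the correlation at distance d.
    """
    genotypes = list(product((0, 1), repeat=L))
    values = np.array([phenotype[g] for g in genotypes])
    mean, var = values.mean(), values.var()
    rho = []
    for d in range(L + 1):
        total, count = 0.0, 0
        # Enumerate every ordered pair (g, h) that differs at exactly d sites.
        for sites in combinations(range(L), d):
            for g in genotypes:
                h = tuple(1 - a if i in sites else a for i, a in enumerate(g))
                total += (phenotype[g] - mean) * (phenotype[h] - mean)
                count += 1
        rho.append(total / (count * var))
    return rho


# Hypothetical purely additive landscape: each site contributes independently.
L = 4
rng = np.random.default_rng(0)
effects = rng.normal(size=L)
additive = {g: float(np.dot(effects, g)) for g in product((0, 1), repeat=L)}
rho = distance_correlation(additive, L)
print([round(r, 3) for r in rho])
```

    For a purely additive two-allele landscape the correlation falls linearly with distance, as 1 - 2d/L. Curvature away from that line is the kind of signal that indicates pairwise and higher-order interactions.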

    Predicting evolution and visualizing high-dimensional fitness landscapes

    The tempo and mode of an adaptive process is strongly determined by the structure of the fitness landscape that underlies it. In order to be able to predict evolutionary outcomes (even on the short term), we must know more about the nature of realistic fitness landscapes than we do today. For example, in order to know whether evolution is predominantly taking paths that move upwards in fitness and along neutral ridges, or else entails a significant number of valley crossings, we need to be able to visualize these landscapes: we must determine whether there are peaks in the landscape, where these peaks are located with respect to one another, and whether evolutionary paths can connect them. This is a difficult task because genetic fitness landscapes (as opposed to those based on traits) are high-dimensional, and tools for visualizing such landscapes are lacking. In this contribution, we focus on the predictability of evolution on rugged genetic fitness landscapes, and determine that peaks in such landscapes are highly clustered: high peaks are predominantly close to other high peaks. As a consequence, the valleys separating such peaks are shallow and narrow, such that evolutionary trajectories towards the highest peak in the landscape can be achieved via a series of valley crossings. Comment: 12 pages, 7 figures. To appear in "Recent Advances in the Theory and Application of Fitness Landscapes" (A. Engelbrecht and H. Richter, eds.). Springer Series in Emergence, Complexity, and Computation, 201