Historical contingency and entrenchment in protein evolution under purifying selection
The fitness contribution of an allele at one genetic site may depend on
alleles at other sites, a phenomenon known as epistasis. Epistasis can
profoundly influence the process of evolution in populations under selection,
and can shape the course of protein evolution across divergent species. Whereas
epistasis between adaptive substitutions has been the subject of extensive
study, relatively little is known about epistasis under purifying selection.
Here we use mechanistic models of thermodynamic stability in a ligand-binding
protein to explore the structure of epistatic interactions between
substitutions that fix in protein sequences under purifying selection. We find
that the selection coefficients of mutations that are nearly-neutral when they
fix are highly contingent on the presence of preceding mutations. Conversely,
mutations that are nearly-neutral when they fix are subsequently entrenched due
to epistasis with later substitutions. Our evolutionary model includes
insertions and deletions, as well as point mutations, and so it allows us to
quantify epistasis within each of these classes of mutations, and also to study
the evolution of protein length. We find that protein length remains largely
constant over time, because indels are more deleterious than point mutations.
Our results imply that, even under purifying selection, protein sequence
evolution is highly contingent on history and so it cannot be predicted by the
phenotypic effects of mutations assayed in the wild-type sequence.
Comment: 42 pages, 13 figures
The inevitability of unconditionally deleterious substitutions during adaptation
Studies on the genetics of adaptation typically neglect the possibility that
a deleterious mutation might fix. Nonetheless, here we show that, in many
regimes, the first substitution is most often deleterious, even when fitness is
expected to increase in the long term. In particular, we prove that this
phenomenon occurs under weak mutation for any house-of-cards model with an
equilibrium distribution. We find that the same qualitative results hold under
Fisher's geometric model. We also provide a simple intuition for the surprising
prevalence of unconditionally deleterious substitutions during early
adaptation. Importantly, the phenomenon we describe occurs on fitness
landscapes without any local maxima and is therefore distinct from
"valley-crossing". Our results imply that the common practice of ignoring
deleterious substitutions leads to qualitatively incorrect predictions in many
regimes. Our results also have implications for the substitution process at
equilibrium and for the response to a sudden decrease in population size.
Comment: Corrected typos and minor errors in Supporting Information
Transcriptional errors and the drift barrier
Population genetics predicts that the balance between natural selection and genetic drift is determined by the population size. Species with large population sizes are predicted to have properties governed mainly by selective forces, whereas species with small population sizes should exhibit features governed by mutational processes alone. This “drift-barrier hypothesis” has been successful in explaining extensive variation in genome size, mutation rate, transposable element abundance, and other molecular features across diverse taxa (1–3). However, in PNAS, Traverse and Ochman (4) report a striking exception to this theory by showing that transcriptional error rates are nearly equal across several bacterial species with very different population sizes.
Heavy metals contaminating the environment of a progressive supranuclear palsy cluster induce tau accumulation and cell death in cultured neurons
Progressive supranuclear palsy (PSP) is a neurodegenerative disorder characterized by the presence of intracellular aggregates of tau protein and neuronal loss leading to cognitive and motor impairment. Occurrence is mostly sporadic, but rare family clusters have been described. Although the etiopathology of PSP is unknown, mutations in the MAPT/tau gene and exposure to environmental toxins can increase the risk of PSP. Here, we used cell models to investigate the potential neurotoxic effects of heavy metals enriched in a highly industrialized region in France with a cluster of sporadic PSP cases. We found that iPSC-derived iNeurons from a MAPT mutation carrier tend to be more sensitive to cell death induced by chromium (Cr) and nickel (Ni) exposure than an isogenic control line. We hypothesize that genetic variations may predispose to neurodegeneration induced by those heavy metals. Furthermore, using an SH-SY5Y neuroblastoma cell line, we showed that both heavy metals induce cell death by an apoptotic mechanism. Interestingly, Cr and Ni treatments increased total and phosphorylated tau levels in both cell types, implicating Cr and Ni exposure in tau pathology. Overall, this study suggests that chromium and nickel could contribute to the pathophysiology of tauopathies such as PSP by promoting tau accumulation and neuronal cell death.
Epistasis not needed to explain low dN/dS
An important question in molecular evolution is whether an amino acid that
occurs at a given position makes an independent contribution to fitness, or
whether its effect depends on the state of other loci in the organism's genome,
a phenomenon known as epistasis. In a recent letter to Nature, Breen et al.
(2012) argued that epistasis must be "pervasive throughout protein evolution"
because the observed ratio between the per-site rates of non-synonymous and
synonymous substitutions (dN/dS) is much lower than would be expected in the
absence of epistasis. However, when calculating the expected dN/dS ratio in the
absence of epistasis, Breen et al. assumed that all amino acids observed in a
protein alignment at any particular position have equal fitness. Here, we relax
this unrealistic assumption and show that any dN/dS value can in principle be
achieved at a site, without epistasis. Furthermore, for all nuclear and
chloroplast genes in the Breen et al. dataset, we show that the observed dN/dS
values and the observed patterns of amino acid diversity at each site are
jointly consistent with a non-epistatic model of protein evolution.
Comment: This manuscript is in response to "Epistasis as the primary factor in
molecular evolution" by Breen et al. Nature 490, 535-538 (2012)
MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect
Multiplex assays of variant effect (MAVEs) are a family of methods that includes deep mutational scanning experiments on proteins and massively parallel reporter assays on gene regulatory sequences. Despite their increasing popularity, a general strategy for inferring quantitative models of genotype-phenotype maps from MAVE data is lacking. Here we introduce MAVE-NN, a neural-network-based Python package that implements a broadly applicable information-theoretic framework for learning genotype-phenotype maps, including biophysically interpretable models, from MAVE datasets. We demonstrate MAVE-NN in multiple biological contexts, and highlight the ability of our approach to deconvolve mutational effects from otherwise confounding experimental nonlinearities and noise.
Higher-order epistasis and phenotypic prediction
Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype-phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype-phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.
Predicting evolution and visualizing high-dimensional fitness landscapes
The tempo and mode of an adaptive process is strongly determined by the
structure of the fitness landscape that underlies it. In order to be able to
predict evolutionary outcomes (even on the short term), we must know more about
the nature of realistic fitness landscapes than we do today. For example, in
order to know whether evolution is predominantly taking paths that move upwards
in fitness and along neutral ridges, or else entails a significant number of
valley crossings, we need to be able to visualize these landscapes: we must
determine whether there are peaks in the landscape, where these peaks are
located with respect to one another, and whether evolutionary paths can connect
them. This is a difficult task because genetic fitness landscapes (as opposed
to those based on traits) are high-dimensional, and tools for visualizing such
landscapes are lacking. In this contribution, we focus on the predictability of
evolution on rugged genetic fitness landscapes, and determine that peaks in
such landscapes are highly clustered: high peaks are predominantly close to
other high peaks. As a consequence, the valleys separating such peaks are
shallow and narrow, such that evolutionary trajectories towards the highest
peak in the landscape can be achieved via a series of valley crossings.
Comment: 12 pages, 7 figures. To appear in "Recent Advances in the Theory and
Application of Fitness Landscapes" (A. Engelbrecht and H. Richter, eds.).
Springer Series in Emergence, Complexity, and Computation, 201