Simon-Ando decomposability and fitness landscapes
In this paper, we investigate fitness landscapes (under point mutation and recombination) from the standpoint of whether the induced evolutionary dynamics have a “fast-slow” time scale associated with the differences in relaxation time between local quasi-equilibria and the global equilibrium. This dynamical behavior has been formally described in the econometrics literature in terms of the spectral properties of the appropriate operator matrices by Simon and Ando (Econometrica 29 (1961) 111), and we use the relations they derive to ask which fitness functions and mutation/recombination operators satisfy these properties. It turns out that quite a wide range of landscapes satisfy the condition (at least trivially) under point mutation given a sufficiently low mutation rate, while the property appears to be difficult to satisfy under genetic recombination. In spite of the fact that Simon-Ando decomposability can be realized over a fairly wide range of parameters, it imposes a number of restrictions on which landscape partitionings are possible. For these reasons, the Simon-Ando formalism does not appear to be applicable to other forms of decomposition and aggregation of variables that are important in evolutionary systems.
Exact Results for Amplitude Spectra of Fitness Landscapes
Starting from fitness correlation functions, we calculate exact expressions
for the amplitude spectra of fitness landscapes as defined by P.F. Stadler [J.
Math. Chem. 20, 1 (1996)] for common landscape models, including Kauffman's
NK-model, rough Mount Fuji landscapes and general linear superpositions of such
landscapes. We further show that correlations decaying exponentially with the
Hamming distance yield exponentially decaying spectra similar to those reported
recently for a model of molecular signal transduction. Finally, we compare our
results for the model systems to the spectra of various experimentally measured
fitness landscapes. We claim that our analytical results should be helpful when
trying to interpret empirical data and guide the search for improved fitness
landscape models. Comment: 13 pages, 5 figures; revised and final version
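As a rough illustration of what an amplitude spectrum measures, the sketch below computes it by brute force for a small landscape on binary genotypes. It assumes the usual normalization, in which the amplitude of order p is the summed squared Walsh coefficients of order p divided by the total over all orders p >= 1. The function names and the toy landscape are hypothetical and not taken from the paper, which derives closed-form expressions rather than enumerating genotypes.

```python
from itertools import product


def walsh_coefficients(fitness, n):
    """Walsh coefficients a_k = 2^-n * sum_x f(x) * (-1)^(k.x).

    `fitness` maps each binary tuple of length n to a number.
    Brute force: O(4^n), so only suitable for small n.
    """
    genotypes = list(product((0, 1), repeat=n))
    coeffs = {}
    for k in genotypes:
        total = 0.0
        for x in genotypes:
            parity = sum(ki * xi for ki, xi in zip(k, x)) % 2
            total += -fitness[x] if parity else fitness[x]
        coeffs[k] = total / 2 ** n
    return coeffs


def amplitude_spectrum(fitness, n):
    """Return [B_1, ..., B_n]: the share of non-constant variance at each
    interaction order. Raises ZeroDivisionError for a flat landscape."""
    power = [0.0] * (n + 1)
    for k, a in walsh_coefficients(fitness, n).items():
        power[sum(k)] += a * a  # order of k = number of loci involved
    norm = sum(power[1:])
    return [p / norm for p in power[1:]]


# Toy landscape (an assumption for illustration): f(x) = x0 * x1 on two loci.
toy = {x: float(x[0] * x[1]) for x in product((0, 1), repeat=2)}
toy_spectrum = amplitude_spectrum(toy, 2)  # approximately [2/3, 1/3]
```

A purely additive landscape puts all of its amplitude in the first order, so weight at higher orders is a direct indicator of epistatic structure.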
Fundamental Properties of the Evolution of Mutational Robustness
Evolution on neutral networks of genotypes has been found in models to
concentrate on genotypes with high mutational robustness, to a degree
determined by the topology of the network. Here analysis is generalized beyond
neutral networks to arbitrary selection and parent-offspring transmission. In
this larger realm, geometric features determine mutational robustness: the
alignment of fitness with the orthogonalized eigenvectors of the mutation
matrix weighted by their eigenvalues. "House of cards" mutation is found to
preclude the evolution of mutational robustness. Genetic load is shown to
increase with increasing mutation in arbitrary single and multiple locus
fitness landscapes. The rate of decrease in population fitness can never grow
as mutation rates get higher, showing that "error catastrophes" for genotype
frequencies never cause precipitous losses of population fitness. The
"inclusive inheritance" approach taken here naturally extends these results to
a new concept of dispersal robustness. Comment: 17 pages, 1 figure
Epistasis in a Model of Molecular Signal Transduction
Biological functions typically involve complex interacting molecular networks, with numerous feedback and regulation loops. How the properties of the system are affected when one or several of its parts are modified is a question of fundamental interest, with numerous implications for the way we study and understand biological processes and treat diseases. This question can be rephrased in terms of relating genotypes to phenotypes: to what extent does the effect of a genetic variation at one locus depend on genetic variation at all other loci? Systematic quantitative measurements of epistasis – the deviation from additivity in the effect of alleles at different loci – on a given quantitative trait remain a major challenge. Here, we take a complementary approach of studying theoretically the effect of varying multiple parameters in a validated model of molecular signal transduction. To connect with the genotype/phenotype mapping we interpret parameters of the model as different loci with discrete choices of these parameters as alleles, which allows us to systematically examine the dependence of the signaling output – a quantitative trait – on the set of possible allelic combinations. We show quite generally that quantitative traits behave approximately additively (weak epistasis) when alleles correspond to small changes of parameters; epistasis appears as a result of large differences between alleles. When epistasis is relatively strong, it is concentrated in a sparse subset of loci and in low order (e.g. pair-wise) interactions. We find that focusing on interaction between loci that exhibit strong additive effects is an efficient way of identifying most of the epistasis. Our model study defines a theoretical framework for interpretation of experimental data and provides statistical predictions for the structure of genetic interaction expected for moderately complex biological circuits.
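The claim that small parameter changes give nearly additive traits while large changes produce epistasis can be illustrated with a minimal sketch. The saturating response `toy_output` below is a hypothetical stand-in chosen for illustration; it is not the validated signal transduction model studied in the paper, and all names and parameter values are assumptions.

```python
def pairwise_epistasis(f00, f10, f01, f11):
    """Deviation from additivity for two biallelic loci:
    the double-mutant effect minus the sum of the single-mutant effects."""
    return (f11 - f00) - (f10 - f00) - (f01 - f00)


def toy_output(a, b):
    """Hypothetical saturating trait: two parameters ("loci") act
    multiplicatively on a signal that then saturates."""
    signal = a * b
    return signal / (1.0 + signal)


def relative_epistasis(delta, base=0.5):
    """Epistasis relative to the summed single-locus effects when each
    allele shifts its parameter from `base` to `base + delta`."""
    f00 = toy_output(base, base)
    f10 = toy_output(base + delta, base)
    f01 = toy_output(base, base + delta)
    f11 = toy_output(base + delta, base + delta)
    additive = abs(f10 - f00) + abs(f01 - f00)
    return abs(pairwise_epistasis(f00, f10, f01, f11)) / additive


small_step = relative_epistasis(0.01)  # alleles differ slightly: close to additive
large_step = relative_epistasis(5.0)   # alleles differ strongly: sizeable epistasis
```

In this toy setting the relative epistasis grows with the size of the allelic difference because the mixed second-order term of the response only becomes comparable to the first-order terms for large steps.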
Higher-order interactions in fitness landscapes are sparse
Biological fitness arises from interactions between molecules, genes, and
organisms. To discover the causative mechanisms of this complexity, we must
differentiate the significant interactions from a large number of
possibilities. Epistasis is the standard way to identify interactions in
fitness landscapes. However, this intuitive approach breaks down in higher
dimensions, for example because the sign of epistasis takes on an arbitrary
meaning, and the false discovery rate becomes high. These limitations make it
difficult to evaluate the role of epistasis in higher dimensions. Here we
develop epistatic filtrations, a dimensionally-normalized approach to define
fitness landscape topography for higher dimensional spaces. We apply the method
to higher-dimensional datasets from genetics and the gut microbiome. This
reveals a sparse higher-order structure that often arises from lower-order.
Despite sparsity, these higher-order effects carry significant effects on
biological fitness and are consequential for ecology and evolution. Comment: 71 pages, various figures
Contrastive losses as generalized models of global epistasis
Fitness functions map large combinatorial spaces of biological sequences to
properties of interest. Inferring these multimodal functions from experimental
data is a central task in modern protein engineering. Global epistasis models
are an effective and physically-grounded class of models for estimating fitness
functions from observed data. These models assume that a sparse latent function
is transformed by a monotonic nonlinearity to emit measurable fitness. Here we
demonstrate that minimizing contrastive loss functions, such as the
Bradley-Terry loss, is a simple and flexible technique for extracting the
sparse latent function implied by global epistasis. We argue by way of a
fitness-epistasis uncertainty principle that the nonlinearities in global
epistasis models can produce observed fitness functions that do not admit
sparse representations, and thus may be inefficient to learn from observations
when using a Mean Squared Error (MSE) loss (a common practice). We show that
contrastive losses are able to accurately estimate a ranking function from
limited data even in regimes where MSE is ineffective. We validate the
practical utility of this insight by showing contrastive loss functions result
in consistently improved performance on benchmark tasks