14 research outputs found

    The organization of biological sequences into constrained and unconstrained parts determines fundamental properties of genotype-phenotype maps.

    Biological information is stored in DNA, RNA and protein sequences, which can be understood as genotypes that are translated into phenotypes. The properties of genotype-phenotype (GP) maps have been studied in great detail for RNA secondary structure. These include a highly biased distribution of genotypes per phenotype, negative correlation of genotypic robustness and evolvability, positive correlation of phenotypic robustness and evolvability, shape-space covering, and a roughly logarithmic scaling of phenotypic robustness with phenotypic frequency. More recently, similar properties have been discovered in other GP maps, suggesting that they may be fundamental to biological GP maps in general, rather than specific to the RNA secondary structure map. Here we propose that the above properties arise from the fundamental organization of biological information into 'constrained' and 'unconstrained' sequences, in the broadest possible sense. We describe as 'constrained' those sequences that affect the phenotype more immediately and are therefore more sensitive to mutations, such as protein-coding DNA or the stems in RNA secondary structure. 'Unconstrained' sequences, on the other hand, can mutate more freely without affecting the phenotype, such as intronic or intergenic DNA or the loops in RNA secondary structure. To test our hypothesis, we consider a highly simplified GP map that has genotypes with 'coding' and 'non-coding' parts. We term this the Fibonacci GP map, as it is equivalent to the Fibonacci code in information theory. Despite its simplicity, the Fibonacci GP map exhibits all the above properties of much more complex and biologically realistic GP maps. These properties are therefore likely to be fundamental to many biological GP maps. SEA was supported by The Royal Society. SFG was supported by the EPSRC. This is the final version of the article. It was first available from Royal Society Publishing via http://dx.doi.org/10.1098/rsif.2015.072
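
The abstract above does not spell out the construction of the Fibonacci GP map. The following is a minimal sketch under one assumption: that, as in the Fibonacci code, a binary genotype is read left to right and its 'coding' part ends at the first occurrence of `11`, with everything after it treated as 'non-coding'. The names `STOP`, `phenotype` and `phenotype_counts` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

STOP = "11"  # assumed terminator, borrowed from the Fibonacci code


def phenotype(genotype: str):
    """Return the coding prefix up to and including the first '11'.

    Genotypes that contain no '11' get the placeholder phenotype None.
    """
    end = genotype.find(STOP)
    return None if end == -1 else genotype[: end + len(STOP)]


def phenotype_counts(length: int) -> Counter:
    """Count how many binary genotypes of the given length map to each phenotype."""
    genotypes = ("".join(bits) for bits in product("01", repeat=length))
    return Counter(phenotype(g) for g in genotypes)


counts = phenotype_counts(6)
# Short coding parts leave more freely mutating sites, so they are more frequent.
for pheno, n in counts.most_common(4):
    print(pheno, n)
```

In this sketch a phenotype with a short coding part is realised by many genotypes, because every non-coding site can take either value; this is one simple way a strongly biased distribution of genotypes per phenotype can arise.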

    Genetic correlations greatly increase mutational robustness and can both reduce and enhance evolvability

    Mutational neighbourhoods in genotype-phenotype (GP) maps are widely believed to be more likely to share characteristics than expected from random chance. Such genetic correlations should strongly influence evolutionary dynamics. We explore and quantify these intuitions by comparing three GP maps (a model for RNA secondary structure, the HP model for protein tertiary structure, and the Polyomino model for protein quaternary structure) to a simple random null model that maintains the number of genotypes mapping to each phenotype, but assigns genotypes randomly. The mutational neighbourhood of a genotype in these GP maps is much more likely to contain genotypes mapping to the same phenotype than in the random null model. Such neutral correlations can be quantified by the robustness to mutations, which can be many orders of magnitude larger than that of the null model and, crucially, above the critical threshold for the formation of large neutral networks of mutationally connected genotypes, which enhance the capacity for the exploration of phenotypic novelty. Thus neutral correlations increase evolvability. We also study non-neutral correlations. Compared to the null model: i) if a particular (non-neutral) phenotype is found once in the 1-mutation neighbourhood of a genotype, then the chance of finding that phenotype multiple times in this neighbourhood is larger than expected; ii) if two genotypes are connected by a single neutral mutation, then their respective non-neutral 1-mutation neighbourhoods are more likely to be similar; iii) if a genotype maps to a folding or self-assembling phenotype, then its non-neutral neighbours are less likely to be a potentially deleterious non-folding or non-assembling phenotype. Non-neutral correlations of type i) and ii) reduce the rate at which new phenotypes can be found by neutral exploration, and so may diminish evolvability, while non-neutral correlations of type iii) may instead facilitate evolutionary exploration and so increase evolvability.
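
The comparison with a random null model described above can be sketched in a few lines. This is an illustration only, not the authors' code: the toy map (phenotype equals the first two bits of a 4-bit genotype) and the function names are assumptions. The null model keeps the number of genotypes per phenotype fixed and reassigns them at random, and robustness is taken as the fraction of 1-mutation neighbours that share the genotype's phenotype.

```python
import random
from collections import Counter
from itertools import product

ALPHABET = "01"


def neighbours(genotype):
    """Yield every genotype that differs from `genotype` at exactly one site."""
    for i, letter in enumerate(genotype):
        for other in ALPHABET:
            if other != letter:
                yield genotype[:i] + other + genotype[i + 1:]


def mean_robustness(gp_map):
    """Average fraction of 1-mutation neighbours with the same phenotype."""
    total = 0.0
    for genotype, phenotype in gp_map.items():
        nbrs = list(neighbours(genotype))
        total += sum(gp_map[n] == phenotype for n in nbrs) / len(nbrs)
    return total / len(gp_map)


def random_null_model(gp_map, seed=0):
    """Shuffle phenotypes over genotypes, preserving each phenotype's count."""
    rng = random.Random(seed)
    phenotypes = list(gp_map.values())
    rng.shuffle(phenotypes)
    return dict(zip(gp_map, phenotypes))


# Hypothetical toy map: the first two bits decide the phenotype.
toy_map = {"".join(b): "".join(b[:2]) for b in product(ALPHABET, repeat=4)}
null_map = random_null_model(toy_map)

print("toy map robustness: ", mean_robustness(toy_map))
print("null model robustness:", mean_robustness(null_map))
```

In the toy map, neutral neighbours are clustered because two of the four sites never affect the phenotype; shuffling destroys that clustering while leaving phenotype frequencies untouched, which is the contrast the abstract draws.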

    Beyond the Hypercube: Evolutionary Accessibility of Fitness Landscapes with Realistic Mutational Networks

    Evolutionary pathways describe trajectories of biological evolution in the space of different variants of organisms (genotypes). The probability of existence and the number of evolutionary pathways that lead from a given genotype to a better-adapted genotype are important measures of accessibility of local fitness optima and the reproducibility of evolution. Both quantities have been studied in simple mathematical models where genotypes are represented as binary sequences of two types of basic units, and the network of permitted mutations between the genotypes is a hypercube graph. However, it is unclear how these results translate to the biologically relevant case in which genotypes are represented by sequences of more than two units, for example four nucleotides (DNA) or 20 amino acids (proteins), and the mutational graph is not the hypercube. Here we investigate accessibility of the best-adapted genotype in the general case of K > 2 units. Using computer-generated and experimental fitness landscapes, we show that accessibility of the global fitness maximum increases with K and can be much higher than for binary sequences. The increase in accessibility comes from the increase in the number of indirect trajectories exploited by evolution for higher K. As one of the consequences, the fraction of genotypes that are accessible increases by three orders of magnitude when the number of units K increases from 2 to 16 for landscapes of size N ∼ 10^6 genotypes. This suggests that evolution can follow many different trajectories on such landscapes and the reconstruction of evolutionary pathways from experimental data might be an extremely difficult task.
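
The notion of accessibility used above can be made concrete with a small sketch. It assumes (this is an illustrative choice, not a detail taken from the paper) that a genotype counts as accessible when some path of single-site mutations with strictly increasing fitness leads from it to the global maximum, with indirect paths allowed. Genotypes are tuples over an alphabet of K letters; all names are hypothetical.

```python
import random
from itertools import product


def neighbours(genotype, alphabet):
    """Yield every genotype that differs from `genotype` at exactly one site."""
    for i, letter in enumerate(genotype):
        for other in alphabet:
            if other != letter:
                yield genotype[:i] + (other,) + genotype[i + 1:]


def accessible_fraction(fitness, alphabet):
    """Fraction of non-peak genotypes with a strictly uphill path to the peak.

    Searches backwards from the global maximum, stepping only to
    neighbours of lower fitness.
    """
    peak = max(fitness, key=fitness.get)
    reached = {peak}
    stack = [peak]
    while stack:
        current = stack.pop()
        for other in neighbours(current, alphabet):
            if other not in reached and fitness[other] < fitness[current]:
                reached.add(other)
                stack.append(other)
    return (len(reached) - 1) / (len(fitness) - 1)


alphabet = (0, 1, 2)  # K = 3 letters, sequences of length 3
additive = {g: sum(g) for g in product(alphabet, repeat=3)}  # smooth landscape
rng = random.Random(0)
rugged = {g: rng.random() for g in product(alphabet, repeat=3)}  # uncorrelated

print("additive:", accessible_fraction(additive, alphabet))
print("rugged:  ", accessible_fraction(rugged, alphabet))
```

Running the same function for different alphabet sizes on comparable landscapes is one way to explore how accessibility changes with K, although the toy landscapes here are far smaller than those discussed in the abstract.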

    A tractable genotype-phenotype map modelling the self-assembly of protein quaternary structure.

    The mapping between biological genotypes and phenotypes is central to the study of biological evolution. Here, we introduce a rich, intuitive and biologically realistic genotype-phenotype (GP) map that serves as a model of self-assembling biological structures, such as protein complexes, and remains computationally and analytically tractable. Our GP map arises naturally from the self-assembly of polyomino structures on a two-dimensional lattice and exhibits a number of properties: redundancy (genotypes vastly outnumber phenotypes), phenotype bias (genotypic redundancy varies greatly between phenotypes), genotype component disconnectivity (phenotypes consist of disconnected mutational networks) and shape space covering (most phenotypes can be reached in a small number of mutations). We also show that the mutational robustness of phenotypes scales very roughly logarithmically with phenotype redundancy and is positively correlated with phenotypic evolvability. Although our GP map describes the assembly of disconnected objects, it shares many properties with other popular GP maps for connected units, such as models for RNA secondary structure or the hydrophobic-polar (HP) lattice model for protein tertiary structure. The remarkable fact that these important properties similarly emerge from such different models suggests the possibility that universal features underlie a much wider class of biologically realistic GP maps.

    Birthweight and patterns of postnatal weight gain in very and extremely preterm babies in England and Wales, 2008-19: a cohort study

    BACKGROUND: Intrauterine and postnatal weight are widely regarded as biomarkers of fetal and neonatal wellbeing, but optimal weight gain following preterm birth is unknown. We aimed to describe changes over time in birthweight and postnatal weight gain in very and extremely preterm babies, in relation to major morbidity and healthy survival. METHODS: In this cohort study, we used whole-population data from the UK National Neonatal Research Database for infants below 32 weeks gestation admitted to neonatal units in England and Wales between Jan 1, 2008, and Dec 31, 2019. We used a non-linear Gaussian process to estimate monthly trends, and Bayesian multilevel regression to estimate unadjusted and adjusted coefficients. We evaluated birthweight; weight change from birth to 14 days; weight at 36 weeks postmenstrual age; associated Z scores; and longitudinal weights for babies surviving to 36 weeks postmenstrual age with and without major morbidities. We adjusted birthweight for antenatal, perinatal, and demographic variables. We additionally adjusted change in weight at 14 days and weight at 36 weeks postmenstrual age, and their Z scores, for postnatal variables. FINDINGS: The cohort comprised 90 817 infants. Over the 12-year period, mean differences adjusted for antenatal, perinatal, demographic, and postnatal variables were 0 g (95% compatibility interval -7 to 7) for birthweight (-0·01 [-0·05 to 0·03] for change in associated Z score); 39 g (26 to 51) for change in weight from birth to 14 days (0·14 [0·08 to 0·19] for change in associated Z score); and 105 g (81 to 128) for weight at 36 weeks postmenstrual age (0·27 [0·21 to 0·33] for change in associated Z score). Greater weight at 36 weeks postmenstrual age was robust to additional adjustment for enteral nutritional intake. In babies surviving without major morbidity, weight velocity in all gestational age groups stabilised at around 34 weeks postmenstrual age at 16-25 g per day along parallel percentile lines. INTERPRETATION: The birthweight of very and extremely preterm babies has remained stable over 12 years. Early postnatal weight loss has decreased, and subsequent weight gain has increased, but weight at 36 weeks postmenstrual age is consistently below birth percentile. In babies without major morbidity, weight velocity follows a consistent trajectory, offering opportunity to construct novel preterm growth curves despite lack of knowledge of optimal postnatal weight gain. FUNDING: UK Medical Research Council.

    Identification of variation in nutritional practice in neonatal units in England and association with clinical outcomes using agnostic machine learning

    We used agnostic, unsupervised machine learning to cluster a large clinical database of information on infants admitted to neonatal units in England. Our aim was to obtain insights into nutritional practice, an area of central importance in newborn care, utilising the UK National Neonatal Research Database (NNRD). We performed clustering on time-series data of daily nutritional intakes for very preterm infants born at a gestational age less than 32 weeks (n = 45,679) over a six-year period. This revealed 46 nutritional clusters heterogeneous in size, showing common interpretable clinical practices alongside rarer approaches. Nutritional clusters with similar admission profiles revealed associations between nutritional practice, geographical location and outcomes. We show how nutritional subgroups may be regarded as distinct interventions and tested for associations with measurable outcomes. We illustrate the potential for identifying relationships between nutritional practice and outcomes with two examples, discharge weight and bronchopulmonary dysplasia (BPD). We identify the well-known effect of formula milk on greater discharge weight, as well as support for the plausible but insufficiently evidenced view that human milk is protective against BPD. Our framework highlights the potential of agnostic machine learning approaches to deliver clinical practice insights and generate hypotheses using routine data.

    Changes in neonatal admissions, care processes and outcomes in England and Wales during the COVID-19 pandemic: a whole population cohort study

    Objectives: The COVID-19 pandemic instigated multiple societal and healthcare interventions with potential to affect perinatal practice. We evaluated population-level changes in preterm and full-term admissions to neonatal units, care processes and outcomes. Design: Observational cohort study using the UK National Neonatal Research Database. Setting: England and Wales. Participants: Admissions to National Health Service neonatal units from 2012 to 2020. Main outcome measures: Admissions by gestational age, ethnicity and Index of Multiple Deprivation, and key care processes and outcomes. Methods: We calculated differences in numbers and rates between April and June 2020 (spring), the first 3 months of national lockdown (COVID-19 period), and December 2019–February 2020 (winter), prior to introduction of mitigation measures, and compared them with the corresponding differences in the previous 7 years. We considered the COVID-19 period highly unusual if the spring–winter difference was smaller or larger than all previous corresponding differences, and calculated the level of confidence in this conclusion. Results: Marked fluctuations occurred in all measures over the 8 years, with several highly unusual changes during the COVID-19 period. Total admissions fell, having risen over all previous years (COVID-19 difference: −1492; previous 7-year difference range: +100, +1617; p<0.001); full-term black admissions rose (+66; −64, +35; p<0.001) whereas Asian (−137; −14, +101; p<0.001) and white (−319; −235, +643; p<0.001) admissions fell. Transfers to higher and lower designation neonatal units increased (+129; −4, +88; p<0.001) and decreased (−47; −25, +12; p<0.001), respectively. Total preterm admissions decreased (−350; −26, +479; p<0.001). The fall in extremely preterm admissions was most marked in the two lowest socioeconomic quintiles. Conclusions: Our findings indicate that substantial changes occurred in care pathways and clinical thresholds, with disproportionate effects on black ethnic groups, during the immediate COVID-19 period, and raise the intriguing possibility that non-healthcare interventions may reduce extremely preterm births.
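
The "highly unusual" criterion stated in the Methods above reduces to a range check, sketched here with figures quoted in the Results. The function name is hypothetical, only the reported minimum and maximum of the previous differences are used, and the accompanying confidence calculation is not reproduced because the abstract does not describe it.

```python
def is_highly_unusual(covid_difference, previous_differences):
    """True if the spring-winter difference lies outside all previous ones."""
    return (covid_difference < min(previous_differences)
            or covid_difference > max(previous_differences))


# (COVID-19 difference, reported range of the previous 7 yearly differences)
reported = {
    "total admissions": (-1492, (100, 1617)),
    "full-term black admissions": (66, (-64, 35)),
    "total preterm admissions": (-350, (-26, 479)),
}

for measure, (difference, previous_range) in reported.items():
    print(measure, is_highly_unusual(difference, previous_range))
```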

    Symmetry and simplicity spontaneously emerge from the algorithmic nature of evolution

    Engineers routinely design systems to be modular and symmetric in order to increase robustness to perturbations and to facilitate alterations at a later date. Biological structures also frequently exhibit modularity and symmetry, but the origin of such trends is much less well understood. It can be tempting to assume, by analogy to engineering design, that symmetry and modularity arise from natural selection. However, evolution, unlike engineers, cannot plan ahead, and so these traits must also afford some immediate selective advantage, which is hard to reconcile with the breadth of systems where symmetry is observed. Here we introduce an alternative nonadaptive hypothesis based on an algorithmic picture of evolution. It suggests that symmetric structures preferentially arise not just due to natural selection but also because they require less specific information to encode and are therefore much more likely to appear as phenotypic variation through random mutations. Arguments from algorithmic information theory can formalize this intuition, leading to the prediction that many genotype–phenotype maps are exponentially biased toward phenotypes with low descriptional complexity. A preference for symmetry is a special case of this bias toward compressible descriptions. We test these predictions with extensive biological data, showing that protein complexes, RNA secondary structures, and a model gene regulatory network all exhibit the expected exponential bias toward simpler (and more symmetric) phenotypes. Lower descriptional complexity also correlates with higher mutational robustness, which may aid the evolution of complex modular assemblies of multiple components.

    Human genome variation and the concept of genotype networks

    Genotype networks are a method used in systems biology to study the "innovability" of a set of genotypes having the same phenotype. In the past they have been applied to determine the genetic heterogeneity, and stability to mutations, of systems such as metabolic networks and RNA folds. Recently, they have been the basis for reconciling the neutralist and selectionist schools of evolution. Here, we adapted the concept of genotype networks to the study of population genetics data, applying them to the 1000 Genomes dataset. We used networks composed of short haplotypes of Single Nucleotide Variants (SNV), and defined phenotypes as the presence or absence of a haplotype in a human population. We used coalescent simulations to determine if the number of samples in the 1000 Genomes dataset is large enough to represent the genetic variation of real populations. The result is a scan of how properties related to the genetic heterogeneity and stability to mutations are distributed along the human genome. We found that genes involved in acquired immunity, such as some HLA and MHC genes, tend to have the most heterogeneous and connected networks. We also found a small but significant difference between networks of coding regions and those of non-coding regions, suggesting that coding regions are both richer in genotype diversity and more stable to mutations. Together, the work presented here may constitute a starting point for applying genotype networks to study genome variation as larger datasets of next-generation data become available.