139 research outputs found
HyperTraPS: Inferring probabilistic patterns of trait acquisition in evolutionary and disease progression pathways
The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalisable statistical platform to infer the dynamic pathways by which many, potentially interacting, discrete traits are acquired or lost over time in biomedical systems. The platform uses HyperTraPS (hypercubic transition path sampling) to learn progression pathways from cross-sectional, longitudinal, or phylogenetically linked data with unprecedented efficiency, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. Its Bayesian structure quantifies uncertainty in pathway structure and allows interpretable predictions of behaviours, such as which symptom a patient will acquire next. We exploit the model’s topology to provide visualisation tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.
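As a rough illustration of the hypercubic state space this abstract describes, the sketch below samples trait-acquisition pathways as random walks over binary states. The rate parameterisation (a base log-rate per trait plus pairwise influences, held in a hypothetical `log_rates` matrix) and all function names are assumptions made for illustration. It covers acquisition only, not loss, and it is not the HyperTraPS implementation or its inference procedure.

```python
import math
import random


def acquisition_weights(state, log_rates):
    """Relative weight for acquiring each trait that is not yet present.

    Assumed (hypothetical) parameterisation: log_rates[i][i] is the base
    log-rate of trait i, and log_rates[j][i] is the influence of an
    already-acquired trait j on the acquisition of trait i.
    """
    weights = {}
    for i, has_i in enumerate(state):
        if has_i:
            continue  # traits are only acquired in this sketch, never lost
        log_w = log_rates[i][i]
        log_w += sum(
            log_rates[j][i]
            for j, has_j in enumerate(state)
            if has_j and j != i
        )
        weights[i] = math.exp(log_w)
    return weights


def sample_pathway(log_rates, rng):
    """Sample one path from state 00...0 to 11...1, one trait per step."""
    n = len(log_rates)
    state = [0] * n
    order = []
    while len(order) < n:
        weights = acquisition_weights(state, log_rates)
        traits = list(weights)
        chosen = rng.choices(traits, weights=[weights[t] for t in traits])[0]
        state[chosen] = 1
        order.append(chosen)
    return order


def first_step_frequencies(log_rates, n_samples=2000, seed=0):
    """Estimate how often each trait is the first one acquired."""
    rng = random.Random(seed)
    counts = [0] * len(log_rates)
    for _ in range(n_samples):
        counts[sample_pathway(log_rates, rng)[0]] += 1
    return [c / n_samples for c in counts]


# Hypothetical example: trait 0 has a raised base rate, no interactions.
example_rates = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(first_step_frequencies(example_rates))
```

Each sampled pathway is an ordering of the traits, so summarising many samples (here, by first-acquired trait) gives a crude picture of which routes through the hypercube dominate under the assumed rates.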
Genetic correlations greatly increase mutational robustness and can both reduce and enhance evolvability
Mutational neighbourhoods in genotype-phenotype (GP) maps are widely believed to be more likely to share characteristics than expected from random chance. Such genetic correlations should strongly influence evolutionary dynamics. We explore and quantify these intuitions by comparing three GP maps—a model for RNA secondary structure, the HP model for protein tertiary structure, and the Polyomino model for protein quaternary structure—to a simple random null model that maintains the number of genotypes mapping to each phenotype, but assigns genotypes randomly. The mutational neighbourhood of a genotype in these GP maps is much more likely to contain genotypes mapping to the same phenotype than in the random null model. Such neutral correlations can be quantified by the robustness to mutations, which can be many orders of magnitude larger than that of the null model, and crucially, above the critical threshold for the formation of large neutral networks of mutationally connected genotypes which enhance the capacity for the exploration of phenotypic novelty. Thus neutral correlations increase evolvability. We also study non-neutral correlations: Compared to the null model, i) If a particular (non-neutral) phenotype is found once in the 1-mutation neighbourhood of a genotype, then the chance of finding that phenotype multiple times in this neighbourhood is larger than expected; ii) If two genotypes are connected by a single neutral mutation, then their respective non-neutral 1-mutation neighbourhoods are more likely to be similar; iii) If a genotype maps to a folding or self-assembling phenotype, then its non-neutral neighbours are less likely to be a potentially deleterious non-folding or non-assembling phenotype. 
Non-neutral correlations of type i) and ii) reduce the rate at which new phenotypes can be found by neutral exploration, and so may diminish evolvability, while non-neutral correlations of type iii) may instead facilitate evolutionary exploration and so increase evolvability.
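The comparison between a GP map and a random null model that preserves phenotype frequencies can be sketched in a few lines. The toy map below (the phenotype is simply the first three letters of a six-letter genotype) is a hypothetical stand-in chosen for brevity. It is not the RNA, HP, or Polyomino model, and the function names are illustrative.

```python
import itertools
import random

ALPHABET = "ACGU"


def neighbours(genotype):
    """Yield every genotype that differs from `genotype` at one position."""
    for pos, letter in enumerate(genotype):
        for other in ALPHABET:
            if other != letter:
                yield genotype[:pos] + (other,) + genotype[pos + 1:]


def mean_robustness(gp_map):
    """Average fraction of 1-mutation neighbours sharing the phenotype."""
    total = 0.0
    for genotype, phenotype in gp_map.items():
        nbrs = list(neighbours(genotype))
        same = sum(gp_map[n] == phenotype for n in nbrs)
        total += same / len(nbrs)
    return total / len(gp_map)


def toy_gp_map(length=6, coding=3):
    """Hypothetical map: the phenotype is set by the first `coding` letters."""
    return {g: g[:coding] for g in itertools.product(ALPHABET, repeat=length)}


def random_null(gp_map, seed=0):
    """Shuffle phenotypes across genotypes, keeping each phenotype's count."""
    genotypes = list(gp_map)
    phenotypes = list(gp_map.values())
    random.Random(seed).shuffle(phenotypes)
    return dict(zip(genotypes, phenotypes))


toy = toy_gp_map()
null = random_null(toy)
print(f"toy map robustness:  {mean_robustness(toy):.3f}")
print(f"null map robustness: {mean_robustness(null):.3f}")
```

In the toy map, half of all point mutations fall in the non-coding positions, so its robustness is 0.5 by construction. The shuffled null map should sit near the phenotype frequency of 1/64, which mirrors the qualitative gap the abstract reports.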
Learning machines for health and beyond
Machine learning techniques are effective for building predictive models
because they are good at identifying patterns in large datasets. Development of
a model for complex real life problems often stops at the point of publication,
proof of concept or when made accessible through some mode of deployment.
However, a model in the medical domain risks becoming obsolete as soon as
patient demographics change. The maintenance and monitoring of predictive
models post-publication is crucial to guarantee their safe and effective long
term use. As machine learning techniques are effectively trained to look for
patterns in available datasets, the performance of a model for complex real
life problems will not peak and remain fixed at the point of publication or
even the point of deployment. Rather, data change over time, and they also change
when models are transported to new places to be used by new demographics.
Comment: 12 pages, 3 figures
Beyond shareholder primacy? Reflections on the trajectory of UK corporate governance.
Core institutions of UK corporate governance, in particular the City Code on Takeovers and Mergers, the Combined Code on Corporate Governance and the law on directors’ duties, are strongly orientated towards the norm of shareholder primacy. Beyond the core, however, stakeholder interests are better represented, in particular at the intersection of insolvency and employment law. This reflects the influence of European Community laws on information and consultation of employees. In addition, there are signs that some institutional shareholders are redirecting their investment strategies, under government encouragement, away from a focus on short-term returns, in such a way as to favour stakeholder-inclusive practices by firms. On this basis we suggest that the UK system is currently in a state of flux and that the debate over shareholder primacy has not been concluded.
Precision identification of high-risk phenotypes and progression pathways in severe malaria without requiring longitudinal data
More than 400,000 deaths from severe malaria (SM) are
reported every year, mainly in African children. The diversity
of clinical presentations associated with SM indicates important
differences in disease pathogenesis that require specific
treatment, and this clinical heterogeneity of SM remains poorly
understood. Here, we apply tools from machine learning and
model-based inference to harness large-scale data and dissect
the heterogeneity in patterns of clinical features associated
with SM in 2904 Gambian children admitted to hospital with
malaria. This quantitative analysis reveals features predicting
the severity of individual patient outcomes, and the dynamic
pathways of SM progression, notably inferred without requiring
longitudinal observations. Bayesian inference of these pathways
allows us to assign quantitative mortality risks to individual
patients. By independently surveying expert practitioners, we
show that this data-driven approach agrees with and expands the
current state of knowledge on malaria progression, while
simultaneously providing a data-supported framework for
predicting clinical risk.
Beyond the Hypercube: Evolutionary Accessibility of Fitness Landscapes with Realistic Mutational Networks
Evolutionary pathways describe trajectories of biological evolution in the space of different variants of organisms (genotypes). The probability of existence and the number of evolutionary pathways that lead from a given genotype to a better-adapted genotype are important measures of accessibility of local fitness optima and the reproducibility of evolution. Both quantities have been studied in simple mathematical models where genotypes are represented as binary sequences of two types of basic units, and the network of permitted mutations between the genotypes is a hypercube graph. However, it is unclear how these results translate to the biologically relevant case in which genotypes are represented by sequences of more than two units, for example four nucleotides (DNA) or 20 amino acids (proteins), and the mutational graph is not the hypercube. Here we investigate accessibility of the best-adapted genotype in the general case of K > 2 units. Using computer generated and experimental fitness landscapes we show that accessibility of the global fitness maximum increases with K and can be much higher than for binary sequences. The increase in accessibility comes from the increase in the number of indirect trajectories exploited by evolution for higher K. As one of the consequences, the fraction of genotypes that are accessible increases by three orders of magnitude when the number of units K increases from 2 to 16 for landscapes of size N ∼ 10^6 genotypes. This suggests that evolution can follow many different trajectories on such landscapes and the reconstruction of evolutionary pathways from experimental data might be an extremely difficult task.
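The notion of accessibility used in this abstract can be made concrete with a minimal sketch. A genotype is counted as accessible if some path of single-unit mutations with strictly increasing fitness leads from it to the global maximum, with indirect paths allowed. The uncorrelated random landscape, the small landscape size, and the function names are assumptions for illustration, not the landscapes or code used in the paper.

```python
import itertools
import random


def hamming_neighbours(genotype, k):
    """Yield all genotypes one substitution away, with units in 0..k-1."""
    for pos, unit in enumerate(genotype):
        for other in range(k):
            if other != unit:
                yield genotype[:pos] + (other,) + genotype[pos + 1:]


def accessible_fraction(fitness, k):
    """Fraction of genotypes with a fitness-increasing path to the maximum.

    Genotypes are visited in order of decreasing fitness, so every fitter
    neighbour has already been classified when a genotype is reached.
    """
    ranked = sorted(fitness, key=fitness.get, reverse=True)
    accessible = {ranked[0]}  # the global maximum itself
    for genotype in ranked[1:]:
        f = fitness[genotype]
        if any(
            n in accessible and fitness[n] > f
            for n in hamming_neighbours(genotype, k)
        ):
            accessible.add(genotype)
    return len(accessible) / len(fitness)


def random_landscape(length, k, seed=0):
    """Assumed null landscape: independent uniform fitness per genotype."""
    rng = random.Random(seed)
    return {g: rng.random() for g in itertools.product(range(k), repeat=length)}


# Three landscapes with the same number of genotypes (4096) but different K.
for k, length in [(2, 12), (4, 6), (16, 3)]:
    landscape = random_landscape(length, k)
    print(f"K={k:2d}, L={length:2d}: {accessible_fraction(landscape, k):.4f}")
```

Holding the number of genotypes fixed while varying K follows the comparison described in the abstract, although at 4096 genotypes the landscapes are far smaller than the N ∼ 10^6 case it reports.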