201 research outputs found

    Sequence Determinants of the Individual and Collective Behaviour of Intrinsically Disordered Proteins

    Intrinsically disordered proteins and protein regions (IDPs) represent around thirty percent of the eukaryotic proteome. IDPs do not fold into a set three-dimensional structure, but instead exist in an ensemble of inter-converting states. Despite being disordered, IDPs are decidedly not random; well-defined, albeit transient, local and long-range interactions give rise to an ensemble with distinct statistical biases over many length-scales. Among a variety of cellular roles, IDPs drive and modulate the formation of phase separated intracellular condensates, non-stoichiometric assemblies of protein and nucleic acid that serve many functions. In this work, we have explored how the amino acid sequence of IDPs determines their conformational behaviour, and how sequence and single chain behaviour influence their collective behaviour in the context of phase separation. In part I, in a series of studies, we used simulation, theory, and statistical analysis coupled with a wide range of experimental approaches to uncover novel rules that further explore how primary sequence and local structure influence the global and local behaviour of disordered proteins, with direct implications for protein function and evolution. We found that amino acid sidechains counteract the intrinsic collapse of the peptide backbone, priming the backbone for interaction and providing a fully reconciliatory explanation for the mechanism of action associated with the denaturants urea and GdmCl. We discovered that proline can engender a conformational buffering effect in IDPs to counteract standard electrostatic effects, and that the patterning of those proline residues can be a crucial determinant of the conformational ensemble. We developed a series of tools for analysing primary sequences on a proteome-wide scale and used them to discover that different organisms can have substantially different average sequence properties. Finally, we determined that for the normally folded protein NTL9, the unfolded state under folding conditions is relatively expanded but has well-defined native and non-native structural preferences. In part II, we identified a novel mode of phase separation in biology, and explored how this could be tuned through sequence design. We discovered that phase separated liquids can be many orders of magnitude more dilute than simple mean-field theories would predict, and developed an analytic framework to explain and understand this phenomenon. Finally, we designed, developed and implemented a novel lattice-based simulation engine (PIMMS) to provide sequence-specific insight into the determinants of conformational behaviour and phase separation. PIMMS allows us to accurately and rapidly generate sequence-specific conformational ensembles and run simulations of hundreds of polymers, with the goal of allowing us to systematically elucidate the link between primary sequence and phase separation.

    PARROT is a flexible recurrent neural network framework for analysis of large protein datasets

    The rise of high-throughput experiments has transformed how scientists approach biological questions. The ubiquity of large-scale assays that can test thousands of samples in a day has necessitated the development of new computational approaches to interpret this data. Among these tools, machine learning approaches are increasingly being utilized due to their ability to infer complex nonlinear patterns from high-dimensional data. Despite their effectiveness, machine learning (and in particular deep learning) approaches are not always accessible or easy to implement for those with limited computational expertise. Here we present PARROT, a general framework for training and applying deep learning-based predictors on large protein datasets. Using an internal recurrent neural network architecture, PARROT is capable of tackling both classification and regression tasks while only requiring raw protein sequences as input. We showcase the potential uses of PARROT on three diverse machine learning tasks: predicting phosphorylation sites, predicting transcriptional activation function of peptides generated by high-throughput reporter assays, and predicting the fibrillization propensity of amyloid beta with data generated by deep mutational scanning. Through these examples, we demonstrate that PARROT is easy to use, performs comparably to state-of-the-art computational tools, and is applicable for a wide array of biological problems

    Steady-state fluctuations of a genetic feedback loop with fluctuating rate parameters using the unified colored noise approximation

    A common model of stochastic auto-regulatory gene expression describes promoter switching via cooperative protein binding, effective protein production in the active state and dilution of proteins. Here we consider an extension of this model whereby colored noise with a short correlation time is added to the reaction rate parameters. We show that when the size and timescale of the noise are appropriately chosen, it accounts for fast reactions that are not explicitly modelled; e.g., in models with no mRNA description, fluctuations in the protein production rate can account for the rapid multiple stages of nuclear mRNA processing which precede translation in eukaryotes. We show how the unified colored noise approximation can be used to derive expressions for the protein number distribution that are in good agreement with stochastic simulations. We find that even when the noise in the rate parameters is small, the protein distributions predicted by our model can be significantly different from those of models assuming constant reaction rates. Comment: 33 pages, 10 figures

    Distinguishing between models of mammalian gene expression : telegraph-like models versus mechanistic models

    Funding Information: S.B. and R.G. were supported by a Leverhulme Trust grant (no. RPG-2018-423). J.H. was supported by a BBSRC EASTBIO PhD studentship.
    Two-state models (telegraph-like models) have a successful history of predicting distributions of cellular and nascent mRNA numbers that fit experimental data well. These models exclude key rate limiting steps, and hence it is unclear why they are able to accurately predict the number distributions. To answer this question, here we compare these models to a novel stochastic mechanistic model of transcription in mammalian cells that presents a unified description of transcription factor, polymerase and mature mRNA dynamics. We show that there is a large region of parameter space where the first, second and third moments of the distributions of the waiting times between two consecutively produced transcripts (nascent or mature) of two-state and mechanistic models exactly match. In this region: (i) one can uniquely express the two-state model parameters in terms of those of the mechanistic model, (ii) the models are practically indistinguishable by comparison of their transcript number distributions, and (iii) they are distinguishable from the shape of their waiting time distributions. Our results clarify the relationship between different gene expression models and identify a means to select between them from experimental data.
    Publisher PDF. Peer reviewed.
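For readers unfamiliar with the two-state (telegraph) model named in this abstract, the following is a minimal sketch of how such a model is commonly simulated with the Gillespie stochastic simulation algorithm. It is not the authors' code and does not implement their mechanistic model. It assumes the textbook form of the telegraph model: a gene that switches between an off and an on state, produces mRNA only while on, and loses mRNA by first-order degradation. The function name `simulate_telegraph` and the parameter names `k_on`, `k_off`, `k_syn` and `k_deg` are hypothetical labels chosen for illustration.

```python
import random


def simulate_telegraph(k_on, k_off, k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation of a textbook two-state (telegraph) gene model.

    Illustrative assumptions (not taken from the paper):
      off -> on at rate k_on, on -> off at rate k_off,
      mRNA made at rate k_syn while on, each mRNA degraded at rate k_deg.
    Returns the reaction times and the mRNA count after each reaction.
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, 0, 0
    times, counts = [0.0], [0]
    while True:
        # Propensities of the four reactions in the current state.
        rates = [
            k_on * (1 - gene_on),  # gene switches on
            k_off * gene_on,       # gene switches off
            k_syn * gene_on,       # one mRNA is synthesised
            k_deg * mrna,          # one mRNA is degraded
        ]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its rate.
        r = rng.random() * total
        if r < rates[0]:
            gene_on = 1
        elif r < rates[0] + rates[1]:
            gene_on = 0
        elif r < rates[0] + rates[1] + rates[2]:
            mrna += 1
        elif mrna > 0:
            mrna -= 1
        times.append(t)
        counts.append(mrna)
    return times, counts


if __name__ == "__main__":
    times, counts = simulate_telegraph(1.0, 1.0, 10.0, 1.0, t_end=50.0, seed=1)
    print(f"{len(times) - 1} reactions, final mRNA count {counts[-1]}")
```

Differences between consecutive times at which the count increases give samples of the waiting time between produced transcripts, which is the quantity whose moments the abstract compares across models.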

    Stochastic modeling of auto-regulatory genetic feedback loops

    Auto-regulatory feedback loops are one of the most common network motifs. A wide variety of stochastic models have been constructed to understand how the fluctuations in protein numbers in these loops are influenced by the kinetic parameters of the main biochemical steps. These models differ according to (i) which sub-cellular processes are explicitly modelled; (ii) the modelling methodology employed (discrete, continuous or hybrid); (iii) whether they can be analytically solved for the steady-state distribution of protein numbers. We discuss the assumptions and properties of the main models in the literature, summarize our current understanding of the relationship between them and highlight some of the insights gained through modelling. Comment: 12 pages, 3 figures. Submitted to Biophysical Journal

    Collaborative service delivery to address public health issues within a musculoskeletal setting: evaluation of the Healthy Mind, Healthy Body project

    Background: There is a need for a greater focus on public health and its impact on musculoskeletal (MSK) conditions within healthcare delivery, and physiotherapists are well positioned to support this. Outpatient MSK physiotherapy services traditionally focus on rehabilitation and physical exercise, yet many service users require support to improve both their mental and physical health. Aims: This innovative service improvement aimed to embed integrated health promotion within MSK physiotherapy service delivery. Method: A physiotherapy-led multi-professional team introduced patients to other community-based support services to address wider health needs. Findings: Service evaluation demonstrated a high uptake of self-referral to those community services, validating the potential benefit for management of MSK conditions. Positive patient feedback indicates that patients valued the service and were well supported to engage with health improvement. Conclusions: MSK physiotherapy services need to consider the wider aspects of health, putting public health at the heart of MSK service delivery.

    Model reduction, mechanistic modelling and transience in models of stochastic chemical kinetics

    It has long been known that gene expression and chemical kinetics are subject to random fluctuations. These lead to deviations from deterministic models, which do not account for the random nature of biochemical kinetics. Successfully incorporating these stochastic dynamics is of great interest so that one can better model, and more closely understand, the intricate phenomena inherent in biological mechanisms. Many previous studies have modelled such processes stochastically, for instance genetic autoregulation, Michaelis-Menten enzyme action and ant recruitment. However, the majority of these studies explore only the steady state solutions of such processes while assuming mass-action kinetics, without considering: (1) extrinsic noise, (2) transience from an initial condition, or even (3) the finite, non-continuous nature of molecule or agent numbers. This thesis focuses on the aforementioned complex systems, with an emphasis on how to use toy models in responsible and informed ways. Responsible refers to a knowledge of how good our approximations of microscopic dynamics are and their limitations: Do we understand the assumptions that commonly employed approximations rely on? Informed refers to whether a model we design is sufficiently minimal or complex to represent the underlying biochemical (or economical) kinetics: Can we use alternative models of similar simplicity (possibly mechanistically informed) to more properly capture the dynamics of the system we are attempting to model? Further issues pursued in this thesis are whether common approximative methods can be extended to effectively include details of more complex underlying dynamics, and whether we can move beyond typical steady state solutions and explore transience from an initial condition. There are several main findings from our studies. We find that for non-mass-action Hill-type propensities, which are often used in biochemical kinetics and typically assume only time scale separation as the basis of approximation, finite molecule number effects can greatly perturb their accuracy. Then, we show that the addition of non-Gaussian colored noise to biochemical rate parameters can capture intricate characteristics of gene expression that are not explicitly modelled. For common two-state gene models, we explore why they seem to be so effective at approximating gene expression, even though it is known that several key rate limiting steps are ignored. Finally, we develop transient solutions to master equations describing Michaelis-Menten enzyme kinetics and ant recruitment, and we show how to extend the solutions therein to more general forms.

    SARS-CoV-2 requires cholesterol for viral entry and pathological syncytia formation

    Many enveloped viruses induce multinucleated cells (syncytia), reflective of membrane fusion events caused by the same machinery that underlies viral entry. These syncytia are thought to facilitate replication and evasion of the host immune response. Here, we report that co-culture of human cells expressing the receptor ACE2 with cells expressing SARS-CoV-2 spike results in synapse-like intercellular contacts that initiate cell-cell fusion, producing syncytia resembling those we identify in lungs of COVID-19 patients. To assess the mechanism of spike/ACE2-driven membrane fusion, we developed a microscopy-based, cell-cell fusion assay to screen ~6000 drugs and >30 spike variants. Together with quantitative cell biology approaches, the screen reveals an essential role for biophysical aspects of the membrane, particularly cholesterol-rich regions, in spike-mediated fusion, which extends to replication-competent SARS-CoV-2 isolates. Our findings potentially provide a molecular basis for positive outcomes reported in COVID-19 patients taking statins and suggest new strategies for therapeutics targeting the membrane of SARS-CoV-2 and other fusogenic viruses.