30 research outputs found

    Thermodynamic State Ensemble Models of cis-Regulation

    A major goal in computational biology is to develop models that accurately predict a gene's expression from its surrounding regulatory DNA. Here we present one class of such models, thermodynamic state ensemble models. We describe the biochemical derivation of the thermodynamic framework in simple terms, and lay out the mathematical components that comprise each model. These components include (1) the possible states of a promoter, where a state is defined as a particular arrangement of transcription factors bound to a DNA promoter, (2) the binding constants that describe the affinity of the protein–protein and protein–DNA interactions that occur in each state, and (3) whether each state is capable of transcribing. Using these components, we demonstrate how to compute a cis-regulatory function that encodes the probability of a promoter being active. Our intention is to provide enough detail so that readers with little background in thermodynamics can compose their own cis-regulatory functions. To facilitate this goal, we also describe a matrix form of the model that can be easily coded in any programming language. This formalism has great flexibility, which we show by illustrating how phenomena such as competition between transcription factors and cooperativity are readily incorporated into these models. Using this framework, we also demonstrate that Michaelis-like functions, another class of cis-regulatory models, are a subset of the thermodynamic framework with specific assumptions. By recasting Michaelis-like functions as thermodynamic functions, we emphasize the relationship between these models and delineate the specific circumstances representable by each approach. Application of thermodynamic state ensemble models is likely to be an important tool in unraveling the physical basis of combinatorial cis-regulation and in generating formalisms that accurately predict gene expression from DNA sequence
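As an illustration of the three components listed in the abstract above (states, binding constants, and which states transcribe), the following is a minimal sketch, not the authors' implementation. It enumerates the states of a small hypothetical promoter and computes the probability that it is active. The function name `promoter_activity`, the factor names `A` (activator) and `P` (polymerase), and all numerical values are assumptions made for this example only.

```python
from itertools import product


def promoter_activity(binding, cooperativity, is_active):
    """Return P(active) for a promoter at thermodynamic equilibrium.

    binding: dict mapping factor name -> K * concentration (dimensionless).
    cooperativity: dict mapping frozenset({a, b}) -> interaction factor.
    is_active: function taking the set of bound factors -> bool.
    All names and values here are illustrative assumptions.
    """
    names = sorted(binding)
    total = 0.0   # sum of weights over all states (partition function)
    active = 0.0  # sum of weights over transcribing states
    # Each arrangement of bound/unbound factors is one state.
    for occupancy in product([0, 1], repeat=len(names)):
        bound = {n for n, o in zip(names, occupancy) if o}
        weight = 1.0  # the empty promoter has weight 1
        for n in bound:
            weight *= binding[n]
        # Apply an interaction factor when both partners are bound.
        for pair, omega in cooperativity.items():
            if pair <= bound:
                weight *= omega
        total += weight
        if is_active(bound):
            active += weight
    return active / total


# Hypothetical example: a state transcribes only if polymerase P is bound,
# and activator A helps recruit P through a cooperative interaction.
coop = {frozenset({"A", "P"}): 10.0}
polymerase_bound = lambda bound: "P" in bound

p_with = promoter_activity({"A": 2.0, "P": 0.1}, coop, polymerase_bound)
p_without = promoter_activity({"A": 0.0, "P": 0.1}, coop, polymerase_bound)
print(p_with, p_without)
```

In this sketch, competition between factors could be expressed by giving a pair an interaction factor of 0, which removes the jointly bound state from the ensemble. Enumerating all states scales as 2^n in the number of factors, so this is practical only for small promoters.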

    PTRE-seq reveals mechanism and interactions of RNA binding proteins and miRNAs

    A large number of RNA binding proteins (RBPs) and miRNAs bind to the 3′ untranslated regions of mRNA, but methods to dissect their function and interactions are lacking. Here the authors introduce post-transcriptional regulatory element sequencing (PTRE-seq) to dissect sequence preferences, interactions and consequences of RBP and miRNA binding

    Intrinsic noise profoundly alters the dynamics and steady state of morphogen-controlled bistable genetic switches

    During tissue development, patterns of gene expression determine the spatial arrangement of cell types. In many cases, gradients of secreted signaling molecules - morphogens - guide this process. The continuous positional information provided by the gradient is converted into discrete cell types by the downstream transcriptional network that responds to the morphogen. A mechanism commonly used to implement a sharp transition between two adjacent cell fates is the genetic toggle switch, composed of cross-repressing transcriptional determinants. Previous analyses emphasize the steady state output of these mechanisms. Here, we explore the dynamics of the toggle switch and use exact numerical simulations of the kinetic reactions, the Chemical Langevin Equation, and Minimum Action Path theory to establish a framework for studying the effect of gene expression noise on patterning time and boundary position. This provides insight into the time scale, gene expression trajectories and directionality of stochastic switching events between cell states. Taking gene expression noise into account predicts that the final boundary position of a morphogen-induced toggle switch, although robust to changes in the details of the noise, is distinct from that of the deterministic system. Moreover, stochastic switching introduces differences in patterning time along the morphogen gradient that result in a patterning wave propagating away from the morphogen source. The velocity of this wave is influenced by noise; the wave sharpens and slows as it advances and may never reach steady state in a biologically relevant time. This could explain experimentally observed dynamics of pattern formation. Together the analysis reveals the importance of dynamical transients for understanding morphogen-driven transcriptional networks and indicates that gene expression noise can qualitatively alter developmental patterning

    Biophysical models of cis-regulation as interpretable neural networks

    The adoption of deep learning techniques in genomics has been hindered by the difficulty of mechanistically interpreting the models that these techniques produce. In recent years, a variety of post-hoc attribution methods have been proposed for addressing this neural network interpretability problem in the context of gene regulation. Here we describe a complementary way of approaching this problem. Our strategy is based on the observation that two large classes of biophysical models of cis-regulatory mechanisms can be expressed as deep neural networks in which nodes and weights have explicit physicochemical interpretations. We also demonstrate how such biophysical networks can be rapidly inferred, using modern deep learning frameworks, from the data produced by certain types of massively parallel reporter assays (MPRAs). These results suggest a scalable strategy for using MPRAs to systematically characterize the biophysical basis of gene regulation in a wide range of biological contexts. They also highlight gene regulation as a promising venue for the development of scientifically interpretable approaches to deep learning.

    Thermodynamic Modelling of Transcriptional Control: A Sensitivity Analysis

    Modelling is a tool used to decipher the biochemical mechanisms involved in transcriptional control. Experimental evidence in genetics is usually supported by theoretical models in order to evaluate the effects of all the possible interactions that can occur in these complicated processes. Models derived from the thermodynamic method are critical in this task because they are able to take into account multiple mechanisms operating simultaneously at the molecular micro-scale and relate them to transcriptional initiation at the tissue macro-scale. This work is devoted to adapting computational techniques to this context in order to theoretically evaluate the role played by several biochemical mechanisms. The interest of this theoretical analysis lies in the fact that it can be compared against those biological experiments where the response to perturbations in the transcriptional machinery environment is evaluated in terms of genetically activated/repressed regions. The theoretical reproduction of these experiments leads to a sensitivity analysis whose results are expressed in terms of the elasticity of a threshold function determining those activated/repressed regions. The study of this elasticity function in thermodynamic models already proposed in the literature reveals that certain modelling approaches can alter the balance between the biochemical mechanisms considered, and this can cause false or misleading outcomes. The reevaluation of classical thermodynamic models gives us a more accurate and complete picture of the interactions involved in gene regulation and transcriptional control, which enables more specific predictions. This sensitivity approach provides a definite advantage in the interpretation of a wide range of genetic experimental results.
    Funding: MINECO-Feder (Spanish Government) FPI2015/074837, RTI2018-098850-B-100; Consejeria de Economia, Innovacion, Ciencia y Empleo, Junta de Andalucia (Andalucia Government) PY18-RT-2422; Junta de Andalucia A-FQM-311-UGR18, B-FQM-580-UGR2

    Evolution of new regulatory functions on biophysically realistic fitness landscapes

    Regulatory networks consist of interacting molecules with a high degree of mutual chemical specificity. How can these molecules evolve when their function depends on maintenance of interactions with cognate partners and simultaneous avoidance of deleterious "crosstalk" with non-cognate molecules? Although physical models of molecular interactions provide a framework in which co-evolution of network components can be analyzed, most theoretical studies have focused on the evolution of individual alleles, neglecting the network. In contrast, we study the elementary step in the evolution of gene regulatory networks: duplication of a transcription factor followed by selection for TFs to specialize their inputs as well as the regulation of their downstream genes. We show how to coarse grain the complete, biophysically realistic genotype-phenotype map for this process into macroscopic functional outcomes and quantify the probability of attaining each. We determine which evolutionary and biophysical parameters bias evolutionary trajectories towards fast emergence of new functions and show that this can be greatly facilitated by the availability of "promiscuity-promoting" mutations that affect TF specificity

    The Influence of Promoter Architectures and Regulatory Motifs on Gene Expression in Escherichia coli

    The ability to regulate gene expression is of central importance for the adaptability of living organisms to changes in their external and internal environment. At the transcriptional level, binding of transcription factors (TFs) in the promoter region can modulate the transcription rate, hence making TFs central players in gene regulation. For some model organisms, information about the locations and identities of discovered TF binding sites has been collected in continually updated databases, such as RegulonDB for the well-studied case of E. coli. In order to reveal the general principles behind the binding-site arrangement and function of these regulatory architectures, we propose a random promoter architecture model that preserves the overall abundance of binding sites to identify overrepresented binding site configurations. This model is analogous to the random network model used in the study of genetic network motifs, where regulatory motifs are identified through their overrepresentation with respect to a “randomly connected” genetic network. Using our model we identify TF pairs which coregulate operons in an overrepresented fashion, or individual TFs which act at multiple binding sites per promoter by, for example, cooperative binding, DNA looping, or through multiple binding domains. We furthermore explore the relationship between promoter architecture and gene expression, using three different genome-wide protein copy number censuses. Perhaps surprisingly, we find no systematic correlation between the number of activator and repressor binding sites regulating a gene and the level of gene expression. A position-weight-matrix model used to estimate the binding affinity of RNA polymerase (RNAP) to the promoters of activated and repressed genes suggests that this lack of correlation might in part be due to differences in basal transcription levels, with repressed genes having a higher basal activity level. This quantitative catalogue relating promoter architecture and function provides a first step towards genome-wide predictive models of regulatory function.

    Transcription factor interactions explain the context-dependent activity of CRX binding sites

    The effects of transcription factor binding sites (TFBSs) on the activity of a cis-regulatory element (CRE) depend on the local sequence context. In rod photoreceptors, binding sites for the transcription factor (TF) Cone-rod homeobox (CRX) occur in both enhancers and silencers, but the sequence context that determines whether CRX binding sites contribute to activation or repression of transcription is not understood. To investigate the context-dependent activity of CRX sites, we fit neural network-based models to the activities of synthetic CREs composed of photoreceptor TFBSs. The models revealed that CRX binding sites consistently make positive, independent contributions to CRE activity, while negative homotypic interactions between sites cause CREs composed of multiple CRX sites to function as silencers. The effects of negative homotypic interactions can be overcome by the presence of other TFBSs that either interact cooperatively with CRX sites or make independent positive contributions to activity. The context-dependent activity of CRX sites is thus determined by the balance between positive heterotypic interactions, independent contributions of TFBSs, and negative homotypic interactions. Our findings explain observed patterns of activity among genomic CRX-bound enhancers and silencers, and suggest that enhancers may require diverse TFBSs to overcome negative homotypic interactions between TFBSs

    Taking into account nucleosomes for predicting gene expression

    The eukaryotic genome is organized in a chain of nucleosomes, each consisting of 145-147 bp of DNA wrapped around a histone octamer protein core. Binding of transcription factors (TFs) to nucleosomal DNA is frequently impeded, which makes it a challenging task to calculate TF occupancy at a given regulatory genomic site for predicting gene expression. Here, we review methods to calculate TF binding to DNA in the presence of nucleosomes. The main theoretical problems are (i) the computation speed, which becomes a bottleneck when partial unwrapping of DNA from the nucleosome is considered, (ii) the perturbation of the binding equilibrium by the activity of ATP-dependent chromatin remodelers, which translocate nucleosomes along the DNA, and (iii) the model parameterization from high-throughput sequencing data and fluorescence microscopy experiments in living cells. We discuss strategies that address these issues to efficiently compute transcription factor binding in chromatin. © 2013 Elsevier Inc.

    Intrinsic limits to gene regulation by global crosstalk

    Gene regulation relies on the specificity of transcription factor (TF) - DNA interactions. In equilibrium, limited specificity may lead to crosstalk: a regulatory state in which a gene is either incorrectly activated due to noncognate TF-DNA interactions or remains erroneously inactive. We present a tractable biophysical model of global crosstalk, where many genes are simultaneously regulated by many TFs. We show that in the simplest regulatory scenario, a lower bound on crosstalk severity can be analytically derived solely from the number of (co)regulated genes and a suitable parameter that describes binding site similarity. Estimates show that crosstalk could present a significant challenge for organisms with low-specificity TFs, such as metazoans, unless they use appropriate regulation schemes. Strong cooperativity substantially decreases crosstalk, while joint regulation by activators and repressors, surprisingly, does not; moreover, certain microscopic details about promoter architecture emerge as globally important determinants of crosstalk strength. Our results suggest that crosstalk imposes a new type of global constraint on the functioning and evolution of regulatory networks, which is qualitatively distinct from the known constraints acting at the level of individual gene regulatory elements