18 research outputs found

    TATA is a modular component of synthetic promoters

    The expression of most genes is regulated by multiple transcription factors. The interactions between transcription factors produce complex patterns of gene expression that are not always obvious from the arrangement of cis-regulatory elements in a promoter. One critical element of promoters is the TATA box, the docking site for the RNA polymerase holoenzyme. Using a synthetic promoter system coupled to a thermodynamic model of combinatorial regulation, we analyze the effects of different strength TATA boxes on various aspects of combinatorial cis-regulation. The thermodynamic model explains 75% of the variance in gene expression in synthetic promoter libraries with different strength TATA boxes, suggesting that many of the salient aspects of cis-regulation are captured by the model. Our results demonstrate that the effect of changing the TATA box on gene expression is the same for all synthetic promoters regardless of the arrangement of cis-regulatory sites we studied. Our analysis also showed that in our synthetic system the strength of the RNA polymerase–TATA interaction does not alter the combinatorial interactions between transcription factors, or between transcription factors and RNA polymerase. Finally, we show that although stronger TATA boxes increase expression in a predictable fashion, stronger TATA boxes have very little effect on noise in our synthetic promoters, regardless of the arrangement of cis-regulatory sites. Our results support a modular model of promoter function, where cis-regulatory elements can be mixed and matched (programmed) with outcomes on expression that are predictable based on the rules of simple protein–protein and protein–DNA interactions.

    Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles

    Machine learning approaches offer the potential to systematically identify transcriptional regulatory interactions from a compendium of microarray expression profiles. However, experimental validation of the performance of these methods at the genome scale has remained elusive. Here we assess the global performance of four existing classes of inference algorithms using 445 Escherichia coli Affymetrix arrays and 3,216 known E. coli regulatory interactions from RegulonDB. We also developed and applied the context likelihood of relatedness (CLR) algorithm, a novel extension of the relevance networks class of algorithms. CLR demonstrates an average precision gain of 36% relative to the next-best performing algorithm. At a 60% true positive rate, CLR identifies 1,079 regulatory interactions, of which 338 were in the previously known network and 741 were novel predictions. We tested the predicted interactions for three transcription factors with chromatin immunoprecipitation, confirming 21 novel interactions and verifying our RegulonDB-based performance estimates. CLR also identified a regulatory link providing central metabolic control of iron transport, which we confirmed with real-time quantitative PCR. The compendium of expression data compiled in this study, coupled with RegulonDB, provides a valuable model system for further improvement of network inference algorithms using experimental data.
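The abstract above does not spell out how CLR scores an interaction. The sketch below illustrates the background-correction idea usually associated with the method: each pairwise mutual-information value is converted to a z-score against the score distributions of both genes, and the two z-scores are combined. It assumes a precomputed symmetric mutual-information matrix; the function name `clr_scores`, the toy matrix, and the choice to leave the diagonal in each gene's background are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def clr_scores(mi):
    """Background-corrected scores from a symmetric mutual-information matrix.

    Illustrative sketch only: the diagonal is kept in each gene's background
    distribution for simplicity.
    """
    mi = np.asarray(mi, dtype=float)
    mu = mi.mean(axis=1, keepdims=True)   # per-gene background mean
    sd = mi.std(axis=1, keepdims=True)    # per-gene background spread
    sd[sd == 0] = 1.0                     # avoid division by zero for flat rows
    # z[i, j]: how unusual MI(i, j) is against gene i's background; negatives clipped
    z = np.maximum(0.0, (mi - mu) / sd)
    # Combine gene i's and gene j's views of the same pair
    scores = np.sqrt(z**2 + z.T**2)
    np.fill_diagonal(scores, 0.0)         # self-interactions are not scored
    return scores

# Toy matrix (hypothetical values): genes 0 and 1 share unusually high MI.
toy_mi = np.array([
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
])
toy_scores = clr_scores(toy_mi)
```

In this toy case the pair (0, 1) receives the highest score, and pairs whose mutual information is at or below both genes' backgrounds score zero.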

    Origin and consequences of the relationship between protein mean and variance.

    Cell-to-cell variance in protein levels (noise) is a ubiquitous phenomenon that can increase fitness by generating phenotypic differences within clonal populations of cells. An important challenge is to identify the specific molecular events that control noise. This task is complicated by the strong dependence of a protein's cell-to-cell variance on its mean expression level through a power-law like relationship (σ² ∝ μ^1.69). Here, we dissect the nature of this relationship using a stochastic model parameterized with experimentally measured values. This framework naturally recapitulates the power-law like relationship (σ² ∝ μ^1.6) and accurately predicts protein variance across the yeast proteome (r² = 0.935). Using this model we identified two distinct mechanisms by which protein variance can be increased. Variables that affect promoter activation, such as nucleosome positioning, increase protein variance by changing the exponent of the power-law relationship. In contrast, variables that affect processes downstream of promoter activation, such as mRNA and protein synthesis, increase protein variance in a mean-dependent manner following the power-law. We verified our findings experimentally using an inducible gene expression system in yeast. We conclude that the power-law-like relationship between noise and protein mean is due to the kinetics of promoter activation. Our results provide a framework for understanding how molecular processes shape stochastic variation across the genome.
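A power-law exponent like the one quoted in the abstract above can be estimated by linear regression on log-transformed values, since σ² = c·μ^k becomes log σ² = log c + k·log μ. The sketch below shows that generic fit on synthetic data; the function name `fit_power_law_exponent`, the prefactor 2.0, and the noise level are assumptions chosen for illustration, and this is not claimed to be the authors' fitting procedure.

```python
import numpy as np

def fit_power_law_exponent(mean, variance):
    """Least-squares fit of log(variance) = log(c) + k * log(mean).

    Returns (k, c). Inputs must be strictly positive.
    """
    log_mu = np.log(np.asarray(mean, dtype=float))
    log_var = np.log(np.asarray(variance, dtype=float))
    k, log_c = np.polyfit(log_mu, log_var, 1)  # slope first, then intercept
    return k, np.exp(log_c)

# Synthetic data (hypothetical): variance follows mu**1.69 with log-normal scatter.
rng = np.random.default_rng(0)
mu = np.logspace(1, 4, 200)
var = 2.0 * mu**1.69 * np.exp(rng.normal(0.0, 0.1, mu.size))

k_hat, c_hat = fit_power_law_exponent(mu, var)
print(f"estimated exponent: {k_hat:.2f}, prefactor: {c_hat:.2f}")
```

Fitting in log space weights relative rather than absolute deviations, which suits data spanning several orders of magnitude in mean expression.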

    Relationship between mean and variance in protein expression.

    a) Protein mean and variance values in S. cerevisiae plotted against each other in log-scale in arbitrary fluorescence, with corresponding Pearson's correlation coefficient. b) Distribution of residual variance values across the S. cerevisiae dataset. Red bars indicate residual variance values with Z-scores more than 2 standard deviations from the mean.

    Analysis of mRNA distributions connects underlying promoter kinetics to nucleosome occupancy.

    a) mRNA mean and variance in S. cerevisiae plotted against each other in log-scale. Blue dashed line indicates the expected relationship between mean and variance in a regime of slow activation and fast inactivation rate (σ² = μ), red dashed line indicates expected relationship at slow promoter kinetics (σ² = μ + μ²). Circles represent experimental values of mRNA mean and variance (color matches best fit to promoter kinetics regime). b) Average nucleosome occupancy between −600 and +1000 relative to the TSS of S. cerevisiae genes exhibiting linear mRNA mean-variance scaling. The position of the canonical nucleosome free region is indicated by the black arrow. c) Same as b) but with respect to S. cerevisiae genes exhibiting quadratic mRNA mean-variance scaling.

    Experimental Validation of Inferred Regulatory Interactions

    Global precision scores determined with RegulonDB for a set of 268 regulatory interactions were in good correspondence with the local precision scores determined via RegulonDB plus ChIP for three transcription factors. The blue bar indicates inferred interactions that are true positives based on RegulonDB and ChIP. The green bar shows the number of inferred interactions not in RegulonDB that were positive for ChIP, representing 21 new experimentally verified regulatory interactions. The red bar shows inferred interactions that are false positives based on RegulonDB and ChIP.