Local sequence features that influence AP-1 cis-regulatory activity
In the genome, most occurrences of transcription factor binding sites (TFBS) have no cis-regulatory activity, which suggests that flanking sequences contain information that distinguishes functional from nonfunctional TFBS. We interrogated the role of flanking sequences near Activator Protein 1 (AP-1) binding sites that reside in DNase I Hypersensitive Sites (DHS) and regions annotated as Enhancers. In these regions, we found that sequence features directly adjacent to the core motif distinguish high from low activity AP-1 sites. Some nearby features are motifs for other TFs that genetically interact with the AP-1 site. Other features are extensions of the AP-1 core motif, which cause the extended sites to match motifs of multiple AP-1 binding proteins. Computational models trained on these data distinguish between sequences with high and low activity AP-1 sites and also predict changes in cis-regulatory activity due to mutations in AP-1 core sites and their flanking sequences. Our results suggest that extended AP-1 binding sites, together with adjacent binding sites for additional TFs, encode part of the information that governs TFBS activity in the genome.
PTRE-seq reveals mechanism and interactions of RNA binding proteins and miRNAs
A large number of RNA binding proteins (RBPs) and miRNAs bind to the 3′ untranslated regions of mRNA, but methods to dissect their function and interactions are lacking. Here the authors introduce post-transcriptional regulatory element sequencing (PTRE-seq) to dissect sequence preferences, interactions and consequences of RBP and miRNA binding.
Single nucleotide variants in transcription factors associate more tightly with phenotype than with gene expression
Mapping the polymorphisms responsible for variation in gene expression, known as Expression Quantitative Trait Loci (eQTL), is a common strategy for investigating the molecular basis of disease. Despite numerous eQTL studies, the relationship between the explanatory power of variants on gene expression versus their power to explain ultimate phenotypes remains to be clarified. We addressed this question using four naturally occurring Quantitative Trait Nucleotides (QTN) in three transcription factors that affect sporulation efficiency in wild strains of the yeast Saccharomyces cerevisiae. We compared the ability of these QTN to explain the variation in both gene expression and sporulation efficiency. We find that the amount of gene expression variation explained by the sporulation QTN is not predictive of the amount of phenotypic variation explained. The QTN are responsible for 98% of the phenotypic variation in our strains but the median gene expression variation explained is only 49%. The alleles that are responsible for most of the variation in sporulation efficiency do not explain most of the variation in gene expression. The balance between the main effects and gene-gene interactions on gene expression variation is not the same as on sporulation efficiency. Finally, we show that nucleotide variants in the same transcription factor explain the expression variation of different sets of target genes depending on whether the variant alters the level or activity of the transcription factor. Our results suggest that a subset of gene expression changes may be more predictive of ultimate phenotypes than the number of genes affected or the total fraction of gene expression variation explained by causative variants, and that the downstream phenotype is buffered against variation in the gene expression network.
Causal variation in yeast sporulation tends to reside in a pathway bottleneck
There has been extensive debate over whether certain classes of genes are more likely than others to contain the causal variants responsible for phenotypic differences in complex traits between individuals. One hypothesis states that input/output genes positioned in signal transduction bottlenecks are more likely than other genes to contain causal natural variation. The IME1 gene resides at such a signaling bottleneck in the yeast sporulation pathway, suggesting that it may be more likely to contain causal variation than other genes in the sporulation pathway. Through crosses between natural isolates of yeast, we demonstrate that the specific causal nucleotides responsible for differences in sporulation efficiencies reside not only in IME1 but also in the genes that surround IME1 in the signaling pathway, including RME1, RSF1, RIM15, and RIM101. Our results support the hypothesis that genes at the critical decision-making points in signaling cascades will be enriched for causal variants responsible for phenotypic differences.
Discrimination between thermodynamic models of cis-regulation using transcription factor occupancy data
Many studies have identified binding preferences for transcription factors (TFs), but few have yielded predictive models of how combinations of transcription factor binding sites generate specific levels of gene expression. Synthetic promoters have emerged as powerful tools for generating quantitative data to parameterize models of combinatorial cis-regulation. We sought to improve the accuracy of such models by quantifying the occupancy of TFs on synthetic promoters in vivo and incorporating these data into statistical thermodynamic models of cis-regulation. Using chromatin immunoprecipitation-seq, we measured the occupancy of Gcn4 and Cbf1 in synthetic promoter libraries composed of binding sites for Gcn4, Cbf1, Met31/Met32 and Nrg1. We measured the occupancy of these two TFs and the expression levels of all promoters in two growth conditions. Models parameterized using only expression data predicted expression but failed to identify several interactions between TFs. In contrast, models parameterized with occupancy and expression data predicted expression data and also revealed Gcn4 self-cooperativity and a negative interaction between Gcn4 and Nrg1. Occupancy data also allowed us to distinguish between competing regulatory mechanisms for the factor Gcn4. Our framework for combining occupancy and expression data produces predictive models that better reflect the mechanisms underlying combinatorial cis-regulation of gene expression.
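As an illustration of what a statistical thermodynamic model of occupancy can look like, the sketch below uses the generic textbook two-site form with a single interaction term. It is not the authors' model: the function name, the weights `w_a` and `w_b`, and the interaction parameter `omega` are assumptions introduced here for illustration only.

```python
def two_site_occupancy(w_a, w_b, omega=1.0):
    """Return (P(site A bound), P(site B bound)) for a two-site promoter.

    w_a, w_b: statistical weights of binding (concentration times
              association constant); hypothetical inputs.
    omega:    interaction term (assumed form): >1 cooperative,
              <1 antagonistic, 1 means the sites bind independently.
    """
    # Partition function over the four states: empty, A only, B only, both.
    z = 1.0 + w_a + w_b + omega * w_a * w_b
    p_a = (w_a + omega * w_a * w_b) / z
    p_b = (w_b + omega * w_a * w_b) / z
    return p_a, p_b


if __name__ == "__main__":
    print(two_site_occupancy(1.0, 1.0, omega=1.0))   # independent sites
    print(two_site_occupancy(1.0, 1.0, omega=10.0))  # cooperative sites
```

In a form like this, measured occupancy constrains `omega` directly, whereas expression data alone constrain it only through whatever function maps occupancy to expression, which is one way to read why occupancy data could expose interactions that expression-only fits miss.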
A quantitative metric of pioneer activity reveals that HNF4A has stronger in vivo pioneer activity than FOXA1
BACKGROUND: We and others have suggested that pioneer activity - a transcription factor's (TF's) ability to bind and open inaccessible loci - is not a qualitative trait limited to a select class of pioneer TFs. We hypothesize that most TFs display pioneering activity that depends on the TF concentration and the motif content at their target loci.
RESULTS: Here, we present a quantitative in vivo measure of pioneer activity that captures the relative difference in a TF's ability to bind accessible versus inaccessible DNA. The metric is based on experiments that use CUT&Tag to measure the binding of doxycycline-inducible TFs. For each location across the genome, we determine the concentration of doxycycline required for a TF to reach half-maximal occupancy; lower concentrations reflect higher affinity. We propose that the relative difference in a TF's affinity between ATAC-seq labeled accessible and inaccessible binding sites is a measure of its pioneer activity. We estimate binding affinities at tens of thousands of genomic loci for the endodermal TFs FOXA1 and HNF4A and show that HNF4A has stronger pioneer activity than FOXA1. We show that both FOXA1 and HNF4A display higher binding affinity at inaccessible sites with more copies of their respective motifs. The quantitative analysis of binding suggests different modes of binding for FOXA1, including an anti-cooperative mode of binding at certain accessible loci.
CONCLUSIONS: Our results suggest that relative binding affinities are reasonable measures of pioneer activity and support the model wherein most TFs have some degree of context-dependent pioneer activity.
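The half-maximal-occupancy idea in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: it estimates the half-maximal dose by linear interpolation in log-dose space rather than by curve fitting, and the function names and the log2 ratio used to compare site classes are hypothetical choices made here.

```python
import math


def half_max_dose(doses, occupancy):
    """Estimate the dose at which occupancy first reaches half its maximum.

    Interpolates linearly in log-dose space between the two measured doses
    that bracket the half-maximal signal. Doses must be positive.
    Returns None if the signal never crosses half-maximum from below
    within the tested dose range.
    """
    half = max(occupancy) / 2.0
    pairs = sorted(zip(doses, occupancy))
    for (d0, y0), (d1, y1) in zip(pairs, pairs[1:]):
        if y0 < half <= y1:
            frac = (half - y0) / (y1 - y0)
            log_dose = math.log(d0) + frac * (math.log(d1) - math.log(d0))
            return math.exp(log_dose)
    return None


def log2_affinity_gap(ec50_accessible, ec50_inaccessible):
    """Hypothetical summary statistic: log2 fold difference in half-maximal dose.

    Under the assumption that a lower half-maximal dose reflects higher
    affinity, a smaller gap means the TF binds inaccessible sites almost as
    readily as accessible ones.
    """
    return math.log2(ec50_inaccessible / ec50_accessible)


if __name__ == "__main__":
    doses = [1, 10, 100, 1000]  # made-up doxycycline doses, arbitrary units
    signal = [0, 2, 8, 10]      # made-up occupancy signal at one locus
    print(half_max_dose(doses, signal))
    print(log2_affinity_gap(10.0, 40.0))
```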
Drosophila Cdi4 is a p21/p27/p57-like cyclin-dependent kinase inhibitor with specificity for cyclin E complexes
The eukaryotic cell cycle is controlled by a network of interacting regulatory proteins. We used an interaction mating two-hybrid assay to identify connections within the cell cycle regulatory network in Drosophila. We tested interactions between Drosophila cyclins and a panel of hundreds of previously identified proteins. One of the connections we identified was the interaction between cyclin E and a novel Drosophila protein, Cdi4. Because Cdi4 was originally identified by its ability to interact with a Drosophila cyclin-dependent kinase, the finding that it interacts with cyclin E strengthened the notion that it functions in cell cycle regulation. We show that Cdi4 can inhibit cyclin E function both in a yeast assay and in vitro. In light of these results, our sequence analysis revealed that Cdi4 is a unique member of the p21/p27/p57 family of Cdk inhibitors. Our results demonstrate that interaction mating assays using large informative panels of proteins can aid the analysis of regulatory networks by generating and constraining hypotheses that guide further work.
Phylogeny based discovery of regulatory elements
BACKGROUND: Algorithms that locate evolutionarily conserved sequences have become powerful tools for finding functional DNA elements, including transcription factor binding sites; however, most methods do not take advantage of an explicit model for the constrained evolution of functional DNA sequences.
RESULTS: We developed a probabilistic framework that combines an HKY85 model, which assigns probabilities to different base substitutions between species, and weight matrix models of transcription factor binding sites, which describe the probabilities of observing particular nucleotides at specific positions in the binding site. The method incorporates the phylogenies of the species under consideration and takes into account the position specific variation of transcription factor binding sites. Using our framework we assessed the suitability of alignments of genomic sequences from commonly used species as substrates for comparative genomic approaches to regulatory motif finding. We then applied this technique to Saccharomyces cerevisiae and related species by examining all possible six base pair DNA sequences (hexamers) and identifying sequences that are conserved in a significant number of promoters. By combining similar conserved hexamers we reconstructed known cis-regulatory motifs and made predictions of previously unidentified motifs. We tested one prediction experimentally, finding it to be a regulatory element involved in the transcriptional response to glucose.
CONCLUSION: The experimental validation of a regulatory element prediction missed by other large-scale motif finding studies demonstrates that our approach is a useful addition to the current suite of tools for finding regulatory motifs.
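The hexamer scan described in the abstract above can be illustrated with a deliberately simplified sketch. It swaps the authors' probabilistic HKY85 and weight-matrix framework for a plain exact-match rule: a hexamer counts as conserved in a promoter only if it appears at the same aligned column in every species. The function names and the toy alignments are assumptions made here for illustration.

```python
from itertools import product


def all_hexamers():
    """Enumerate all 4**6 = 4096 possible six-base DNA sequences."""
    return ["".join(bases) for bases in product("ACGT", repeat=6)]


def conserved_promoter_count(hexamer, alignments):
    """Count promoter alignments in which `hexamer` is perfectly conserved.

    alignments: list of promoters, each a list of aligned sequences
                (one string per species, same coordinate system).
    A promoter is counted once if the hexamer occurs at the same column
    in every species. This exact-match rule is a stand-in for a
    substitution-model-based conservation score.
    """
    k = len(hexamer)
    count = 0
    for aligned_seqs in alignments:
        length = min(len(seq) for seq in aligned_seqs)
        if any(
            all(seq[i:i + k] == hexamer for seq in aligned_seqs)
            for i in range(length - k + 1)
        ):
            count += 1
    return count


if __name__ == "__main__":
    # Toy two-species alignments for two hypothetical promoters.
    toy_alignments = [
        ["TTCACGTGAA", "ATCACGTGTT"],  # CACGTG conserved at column 2
        ["CACGTGAAAA", "CACTTGAAAA"],  # CACGTG present in one species only
    ]
    print(len(all_hexamers()))
    print(conserved_promoter_count("CACGTG", toy_alignments))
```

A fuller treatment would replace the exact-match test with a likelihood under a substitution model and compare the observed count for each hexamer against a background expectation before calling it significant.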