Modeling nucleosome mediated mechanisms of gene regulation
The genomes of all eukaryotic organisms are packaged into nucleosomes, which are the fundamental units of chromatin, each composed of approximately 147 base pairs of DNA wrapped around a histone octamer. Because 70-90% of the eukaryotic genome is packaged into nucleosomes, they modulate the accessibility of DNA to transcription factors (TFs) and play an important role in the regulation of transcription.
This thesis is devoted to the mathematical modeling of effects caused by direct competition between nucleosomes and transcription factors. The thesis is organized as follows. In chapter 1 we introduce experimental methods and recent discoveries in chromatin biology. In chapter 2 we introduce a thermodynamic biophysical model for calculating nucleosome and transcription factor occupancies. We also introduce the statistical positioning effect and how it may affect the binding of transcription factors. In chapter 2 we mostly address the question of how competition with transcription factors can affect nucleosome positioning. We first examine experimental nucleosome data and assess how reproducible the data are across experiments carried out in several labs. We then introduce a new method for assessing the quality of the model's predictions and use it to optimize the model's parameters to fit the experimental data. We focus on how transcription factors can explain the nucleosome positioning observed in vivo and on which transcription factors play crucial roles in establishing nucleosome patterns at gene promoters.
In chapter 3 we address the question of how nucleosomes and promoter architecture affect the binding of TFs. We model the binding of TFs to a cluster of binding sites in the context of chromatin and investigate which features of the binding site cluster determine the main characteristics of TF binding.
Finally, we study how TFBSs in real genomes are positioned relative to each other and show that there are certain biases in the spacing between TFBSs, probably due to effects caused by competition with nucleosomes.
SwissRegulon, a database of genome-wide annotations of regulatory sites: recent updates
Identification of genomic regulatory elements is essential for understanding the dynamics of cellular processes. This task has been substantially facilitated by the availability of genome sequences for many species and high-throughput data of transcripts and transcription factor (TF) binding. However, rigorous computational methods are necessary to derive accurate genome-wide annotations of regulatory sites from such data. SwissRegulon (http://swissregulon.unibas.ch) is a database containing genome-wide annotations of regulatory motifs, promoters and TF binding sites (TFBSs) in promoter regions across model organisms. Its binding site predictions were obtained with rigorous Bayesian probabilistic methods that operate on orthologous regions from related genomes, and use explicit evolutionary models to assess the evidence of purifying selection on each site. New in the current version of SwissRegulon is a curated collection of 190 mammalian regulatory motifs associated with ~340 TFs, and TFBS annotations across a curated set of ~35 000 promoters in both human and mouse. Predictions of TFBSs for Saccharomyces cerevisiae have also been significantly extended and now cover 158 of yeast's ~180 TFs. All data are accessible through both an easily navigable genome browser with search functions, and as flat files that can be downloaded for further analysis.
Predicted and observed nucleosome profiles around 5′ and 3′ ends of genes.
<p><b>A</b>: Averaged nucleosome coverage near transcription starts. Each curve shows the average nucleosome coverage at different positions relative to transcription start averaged over all genes. Red dashed lines correspond to experimentally measured nucleosome coverage (data from <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Lee1" target="_blank">[1]</a>, right vertical axis). The solid lines correspond to the predicted nucleosome coverage by the model including only nucleosomes (light green) and the model including all TFs (blue), left vertical axis. <b>B</b>: Averaged nucleosome coverage near transcription ends. Curves are as described for panel A.</p>
Performance of models that include only nucleosome sequence specificity.
<p><b>A:</b> Fraction of information regarding experimentally annotated linker and nucleosome positions explained by the nucleosome-only model (quality score, vertical bars) as a function of relative nucleosome specificity. The relative nucleosome specificity is controlled by the scale factor , where corresponds to the sequence specificity of the model of Kaplan et al. <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Kaplan1" target="_blank">[18]</a>, for which the binding energy of the nucleosomes has a standard-deviation of across the genome. The error-bars indicate standard-errors across separate test sets. <b>B:</b> Experimentally observed cumulative distribution of nucleosome coverages (fraction of time a given genomic position is covered by a nucleosome) from <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Lee1" target="_blank">[1]</a> (red dotted line) and cumulative distributions of predicted nucleosome coverage of the models of <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Kaplan1" target="_blank">[18]</a> (dark green line) and our model using nucleosome specificity scale parameters of (black line), (blue line), and (light green line).</p>
Reproducibility of <i>in vitro</i> and <i>in vivo</i> nucleosome data across different experiments and performance of nucleosome sequence-specificity models.
<p><b>A:</b> Pearson correlation coefficients of the per-base nucleosome coverage between various experimental data-sets measuring nucleosome occupancy either <i>in vivo </i><a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Lee1" target="_blank">[1]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Shivaswamy1" target="_blank">[3]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Kaplan1" target="_blank">[18]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Field1" target="_blank">[38]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Mavrich3" target="_blank">[56]</a> or <i>in vitro </i><a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Kaplan1" target="_blank">[18]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Zhang1" target="_blank">[19]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Zhang2" target="_blank">[58]</a>, and predictions from a number of models of nucleosome sequence-specificity <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Kaplan1" target="_blank">[18]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Locke1" target="_blank">[24]</a>. <b>B:</b> Reproducibility of annotated nucleosome positions across the <i>in vivo</i> data-sets. For each annotated nucleosome in the reference map of <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Jiang1" target="_blank">[41]</a>, we calculated the standard deviation in the annotated positions of the corresponding nucleosomes across the data-sets used to construct the map. 
The blue curve shows the distribution of standard deviations across nucleosomes. The grey dotted curve shows the analogous distribution that is obtained using randomized data (see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#s4" target="_blank">Materials and Methods</a>). The high reproducibility of nucleosome positions across different data-sets justifies the use of binary data, i.e. positions of “linkers” and “nucleosomes”, instead of Pearson correlation for evaluation of the performance of computational models for predicting nucleosome positions.</p>
Only approximately TFs contribute significantly to nucleosome positioning.
<p><b>A</b>: For each TF an average quality score across test-sets was determined using the model containing nucleosomes and the corresponding TF. TFs were then ordered by the -statistic , with the quality score of the model without any TFs, and the standard-error across the test-sets (see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#s4" target="_blank">Materials and Methods</a>). The panel shows the reverse cumulative distribution of -statistics observed across the TFs (blue dots) together with the standard-normal distribution expected for random predictions (brown dotted curve). Note that about TFs have -statistics larger than expected by chance. The green dots show the reverse-cumulatives of -statistics for the fits obtained with WMs in which the columns of each WM have been randomly shuffled. The red dots show the reverse-cumulatives of -statistics obtained when fitting the original WMs to the <i>in vitro</i> map of nucleosome positions. Note that both the green and red dots closely follow the distribution expected by chance. <b>B</b>: The top 20 TFs that contribute most to <i>in vivo</i> nucleosome positioning sorted by their -statistic. The bars show the average quality score and standard-error for each TF.</p>
Illustration of example configurations of proteins bound to DNA.
<p>The top line indicates contributions from the individual binding sites to the overall probability of the configuration. Note that for illustration purposes, the sizes of TFs and nucleosomes are not shown to scale, e.g. the sizes of nucleosome footprints are much larger in reality.</p>
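The idea that each bound protein contributes a factor to the probability of a configuration can be illustrated with a standard forward/backward recursion over non-overlapping binders on a one-dimensional lattice. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `binder_coverage`, the `weights` layout (one statistical weight per binder type and start position, standing in for concentration times a Boltzmann factor) and the toy numbers are all hypothetical.

```python
def binder_coverage(weights, footprints, length):
    """Per-position coverage of each binder type on a lattice of `length` sites.

    weights[k][i]  -- assumed statistical weight of binder k with its left edge at i
    footprints[k]  -- number of sites binder k occupies (binders may not overlap)
    """
    n_binders = len(footprints)

    # forward[i]: summed weight of all configurations on sites 0..i-1
    forward = [0.0] * (length + 1)
    forward[0] = 1.0
    for i in range(1, length + 1):
        total = forward[i - 1]  # site i-1 is left empty
        for k in range(n_binders):
            start = i - footprints[k]
            if start >= 0:  # binder k ends exactly at site i-1
                total += weights[k][start] * forward[start]
        forward[i] = total

    # backward[i]: summed weight of all configurations on sites i..length-1
    backward = [0.0] * (length + 1)
    backward[length] = 1.0
    for i in range(length - 1, -1, -1):
        total = backward[i + 1]  # site i is left empty
        for k in range(n_binders):
            end = i + footprints[k]
            if end <= length:  # binder k starts exactly at site i
                total += weights[k][i] * backward[end]
        backward[i] = total

    z = forward[length]  # partition function

    coverage = [[0.0] * length for _ in range(n_binders)]
    for k in range(n_binders):
        for start in range(length - footprints[k] + 1):
            # probability that binder k sits with its left edge at `start`
            p = forward[start] * weights[k][start] * backward[start + footprints[k]] / z
            for j in range(start, start + footprints[k]):
                coverage[k][j] += p
    return coverage


# Toy example: one binder type with footprint 2 on a 3-site lattice.
toy = binder_coverage(weights=[[1.0, 1.0]], footprints=[2], length=3)
```

In this toy case there are three equally weighted configurations (empty, binder at site 0, binder at site 1), so the middle site is covered in two of them. For genome-scale sequences the sums would need rescaling or log-space arithmetic to avoid overflow, which this sketch omits.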
Illustration of the measured nucleosome occupancy and model predictions within individual genomic regions.
<p>Each panel shows a section of the yeast genome within our genome browser (swissregulon.unibas.ch/ozonov), with the tracks corresponding to, from top to bottom, chromosomal location, annotated genes, the measured nucleosome coverage based on the data from <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003181#pcbi.1003181-Lee1" target="_blank">[1]</a>, the predicted nucleosome coverage using the model without TFs, the predicted nucleosome coverage using the model including TFs, and the total predicted TF coverage, i.e. summing over all TFs. Within the genome browser the coverage of individual TFs can also be displayed.</p>
Statistical analysis of protein-protein interactions between TFs and chromatin remodeling complexes, histone modification enzymes, and histones.
<p>For all yeast TFs we counted the number of “links”, i.e. known direct protein-protein interactions, with proteins from the functional categories shown in the first column. The second column shows the total number of links with all TFs, and the third column the number of links with the top TFs that most significantly explain nucleosome positioning. The fourth column shows the -value for the enrichment of links among the top TFs using a hypergeometric test, and the column shows the fold enrichment.</p>
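A hypergeometric enrichment test of the kind named in this caption can be sketched with the standard library alone. This is a generic illustration, not the authors' calculation: how the caption's columns map onto the four urn parameters below is an assumption, and the function name `hypergeom_enrichment` and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment(population, successes, sample, sample_successes):
    """Upper-tail hypergeometric p-value and fold enrichment.

    population        -- total number of items (assumed: all links)
    successes         -- items in the category of interest
    sample            -- items drawn (assumed: links involving the top TFs)
    sample_successes  -- drawn items that fall in the category
    """
    denom = comb(population, sample)
    # P(X >= sample_successes) for X ~ Hypergeometric(population, successes, sample)
    upper = min(successes, sample)
    tail = sum(
        comb(successes, x) * comb(population - successes, sample - x)
        for x in range(sample_successes, upper + 1)
    )
    p_value = tail / denom
    # observed count divided by the count expected under random sampling
    expected = sample * successes / population
    fold = sample_successes / expected
    return p_value, fold


# Made-up counts, for illustration only.
p_value, fold = hypergeom_enrichment(population=10, successes=5, sample=5, sample_successes=5)
```

With these made-up counts only one of the 252 possible samples contains all five category members, so the p-value is 1/252 and the fold enrichment is 2.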
SUMOylated PRC1 controls histone H3.3 deposition and genome integrity of embryonic heterochromatin
Chromatin integrity is essential for cellular homeostasis. Polycomb group proteins modulate chromatin states and transcriptionally repress developmental genes to maintain cell identity. They also repress repetitive sequences such as major satellites and constitute an alternative state of pericentromeric constitutive heterochromatin at paternal chromosomes (pat-PCH) in mouse pre-implantation embryos. Remarkably, pat-PCH contains the histone H3.3 variant, which is absent from canonical PCH at maternal chromosomes, which is marked by histone H3 lysine 9 trimethylation (H3K9me3), HP1, and ATRX proteins. Here, we show that SUMO2-modified CBX2-containing Polycomb Repressive Complex 1 (PRC1) recruits the H3.3-specific chaperone DAXX to pat-PCH, enabling H3.3 incorporation at these loci. Deficiency of Daxx or PRC1 components Ring1 and Rnf2 abrogates H3.3 incorporation, induces chromatin decompaction and breakage at PCH of exclusively paternal chromosomes, and causes their mis-segregation. Complementation assays show that DAXX-mediated H3.3 deposition is required for chromosome stability in early embryos. DAXX also regulates repression of PRC1 target genes during oogenesis and early embryogenesis. The study identifies a novel critical role for Polycomb in ensuring heterochromatin integrity and chromosome stability in mouse early development.