Binding of high mobility group A proteins to the mammalian genome occurs as a function of AT-content
<div><p>Genomic location can inform on potential function and recruitment signals for chromatin-associated proteins. High mobility group (Hmg) proteins are similar in size to histones, with Hmga1 and Hmga2 being particularly abundant in replicating normal tissues and in cancerous cells. While several roles for Hmga proteins have been proposed, we lack a comprehensive description of their genomic location as a function of chromatin, DNA sequence and functional domains. Here we report such a characterization in mouse embryonic stem cells in which we introduce biotin-tagged constructs of wild-type and DNA-binding domain mutants. Comparative analysis of the genome-wide distribution of Hmga proteins reveals pervasive binding, a feature that critically depends on a functional DNA-binding domain and which is shared by both Hmga proteins. Assessment of the underlying cues instructive for this binding modality identifies AT richness, defined as high frequency of A or T bases, as the major criterion for local binding. Additionally, we show that other chromatin states such as those linked to cis-regulatory regions have little impact on Hmga binding both in stem and differentiated cells. As a consequence, Hmga proteins are preferentially found at AT-rich regions such as constitutively heterochromatic regions but are absent from enhancers and promoters, arguing for a limited role in regulating individual genes. In line with this model, we show that genetic deletion of Hmga proteins in stem cells causes limited transcriptional effects and that binding is conserved in neuronal progenitors. Overall, our comparative study describing the <i>in vivo</i> binding modality of Hmga1 and Hmga2 identifies the proteins’ preference for AT-rich DNA genome-wide and argues against a suggested function of Hmga at regulatory regions. Instead, we discover pervasive binding with enrichment at regions of higher AT content irrespective of local variation in chromatin modifications.</p></div>
Genomic location of Hmga1-2 versus DBD-mutant controls.
<p>(A) Biotin-tagged versions of Hmga proteins driven by a strong, ubiquitously active promoter were inserted into a defined genomic locus. DBDs of Hmga1-2 are depicted as boxes. Mutations in the DBD of Hmga1-2 were targeted to the core RGR motif of the three AT-hooks. A monomeric GFP control was tagged and inserted in a similar way. The N-terminal biotin tag is recognized by the BirA biotin ligase, which the cell line used stably expresses. Subsequent streptavidin (SAV) mediated Chromatin-IP followed by sequencing was used to generate antibody-independent genomic maps. Functional mutants were similarly expressed after insertion into the same genomic location. (B) Top, table shows read counts per kilobase and million mapped reads (RPKM) for Hmga1, Hmga2 and two control genes. To account for an Hmga1 pseudogene, mapping was performed allowing 20 multiple alignments and reported values are likely an underestimation of actual expression levels. Hmga2 is not expressed in ES cells. Bottom, Western blotting (WB) with anti-Hmga1 antibody of whole cell lysate from parental cell line and cells expressing Hmga1. A higher molecular weight band representing bioHmga1 is visible and shows an expression level comparable to the endogenous protein. NP stands for neuronal progenitors. (C) Subcellular localization of endogenous Hmga1, top set, bioHmga proteins, middle set, and bioGFP, bottom set, assessed by immunofluorescence. Nuclei and DNA were stained with DAPI. DAPI-dense foci are positive for both WT and tagged Hmga proteins detected with a specific antibody and with SAV-coupled fluorophores, respectively. The subcellular localization of tagged GFP by GFP-channel acquisition is also depicted. GFP stains evenly in ESC and is not excluded from nuclei. Scale bars in the DAPI channel correspond to 10 μm. (D) Log2 enrichments over input in 1 kb tiling windows of DBD mutant and WT Hmga1-2 as well as GFP were subjected to Principal Component Analysis (PCA). Barplot shows fraction of the total variance explained by each principal component. The first principal component (PC1) alone explains almost 50% of the variance. (E) PC1 scores of each sample. PC1 separates the samples into those corresponding to proteins with a WT DNA-binding domain and those with either a mutated or no DBD. (F) Scatterplot and Pearson correlation of the PC1 loading with AT content.</p>
Genomic distribution of Hmga-enriched regions and AT-rich DNA.
<p>(A) Average profiles of log2 enrichment values over DBD-mutant at LMR regulatory regions. Shown are ESC and NP signal for replicate “a” of Hmga1. Average signal (smoothed over 51 nts) is shown over a 4 kb window centered at ESC-, NP-specific and constitutive LMR midpoints, shown in red, green and black respectively. This reveals lack of Hmga1 enrichment at both constitutive and cell-type specific regulatory regions. (B) Same as in (A) for replicate “a” of Hmga2. A depletion rather than an enrichment is observed at the indicated regulatory regions. (C) Chromosome-wide profiles of the indicated genomic and epigenomic features in ESC. Each datapoint represents the signal over a 10kb tiling window (replication timing = mean late/early S-phase ratios, Lmna = DamID LaminA, Hmga = input-normalized enrichment over DBDmutant, H3K9me2 = H3K9me2 enrichment over input). (D) Genome-wide correlation heatmap of replication timing, LaminA, H3K9me2, DNaseI cut frequency, AT content and Hmga1-2 as in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007102#pgen.1007102.g001" target="_blank">Fig 1C</a> (10kb tiling windows, colors indicate the Pearson correlation coefficient). (E) Linear model of Hmga protein binding at 1 kb windows based on the AT content of the window itself and the 3 neighboring ones, both upstream and downstream. Plotted are the values of the coefficients for each spatial position grouped by sample. The AT content of the window itself has by far the largest coefficients and contributions from neighbouring windows lead to negligible improvements in predictive power (cf. <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007102#pgen.1007102.s008" target="_blank">S7E Fig</a>).</p>
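<p>The linear model in panel (E) can be illustrated with a minimal sketch. This is not the authors' code: the function names <code>neighbor_features</code> and <code>fit_least_squares</code> are hypothetical, the fit is plain ordinary least squares, and the inputs are assumed to be one AT-content value and one enrichment value per 1 kb window in genomic order.</p>

```python
def neighbor_features(at, k=3):
    """Return one row per window: AT content of the k upstream windows,
    the window itself, and the k downstream windows (2k + 1 values).
    Windows lacking k neighbours on either side are skipped."""
    return [at[i - k:i + k + 1] for i in range(k, len(at) - k)]


def fit_least_squares(X, y):
    """Ordinary least squares with an intercept, solved through the normal
    equations by Gauss-Jordan elimination. Returns [intercept, coef_1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting: move the row with the largest entry into place.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]
```

<p>With <code>k=3</code>, the response vector would be the enrichment values of windows <code>3 … n-4</code>, so that rows and responses stay aligned. The middle coefficient then corresponds to the window itself and the remaining ones to its neighbours.</p>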
Invariance of Hmga1-2 binding in different chromatin environments and in neuronal progenitor cells.
<p>(A) Correlation heatmap of a representative Hmga1-2 replicate versus chromatin marks and AT content. Hmga1 and Hmga2 form a separate cluster with AT content and are weakly anti-correlated with openness (as assessed by DNaseI). Colours indicate the Pearson correlation coefficient. (B) Boxplots showing the distribution of Hmga1 and Hmga2 signal (log2 enrichments normalized over the DBD-mutant) for the indicated ESC replicates over promoters, separated by promoter activity (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007102#sec009" target="_blank">Materials and Methods</a>). (C) Barplot showing genome-wide correlations for Hmga1 and Hmga2 replicates in ESC and NP (log2 enrichments over DBD-mutant) with genomic AT content. (D) Log2 enrichments over DBD-mutant for the indicated samples along chromosome 14. Each data-point corresponds to a 10kb tiling window. For better readability, top and bottom 1% of data range are not shown. NP profiles appear very similar to ES-derived profiles (also see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007102#pgen.1007102.s006" target="_blank">S5F Fig</a>).</p>
Hmga1 deletion in ESC does not affect transcription globally.
<p>(A) Clonogenicity assay of Hmga1 KO and parental cell line. Triplicate biological replicate counts of pluripotent clones out of the indicated number of single cells plated (in brackets). No significant change is observed (one-way ANOVA, CI 95%). (B) Two-way ANOVA analysis of ESC cell cycle distribution data (n = 3) excludes an overall difference between WT and Hmga1 KO samples. A barely significant difference is seen only in S phase (adj. p-value = 0.0107). For details, see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007102#sec009" target="_blank">Materials and methods</a>. Y-axis denotes percentage of cells within a gate. (C) Cell proliferation data of a WT and Hmga1 KO sample during a 4-day time course. The distributions of actively replicating cells in the WT and Hmga1 KO samples are superimposable at any given time point. For colour reference and details refer to <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007102#pgen.1007102.s011" target="_blank">S10C Fig</a> and Materials and methods. (D) Transcriptomic comparison of Hmga1 KO vs. parental cell line at the gene level. Gene names are indicated if the gene is significantly differentially expressed (adjusted p-value < 0.01 and absolute fold-change of at least 2). (E) Transcriptomic comparison of Hmga1 KO vs. parental cell line at repetitive regions of the genome as defined by RepeatMasker, excluding repeats lying on the same or opposite strand of annotated transcripts (Materials and methods). Repeat elements show no significant changes (adjusted p-value < 0.01 and absolute fold-change of at least 2). Quantification was performed on the level of RepeatMasker repeat “names”.</p>
Hmga proteins bind to DNA in ESC as a function of DNA AT content.
<p>(A) Genome-wide correlation heatmap for all samples (including replicate c of Hmga1) and AT content on 1kb tiling windows, illustrating both good reproducibility between replicates and the correlation of Hmga1-2 with AT content. Colors indicate the Pearson correlation coefficient. (B) Log2 enrichments over input for the depicted samples on chromosome 14. Each dot represents the enrichment of IP over input in a window of size 10kb. Gaps indicate regions with low mappability (below 80%). Top and bottom 1% of data range are not shown to enhance readability. (C) Barplot illustrating the predictive power of mono-, di-, tri- and tetranucleotide models for each individual sample. The predictive power is not substantially improved by taking into account higher-order sequence features. (D) Relationship between bioHmga samples and AT content in ES cells for two representative replicates. Scatterplots depict AT content vs. log2 Hmga1-2 input-normalized enrichment values (over 1kb tiling windows) minus the same enrichments for the respective DBD-mutant. Pearson correlation coefficients are indicated on top. (E) Hmga binding at simple repeats with very high AT content. Average profiles show log2 enrichment over the respective DBD mutants at mappable (TA)n simple repeats of a minimal length of 300nts, centered at repeat start coordinates. AT content is shown in grey (dashed line).</p>
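<p>The quantities compared in panels (A) and (D), AT content per tiling window and its Pearson correlation with enrichment, can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: <code>at_content</code> and <code>pearson</code> are hypothetical helper names, windows are assumed to be non-overlapping, and mappability filtering is ignored.</p>

```python
from math import sqrt


def at_content(seq, window=1000):
    """Fraction of A or T bases in consecutive non-overlapping windows.
    A trailing partial window is dropped; any other symbol (G, C, N)
    counts as non-AT."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        fractions.append(sum(base in "AT" for base in chunk) / window)
    return fractions


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)
```

<p>Given a list of per-window log2 enrichment values aligned to the same windows, <code>pearson(at_content(sequence), enrichment)</code> would give the kind of coefficient reported on top of the scatterplots.</p>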
5hmC enrichment at REST-bound LMRs is partially dependent on the presence of REST.
<p>(A) Relative methylation changes between REST wildtype and REST knockout ES cells are correlated to REST ChIP enrichment. Methylation was determined 200 bp around the REST motif at all REST sites overlapping with LMRs. The point density is colour-coded (red: high, blue: low point density). Methylation determined by BisSeq (B) and hMeDIP qPCR enrichments (C) at REST motif containing LMRs bound and not bound by REST in wildtype (wt, dark blue) and REST knockout (ko, blue) ES cells. Error bars in (C) represent standard deviation in three replicate experiments normalized to a positive control.</p>
5hmC dynamics during differentiation occurs preferentially at LMRs.
<p>(A–B) Shown is the relative frequency of changes in 5hmC at LMRs and UMRs normalized for genome coverage at the ES (A) and NP state (B). The y-axis shows observed linear fold enrichment relative to expected enrichments (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003994#s4" target="_blank">Materials and Methods</a>). Note that 5hmC is changing preferentially at cell-type specific LMRs.</p>
Transcription Factor Occupancy Can Mediate Active Turnover of DNA Methylation at Regulatory Regions
<div><p>Distal regulatory elements, including enhancers, play a critical role in regulating gene activity. Transcription factor binding to these elements correlates with Low Methylated Regions (LMRs) in a process that is poorly understood. Here we ask whether and how actual occupancy of DNA-binding factors is linked to DNA methylation at the level of individual molecules. Using CTCF as an example, we observe that frequency of binding correlates with the likelihood of a demethylated state and sites of low occupancy display heterogeneous DNA methylation within the CTCF motif. In line with a dynamic model of binding and DNA methylation turnover, we find that 5-hydroxymethylcytosine (5hmC), formed as an intermediate state of active demethylation, is enriched at LMRs in stem and somatic cells. Moreover, a significant fraction of changes in 5hmC during differentiation occurs at these regions, suggesting that transcription factor activity could be a key driver for active demethylation. Since deletion of CTCF is lethal for embryonic stem cells, we used genetic deletion of REST as another DNA-binding factor implicated in LMR formation to test this hypothesis. The absence of REST leads to a decrease of hydroxymethylation and a concomitant increase of DNA methylation at its binding sites. These data support a model where DNA-binding factors can mediate turnover of DNA methylation as an integral part of maintenance and reprogramming of regulatory regions.</p></div>
5hmC marks LMRs in a cell-type specific fashion.
<p>(A) Average profiles for methylation (WG-BisSeq), 5hmC (hMeDIP-seq) and TET1 occupancy at Fully Methylated, Unmethylated and Low Methylated Regions (FMRs, UMRs and LMRs, respectively) in ES cells. (B) DNA methylation (upper tracks) and enrichment of 5hmC (lower tracks) in ES cells and NP of representative ES-specific, constitutive and NP-specific LMRs. (C) Average profiles representing methylation (WG-BisSeq), hMeDIP-seq and H3K4me1 ChIP-Seq in ES cells and NP ±3 kb around the segment middle.</p>