Inferring structural properties of protein-DNA binding using high-throughput sequencing. The paradigm of GATA1, KLF1 and their complexes GATA1/FOG1 and GATA1/KLF1. Insights into the transcriptional regulation of the erythroid cell lineage.

Abstract

GATA1 and KLF1 are transcription factors that regulate genes important for the development of erythroid cells. The GATA1 transcriptional co-factor FOG1 has been shown to be essential in a wide range of GATA1-dependent cellular functions. Here we investigated the diverse mechanisms by which GATA1 and KLF1 recognize their binding sites, how the GATA1 recognition mechanisms are affected by complexation with either FOG1 or KLF1, and how these recognition mechanisms affect the transcriptional regulation of erythroid differentiation. We profiled the DNA binding specificities/affinities of a GATA1 fragment (mGATA1NC) that contains only the two DNA binding domains (the N-terminal and C-terminal Zn fingers), and of a KLF1 fragment (mKLF1257-358) that contains the three DNA binding domains, using a novel methodology that combines EMSA with high-throughput sequencing (EMSA-seq (Wong et al., 2011a)). We also profiled the DNA binding specificities of the C-terminal Zn finger of GATA1 alone (mGATA1C), the wt-mGATA1, the wt-mGATA1/wt-mFOG1 complex and the mGATA1NC/mKLF1257-358 complex. First, we confirmed that the N-terminal Zn finger of GATA1 has a strong preference for the “GATC” motif, whereas the C-terminal Zn finger of GATA1 has a strong preference for the “GATA” motif. Next, we found that in mGATA1NC both DNA binding domains can simultaneously bind a wide range of positional combinations of co-occurring “GATA” and “GATC” motifs on the same DNA sequence. The wt-mGATA1 did not show the ability to bind the same co-occurring motifs, implying that the non-DNA-binding domains of the protein help regulate its DNA binding specificities. In contrast, complexation of wt-mGATA1 with wt-mFOG1 partially restored this ability, although for a more limited range of positional combinations of the co-occurring “GATA” and “GATC” motifs on the same DNA sequence.
Similar observations were made for other pairs of GATA1 N-terminal and C-terminal Zn finger specific motifs. We then projected the GATA1 DNA binding specificities/affinities onto in vivo data and classified the GATA1 ChIP-seq peaks as low, medium or high affinity based on the number of GATA1 motifs they contain. We noticed that high affinity GATA1 ChIP-seq peaks tend to appear in regions with low nucleosome occupancy. We also noticed that GATA1 ChIP-seq peaks in enhancer regions are usually high affinity, whereas GATA1 ChIP-seq peaks in proximal promoter regions are usually low affinity. Additionally, we observed that high affinity GATA1 ChIP-seq peaks are usually found in regions with increased levels of H3K4me2 and are associated with a larger decrease in H3K4me3 levels at the TSS of nearby genes. None of these GATA1-related in vivo observations were found for the KLF1 ChIP-seq positions. These findings advance our understanding of the DNA binding properties of GATA1, KLF1 and their complexes, and give insight into the importance of GATA1 DNA binding affinities in the regulation of the erythroid transcriptional program. Overall, the work establishes an experimental and analytical framework to investigate how transcriptional co-factors can change the DNA binding specificities of specific transcription factors, and how integrating transcription factor DNA binding affinities with in vivo data can give novel insights into transcriptional regulation.

This thesis is not currently available in ORA
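
The motif-count classification of ChIP-seq peaks described in the abstract can be illustrated with a minimal sketch. The abstract does not state how motifs were counted or which cut-offs separate the classes, so everything here is an assumption made for illustration: the function names (`count_motif`, `classify_peak`), counting on both strands with overlaps allowed, the default motif set, and the thresholds `low_max` and `high_min` are all hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]


def _count_occurrences(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def count_motif(seq, motif):
    """Count a motif on both strands of seq.

    Palindromic motifs (e.g. GATC) are counted once per site,
    not once per strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    n = _count_occurrences(seq, motif)
    rc = reverse_complement(motif)
    if rc != motif:
        n += _count_occurrences(seq, rc)
    return n


def classify_peak(seq, motifs=("GATA",), low_max=1, high_min=3):
    """Assign a peak sequence to an affinity class by motif count.

    low_max and high_min are hypothetical thresholds, not values
    taken from the thesis.
    """
    n = sum(count_motif(seq, m) for m in motifs)
    if n <= low_max:
        return "low"
    if n >= high_min:
        return "high"
    return "medium"


if __name__ == "__main__":
    # Hypothetical peak sequences, for illustration only.
    for peak in ("AGATAAGG", "GATATATC", "GATAcTATCgGATA"):
        print(peak, classify_peak(peak))
```

In a real analysis the peak sequences would come from the genome under each ChIP-seq peak, and the thresholds would be chosen from the observed distribution of motif counts rather than fixed in advance.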
