22 research outputs found

    Nuclear factor I genomic binding associates with chromatin boundaries

    BACKGROUND: The Nuclear Factor I (NFI) family of DNA binding proteins (also called CCAAT box transcription factors or CTF) is involved in both DNA replication and gene expression regulation. Using chromatin immuno-precipitation and high throughput sequencing (ChIP-Seq), we performed a genome-wide mapping of NFI DNA binding sites in primary mouse embryonic fibroblasts. RESULTS: We found that in vivo and in vitro NFI DNA binding specificities are indistinguishable, as in vivo ChIP-Seq NFI binding sites matched predictions based on previously established position weight matrix models of its in vitro binding specificity. Combining ChIP-Seq with mRNA profiling data, we found that NFI preferentially associates with highly expressed genes that it up-regulates, while binding sites were under-represented at expressed but unregulated genes. Genomic binding also correlated with markers of transcribed genes such as histone modifications H3K4me3 and H3K36me3, even outside of annotated transcribed loci, implicating NFI in the control of the deposition of these modifications. Positional correlation between + and - strand ChIP-Seq tags revealed that, in contrast to other transcription factors, NFI associates with a nucleosomal length of cleavage-resistant DNA, suggesting an interaction with positioned nucleosomes. In addition, NFI binding prominently occurred at boundaries displaying discontinuities in histone modifications specific to expressed and silent chromatin, such as loci subject to parental allele-specific imprinted expression. CONCLUSIONS: Our data thus suggest that NFI nucleosomal interaction may contribute to the partitioning of distinct chromatin domains and to epigenetic gene expression regulation. NFI ChIP-Seq and input control DNA data were deposited at the Gene Expression Omnibus (GEO) repository under accession number GSE15844. Gene expression microarray data for mouse embryonic fibroblasts are available under GEO accession number GSE15871.
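
The positional correlation between + and - strand tags mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy coordinates, and the choice to count exact-distance tag pairs are all assumptions made here for illustration. The idea is that the shift at which + strand 5' ends best line up with - strand 5' ends estimates the length of the protected DNA.

```python
from collections import Counter


def strand_cross_correlation(plus_starts, minus_starts, max_shift):
    """Return a list where entry d is the number of (+, -) tag pairs whose
    5' ends are exactly d bp apart (minus position - plus position == d).
    Hypothetical helper for illustration only."""
    plus_counts = Counter(plus_starts)
    minus_counts = Counter(minus_starts)
    profile = []
    for shift in range(max_shift + 1):
        pairs = sum(
            count * minus_counts.get(pos + shift, 0)
            for pos, count in plus_counts.items()
        )
        profile.append(pairs)
    return profile


def peak_shift(profile):
    """Shift with the highest pair count (first one if there is a tie)."""
    return max(range(len(profile)), key=profile.__getitem__)


# Toy data (assumed): three bound sites, each protecting 150 bp.
plus_tags = [100, 200, 300]
minus_tags = [250, 350, 450]
profile = strand_cross_correlation(plus_tags, minus_tags, max_shift=300)
print(peak_shift(profile))
```

On real data the profile would be computed per chromosome over millions of tags, and a peak near nucleosomal length would be read in the way the abstract describes.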

    Design and analysis of ChIP-Seq experiments for nuclear factor I DNA-binding proteins

    SUMMARY: Eukaryotic DNA interacts with nuclear proteins through non-covalent ionic interactions. Proteins can recognize specific nucleotide sequences based on steric interactions with the DNA, and these specific protein-DNA interactions are the basis for many nuclear processes, e.g. gene transcription, chromosomal replication, and recombination. A new technology termed ChIP-Seq has recently been developed for the analysis of protein-DNA interactions on a whole-genome scale; it is based on immunoprecipitation of chromatin followed by high-throughput DNA sequencing. ChIP-Seq is a novel technique with great potential to replace older techniques for mapping protein-DNA interactions. In this thesis, we bring some new insights into ChIP-Seq data analysis. First, we point out some common and previously unknown artifacts of the method. Sequence tags are not uniformly distributed in the genome, and we have found extreme hot-spots of tag accumulation over specific loci in the human and mouse genomes. These artifactual accumulations of sequence tags will create false peaks in every ChIP-Seq dataset, and we propose different filtering methods to reduce the number of false positives. Next, we propose random sampling as a powerful analytical tool in ChIP-Seq data analysis that could be used to infer biological knowledge from massive ChIP-Seq datasets. We created an unbiased random sampling algorithm and used this methodology to reveal some of the important biological properties of Nuclear Factor I DNA binding proteins. Finally, by analyzing the ChIP-Seq data in detail, we revealed that Nuclear Factor I transcription factors mainly act as activators of transcription, and that they are associated with specific chromatin modifications that are markers of open chromatin. We speculate that NFI factors only interact with the DNA wrapped around the nucleosome. 
We also found multiple loci that indicate possible chromatin barrier activity of NFI proteins, which could suggest the use of NFI binding sequences as chromatin insulators in biotechnology applications. RESUME: Eukaryotic DNA interacts with nuclear proteins through non-covalent ionic interactions. Proteins can recognize specific nucleotide sequences through steric interaction with the DNA, and these specific interactions control many nuclear processes, e.g. gene transcription, chromosomal replication, and recombination. A new technology called ChIP-Seq has recently been developed for the analysis of protein-DNA interactions at the whole-genome scale; it is based on chromatin immunoprecipitation and high-throughput DNA sequencing. ChIP-Seq therefore has strong potential to replace older techniques for mapping protein-DNA interactions. In this thesis, we bring new perspectives to ChIP-Seq data analysis. First, we identified very common artifacts associated with this method that had so far gone unsuspected. The distribution of sequences in the genome is not uniform, and we found extreme sites of sequence accumulation at specific regions of the human and mouse genomes. These accumulations of artifactual sequences will create false peaks in all ChIP-Seq data, and we propose different filtering methods to reduce the number of false positives. Next, we propose a new random sampling approach as a powerful tool for ChIP-Seq data analysis, which could increase the biological knowledge gained from ChIP-Seq data. 
We created a random sampling algorithm and used this method to reveal some of the important biological properties of the DNA-binding proteins called Nuclear Factor I (NFI). Finally, by analyzing in detail the ChIP-Seq data for the Nuclear Factor I family of transcription factors, we revealed that these proteins mainly act as activators of transcription and that they are associated with specific chromatin modifications that are markers of open chromatin. We think that NFI factors interact only with DNA wrapped around the nucleosome. We also found several genomic regions that indicate a possible chromatin barrier activity of NFI proteins, which could suggest the use of NFI binding sequences as insulator sequences in biotechnology applications.
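
The summary above mentions filtering artifactual hot-spots of tag accumulation but does not say which filters the thesis proposes. As a hedged illustration only, the sketch below uses one common and simple strategy, capping the number of tags kept at any single position and strand. The function name, the tag representation as `(chromosome, position, strand)` tuples, and the default cap are assumptions introduced here, not the thesis's method.

```python
from collections import Counter


def cap_tags_per_position(tags, max_per_position=1):
    """Keep at most `max_per_position` tags for each identical
    (chromosome, position, strand) tuple, preserving input order.
    Hypothetical filter for illustration; other filters are possible."""
    seen = Counter()
    kept = []
    for tag in tags:
        if seen[tag] < max_per_position:
            kept.append(tag)
            seen[tag] += 1
    return kept


# Toy data (assumed): one position piles up five identical tags.
tags = [("chr1", 100, "+")] * 5 + [("chr1", 200, "-")]
filtered = cap_tags_per_position(tags, max_per_position=1)
print(len(tags), len(filtered))
```

A cap of this kind removes pile-ups at single positions but would not catch hot-spots spread over a wider window, which is presumably why the summary speaks of several filtering methods.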

    Molecular characterization of a human matrix attachment region epigenetic regulator.

    Matrix attachment regions (MAR) generally act as epigenetic regulatory sequences that increase gene expression, and they were proposed to partition chromosomes into loop-forming domains. However, their molecular mode of action remains poorly understood. Here, we assessed the possible contribution of the AT-rich core and adjacent transcription factor binding motifs to the transcription augmenting and anti-silencing effects of human MAR 1-68. Either flanking sequences together with the AT-rich core were required to obtain the full MAR effects. Shortened MAR derivatives retaining full MAR activity were constructed from combinations of the AT-rich sequence and multimerized transcription factor binding motifs, implying that both transcription factors and the AT-rich microsatellite sequence are required to mediate the MAR effect. Genomic analysis indicated that MAR AT-rich cores may be depleted of histones and enriched in RNA polymerase II, providing a molecular interpretation of their chromatin domain insulator and transcriptional augmentation activities

    Nuclear factor I revealed as family of promoter binding transcription activators.

    ABSTRACT: BACKGROUND: Multiplex experimental assays coupled to computational predictions are being increasingly employed for the simultaneous analysis of many specimens at the genome scale, which quickly generates very large amounts of data. However, inferring valuable biological information from the comparisons of very large genomic datasets still represents an enormous challenge. RESULTS: As a study model, we chose the NFI/CTF family of mammalian transcription factors and we compared the results obtained from a genome-wide study of its binding sites with chromatin structure assays, gene expression microarray data, and in silico binding site predictions. We found that NFI/CTF family members preferentially bind DNA target sites located around transcription start sites, compared with control datasets generated by random subsampling of the complete set of NFI binding sites. NFI proteins preferentially associate with the upstream regions of genes that are highly expressed and that are enriched in active chromatin modifications such as H3K4me3 and H3K36me3. We postulate that this is a causal association and that NFI proteins mainly act as activators of transcription. This was documented for one member of the family (NFI-C), which proved to be a more potent gene activator than repressor in global gene expression analysis. Interestingly, we also discovered the association of NFI with the tri-methylation of lysine 9 of histone H3, a chromatin marker previously associated with the protection against silencing of telomeric genes by NFI. CONCLUSION: Taken together, we illustrate approaches that can be taken to analyze large genomic data, and provide evidence that NFI family members may act in conjunction with specific chromatin modifications to activate gene expression.

    Functional regulatory mechanism of smooth muscle cell-restricted LMOD1 coronary artery disease locus.

    Recent genome-wide association studies (GWAS) have identified multiple new loci which appear to alter coronary artery disease (CAD) risk via arterial wall-specific mechanisms. One of the annotated genes encodes LMOD1 (Leiomodin 1), a member of the actin filament nucleator family that is highly enriched in smooth muscle-containing tissues such as the artery wall. However, it is still unknown whether LMOD1 is the causal gene at this locus and how the associated variants alter LMOD1 expression/function and CAD risk. Using epigenomic profiling we recently identified a non-coding regulatory variant, rs34091558, which is in tight linkage disequilibrium (LD) with the lead CAD GWAS variant, rs2820315. Herein we demonstrate, through expression quantitative trait loci (eQTL) and statistical fine-mapping in GTEx, STARNET, and human coronary artery smooth muscle cell (HCASMC) datasets, that rs34091558 is the top regulatory variant for LMOD1 in vascular tissues. Position weight matrix (PWM) analyses identify the protective allele rs34091558-TA as forming a conserved Forkhead box O3 (FOXO3) binding motif, which is disrupted by the risk allele rs34091558-A. FOXO3 chromatin immunoprecipitation and reporter assays show reduced FOXO3 binding and LMOD1 transcriptional activity with the risk allele, consistent with effects of FOXO3 downregulation on LMOD1. LMOD1 knockdown results in increased proliferation and migration and decreased cell contraction in HCASMC, and immunostaining in atherosclerotic lesions in the SMC lineage tracing reporter mouse supports a key role for LMOD1 in maintaining the differentiated SMC phenotype. These results provide compelling functional evidence that genetic variation is associated with dysregulated LMOD1 expression/function in SMCs, together contributing to the heritable risk for CAD.

    Integrative functional genomics identifies regulatory mechanisms at coronary artery disease loci

    Coronary artery disease (CAD) is the leading cause of mortality and morbidity, driven by both genetic and environmental risk factors. Meta-analyses of genome-wide association studies have identified >150 loci associated with CAD and myocardial infarction susceptibility in humans. A majority of these variants reside in non-coding regions and are co-inherited with hundreds of candidate regulatory variants, presenting a challenge to elucidate their functions. Herein, we use integrative genomic, epigenomic and transcriptomic profiling of perturbed human coronary artery smooth muscle cells and tissues to begin to identify causal regulatory variation and mechanisms responsible for CAD associations. Using these genome-wide maps, we prioritize 64 candidate variants and perform allele-specific binding and expression analyses at seven top candidate loci: 9p21.3, SMAD3, PDGFD, IL6R, BMP1, CCDC97/TGFB1 and LMOD1. We validate our findings in expression quantitative trait loci cohorts, which together reveal new links between CAD associations and regulatory function in the appropriate disease context.