    The Insulator Binding Protein CTCF Positions 20 Nucleosomes around Its Binding Sites across the Human Genome

    Chromatin structure plays an important role in modulating the accessibility of genomic DNA to regulatory proteins in eukaryotic cells. We performed an integrative analysis on dozens of recent datasets generated by deep-sequencing and high-density tiling arrays, and we discovered an array of well-positioned nucleosomes flanking sites occupied by the insulator binding protein CTCF across the human genome. These nucleosomes are highly enriched for the histone variant H2A.Z and 11 histone modifications. The distances between the center positions of the neighboring nucleosomes are largely invariant, and we estimate them to be 185 bp on average. Surprisingly, subsets of nucleosomes that are enriched in different histone modifications vary greatly in the lengths of DNA protected from micrococcal nuclease cleavage (106–164 bp). The nucleosomes enriched in those histone modifications previously implicated to be correlated with active transcription tend to contain less protected DNA, indicating that these modifications are correlated with greater DNA accessibility. Another striking result obtained from our analysis is that nucleosomes flanking CTCF sites are much better positioned than those downstream of transcription start sites, the only genomic feature previously known to position nucleosomes genome-wide. This nucleosome-positioning phenomenon is not observed for other transcriptional factors for which we had genome-wide binding data. We suggest that binding of CTCF provides an anchor point for positioning nucleosomes, and chromatin remodeling is an important component of CTCF function

    Chromatin accessibility dynamics in the Arabidopsis root epidermis and endodermis during cold acclimation

    Understanding cell-type-specific transcriptional responses to environmental conditions is limited by a lack of knowledge of transcriptional control due to epigenetic dynamics. Additionally, cell-type analyses are limited by difficulties in applying current technologies to single cell types. A novel DNase-seq protocol and analysis procedure, termed DNase-DTS, was developed to identify DHSs in the Arabidopsis epidermis and endodermis under control and cold acclimation conditions. Results identified thousands of DHSs within each cell type and experimental condition. DHSs showed strong association with gene expression, DNA methylation, and histone modifications. A priori mapping of existing DNA binding motifs within accessible genes and the cold C-repeat/dehydration responsive element-binding factor pathway resulted in unique motif mapping patterns. In summary, a collection of endodermal and epidermal cold-acclimation-induced chromatin accessibility sites may be used to understand mechanisms of gene expression and to best design synthetic promoters.

    Analysis, Visualization, and Machine Learning of Epigenomic Data

    The goal of the Encyclopedia of DNA Elements (ENCODE) project has been to characterize all the functional elements of the human genome. These elements include expressed transcripts and genomic regions bound by transcription factors (TFs), occupied by nucleosomes, occupied by nucleosomes with modified histones, or hypersensitive to DNase I cleavage, etc. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is an experimental technique for detecting TF binding in living cells, and the genomic regions bound by TFs are called ChIP-seq peaks. ENCODE has performed and compiled results from tens of thousands of experiments, including ChIP-seq, DNase, RNA-seq and Hi-C. These efforts have culminated in two web-based resources from our lab, Factorbook and SCREEN, for the exploration of epigenomic data for both human and mouse. Factorbook is a peak-centric resource presenting data such as motif enrichment and histone modification profiles for transcription factor binding sites computed from ENCODE ChIP-seq data. SCREEN provides an encyclopedia of ~2 million regulatory elements, including promoters and enhancers, identified using ENCODE ChIP-seq and DNase data, with an extensive UI for searching and visualization. While we have successfully utilized the thousands of available ENCODE ChIP-seq experiments to build the Encyclopedia and visualizers, we have also struggled with the practical and theoretical inability to assay every possible experiment on every possible biosample under every conceivable biological scenario. We have used machine learning techniques to predict TF binding sites and enhancer locations, and demonstrate that machine learning is critical to help decipher functional regions of the genome.

    A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization

    The increasing quantity and quality of functional genomic information motivate the assessment and integration of these data with association data, including data originating from genome-wide association studies (GWAS). We used previously described GWAS signals ("hits") to train a regularized logistic model in order to predict SNP causality on the basis of a large multivariate functional dataset. We show how this model can be used to derive Bayes factors for integrating functional and association data into a combined Bayesian analysis. Functional characteristics were obtained from the Encyclopedia of DNA Elements (ENCODE), from published expression quantitative trait loci (eQTL), and from other sources of genome-wide characteristics. We trained the model using all GWAS signals combined, and also using phenotype specific signals for autoimmune, brain-related, cancer, and cardiovascular disorders. The non-phenotype specific and the autoimmune GWAS signals gave the most reliable results. We found SNPs with higher probabilities of causality from functional characteristics showed an enrichment of more significant p-values compared to all GWAS SNPs in three large GWAS studies of complex traits. We investigated the ability of our Bayesian method to improve the identification of true causal signals in a psoriasis GWAS dataset and found that combining functional data with association data improves the ability to prioritise novel hits. We used the predictions from the penalized logistic regression model to calculate Bayes factors relating to functional characteristics and supply these online alongside resources to integrate these data with association data

    Identification of Long-Range Regulatory Elements in the Human Genome

    Genome-wide association studies have shown that the majority of disease-associated genetic variants lie within non-coding regions of the human genome. A subsequent challenge is to identify how these variants modulate the risk of disease. Enhancers are non-coding regulatory elements that can be bound by proteins to activate the expression of a gene that may be linearly distant. Experimentally probing all possible enhancer–target gene pairs can be laborious. Hi-C, a technique developed by Job Dekker’s group in 2009, combines high-throughput sequencing with chromosome conformation capture to detect DNA interactions genome-wide and thereby reveals the three-dimensional architecture of chromatin in the nucleus. However, the utility of the datasets produced by this technique for discovering long-range regulatory interactions is largely unexplored. In this thesis, we develop novel approaches to identify DNA-interacting units and their interactions in Hi-C datasets with the goal of uncovering all enhancer–target gene interactions. We began by identifying significantly interacting regions in these datasets, subsequently focusing on candidate enhancer–gene pairs. We found that the identified putative enhancers are enriched for p300 binding activity, while their target promoters are likely to be cell-type-specific. Furthermore, we revealed that enhancers and target genes often interact in many-to-many relationships and the majority of enhancer–target gene interactions are intra-chromosomal and within 1 Mb of each other. Next, we refined our analytical approach to identify physically-interacting DNA regions at ~1 kb resolution and better define the boundaries of likely enhancer elements. By searching for over-represented sequences (motifs) in these putative promoter-interacting enhancers, we were then able to identify bound transcription factors. This newer approach provides the potential to identify protein complexes involved in enhancer–promoter interactions, which can be verified in future experiments. We implemented a high-throughput identification pipeline for promoter-interacting enhancer elements (HIPPIE) using both of the above described approaches. HIPPIE can be run efficiently on typical Linux servers and grid computing environments and is available as open-source software. In summary, our findings demonstrate the potential utility of Hi-C technologies for elucidating the mechanisms by which long-range enhancers regulate gene expression and ultimately result in human disease phenotypes.

    Role of growth hormone and chromatin structure in regulation of sex differences in mouse liver gene expression

    Sex differences in mammalian gene expression result from differences in genotypic sex as well as in hormonal regulators between males and females. In rat, mouse and human liver, ~1000 genes are expressed in a sex-dependent manner, and contribute to sex differences in metabolism of drugs, steroids and lipids, and in liver and cardiovascular disease risk. In rats and mice, sex-biased liver gene expression is primarily dictated by the sexually dimorphic pattern of pituitary growth hormone (GH) release and its STAT5-dependent transcriptional activities. Studies presented in this thesis include the following. (1) A computational approach based on DNA sequence and phylogenetic conservation was developed and used to identify novel functional STAT5 binding sites - both consensus and non-consensus STAT5 sequences - near prototypic GH-responsive genes. (2) Global gene expression analysis of livers from pituitary-ablated male and female mice identified four major classes of sex-biased genes differing in their profiles of GH dependence. (3) Sex-differences in DNase-hypersensitive sites (DHS, corresponding to open chromatin regions) were identified genome-wide in mouse liver. These sex-differential DHSs were enriched for association with sex-biased genes, but a majority was distant from sex-biased genes. Furthermore, many were responsive to GH treatment, demonstrating that GH-mediated regulation involves chromatin remodeling. Analysis of sequence motifs enriched at sex-biased DHSs implicated STAT5 and novel transcription factors such as PBX1 and TAL1 in sex-biased gene regulation. (4) Genome-wide mapping of histone modifications revealed distinct mechanisms of sex-biased gene regulation in male and female liver: sex-dependent K27me3-mediated repression is an important mechanism of repression of female-biased, but not of male-biased, genes, and a sex-dependent K4me1 distribution, suggesting nucleosome repositioning by pioneer factors, is observed at male-biased, but not female-biased, regulatory sites. STAT5-mediated activation was most strongly associated with sex-biased chromatin modifications, while BCL6-mediated repression primarily occurs in association with sex-independent chromatin modifications, both at binding sites and at target genes. The relationships between sex-dependent chromatin accessibility, chromatin modifications and transcription-factor binding uncovered by these studies help elucidate the molecular mechanisms governing sex-differential gene expression, and underscore the utility of functional genomic and epigenetic studies as tools for elucidating transcriptional regulation in complex mammalian systems

    A User's Guide to the Encyclopedia of DNA Elements (ENCODE)

    The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome

    The role of chromatin accessibility in directing the widespread, overlapping patterns of Drosophila transcription factor binding

    Background: In Drosophila embryos, many biochemically and functionally unrelated transcription factors bind quantitatively to highly overlapping sets of genomic regions, with much of the lowest levels of binding being incidental, non-functional interactions on DNA. The primary biochemical mechanisms that drive these genome-wide occupancy patterns have yet to be established. Results: Here we use data resulting from the DNaseI digestion of isolated embryo nuclei to provide a biophysical measure of the degree to which proteins can access different regions of the genome. We show that the in vivo binding patterns of 21 developmental regulators are quantitatively correlated with DNA accessibility in chromatin. Furthermore, we find that levels of factor occupancy in vivo correlate much more with the degree of chromatin accessibility than with occupancy predicted from in vitro affinity measurements using purified protein and naked DNA. Within accessible regions, however, the intrinsic affinity of the factor for DNA does play a role in determining net occupancy, with even weak affinity recognition sites contributing. Finally, we show that programmed changes in chromatin accessibility between different developmental stages correlate with quantitative alterations in factor binding. Conclusions: Based on these and other results, we propose a general mechanism to explain the widespread, overlapping DNA binding by animal transcription factors. In this view, transcription factors are expressed at sufficiently high concentrations in cells such that they can occupy their recognition sequences in highly accessible chromatin without the aid of physical cooperative interactions with other proteins, leading to highly overlapping, graded binding of unrelated factors.

    The Determinants of Nucleosome Patterns and the Impact of Phosphate Starvation on Nucleosome Patterns and Gene Expression in Rice

    In eukaryotic cells, DNA is a large molecule that must be greatly condensed to fit within the nucleus. DNA is wrapped around histone proteins to form nucleosomes, which facilitate DNA condensation but may also limit DNA processes. Organisms must respond to environmental stress in order to survive, and one strategy is to remodel nucleosomes, changing DNA accessibility to alter gene expression. Studies have demonstrated a clear correlation between nucleosome dynamics and transcriptional change in some eukaryotes; however, the factors that affect nucleosome positioning in plants are largely unknown, and the correlation between nucleosome dynamics and transcriptional changes in response to environmental perturbation remains unclear. We report a high-resolution map of nucleosome patterns in the rice (Oryza sativa) genome by deep sequencing of micrococcal nuclease digested chromatin. The results reveal that nucleosome patterns at rice genes were affected by both cis- and trans- determinants, including GC content and transcription. A negative correlation between nucleosome occupancy across the transcription start site (TSS) and transcription was observed, and the nucleosome patterns across the TSS were correlated with distinct functional categories of genes. A parallel experiment monitored nucleosome dynamics and transcription changes in response to phosphate starvation for 24 hours. Phosphate starvation resulted in numerous instances of nucleosome dynamics across the genome, which were enhanced at differentially expressed genes. This work demonstrates that rice nucleosome patterns are suggestive of gene functions, and reveals a link between chromatin remodeling and transcriptional changes in response to deficiency of a major macronutrient. The findings help to enhance the understanding of eukaryotic gene regulation at the chromatin level.