
    Wellington : a novel method for the accurate identification of digital genomic footprints from DNase-seq data

    The expression of eukaryotic genes is regulated by cis-regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins. One of the great challenges in the gene regulation field is to characterise these elements. This involves the identification of transcription factor (TF) binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase and the subsequent analysis of regions protected from cleavage (DNase footprinting) has for many years been used to identify specific binding sites occupied by TFs at individual cis-elements with high resolution. This methodology has recently been adapted for high-throughput sequencing (DNase-seq). In this study, we describe an imbalance in the DNA strand-specific alignment information of DNase-seq data surrounding protein–DNA interactions that allows accurate prediction of occupied TF binding sites. Our study introduces a novel algorithm, Wellington, which considers the imbalance in this strand-specific information to efficiently identify DNA footprints. This algorithm significantly enhances specificity by reducing the proportion of false positives and requires significantly fewer predictions than previously reported methods to recapitulate an equal amount of ChIP-seq data. We also provide an open-source software package, pyDNase, which implements the Wellington algorithm to interface with DNase-seq data and expedite analyses
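The strand-imbalance idea in this abstract can be illustrated with a small sketch. This is not the pyDNase implementation: the function names, the one-sided binomial test, and the uniform null model are assumptions made here for illustration. The sketch asks whether forward-strand cuts pile up in the shoulder upstream of a candidate footprint, and reverse-strand cuts in the shoulder downstream of it, more than a uniform spread of cuts would predict.

```python
from math import comb


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def strand_imbalance_score(fwd_cuts, rev_cuts, start, end, shoulder):
    """Score a candidate footprint [start, end) from per-base cut counts.

    fwd_cuts / rev_cuts: cut counts per position on the + and - strands.
    shoulder: width of the flanking regions (hypothetical parameter).
    Returns the product of two one-sided binomial p-values; smaller values
    mean a stronger strand imbalance around the candidate site.
    """
    if start - shoulder < 0 or end + shoulder > len(fwd_cuts):
        raise ValueError("shoulders fall outside the region")
    fp_len = end - start
    up_fwd = sum(fwd_cuts[start - shoulder:start])   # + strand, upstream shoulder
    fp_fwd = sum(fwd_cuts[start:end])                # + strand, inside footprint
    down_rev = sum(rev_cuts[end:end + shoulder])     # - strand, downstream shoulder
    fp_rev = sum(rev_cuts[start:end])                # - strand, inside footprint
    # Assumed null: cuts fall uniformly over shoulder + footprint, so a cut
    # lands in the shoulder with probability shoulder / (shoulder + fp_len).
    p_null = shoulder / (shoulder + fp_len)
    p_fwd = binom_sf(up_fwd, up_fwd + fp_fwd, p_null)
    p_rev = binom_sf(down_rev, down_rev + fp_rev, p_null)
    return p_fwd * p_rev
```

Scanning a hypersensitive region with several footprint and shoulder widths and keeping the lowest-scoring windows would be one way to use such a score; the thresholds and widths would have to be chosen for the data at hand.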

    Chromatin accessibility dynamics in the Arabidopsis root epidermis and endodermis during cold acclimation

    Understanding cell-type-specific transcriptional responses to environmental conditions is limited by a lack of knowledge of transcriptional control due to epigenetic dynamics. Additionally, cell-type analyses are limited by difficulties in applying current technologies to single cell types. A novel DNase-seq protocol and analysis procedure, termed DNase-DTS, was developed to identify DHSs in the Arabidopsis epidermis and endodermis under control and cold acclimation conditions. Results identified thousands of DHSs within each cell type and experimental condition. DHSs showed strong association with gene expression, DNA methylation, and histone modifications. A priori mapping of existing DNA binding motifs within accessible genes and the cold C-repeat/dehydration responsive element-binding factor pathway resulted in unique motif mapping patterns. In summary, a collection of endodermal and epidermal cold acclimation induced chromatin accessibility sites may be used to understand mechanisms of gene expression and to best design synthetic promoters.

    Characterization of chromatin accessibility with a transposome hypersensitive sites sequencing (THS-seq) assay.

    Chromatin accessibility captures in vivo protein-chromosome binding status, and is considered an informative proxy for protein-DNA interactions. DNase I and Tn5 transposase assays require thousands to millions of fresh cells for comprehensive chromatin mapping. Applying Tn5 tagmentation to hundreds of cells results in sparse chromatin maps. We present a transposome hypersensitive sites sequencing assay for highly sensitive characterization of chromatin accessibility. Linear amplification of accessible DNA ends with in vitro transcription, coupled with an engineered Tn5 super-mutant, demonstrates improved sensitivity on limited input materials, and accessibility of small regions near distal enhancers, compared with ATAC-seq

    DeFCoM: analysis and modeling of transcription factor binding sites using a motif-centric genomic footprinter

    Identifying the locations of transcription factor binding sites is critical for understanding how gene transcription is regulated across different cell types and conditions. Chromatin accessibility experiments such as DNaseI sequencing (DNase-seq) and Assay for Transposase Accessible Chromatin sequencing (ATAC-seq) produce genome-wide data that include distinct "footprint" patterns at binding sites. Nearly all existing computational methods to detect footprints from these data assume that footprint signals are highly homogeneous across footprint sites. Additionally, a comprehensive and systematic comparison of footprinting methods for specifically identifying which motif sites for a specific factor are bound has not been performed. Using DNase-seq data from the ENCODE project, we show that a large degree of previously uncharacterized site-to-site variability exists in footprint signal across motif sites for a transcription factor. To model this heterogeneity in the data, we introduce a novel, supervised learning footprinter called DeFCoM (Detecting Footprints Containing Motifs). We compare DeFCoM to nine existing methods using evaluation sets from four human cell lines and eighteen transcription factors and show that DeFCoM outperforms current methods in determining bound and unbound motif sites. We also analyze the impact of several biological and technical factors on the quality of footprint predictions to highlight important considerations when conducting footprint analyses and assessing the performance of footprint prediction methods. Lastly, we show that DeFCoM can detect footprints using ATAC-seq data with accuracy similar to that obtained using DNase-seq data. Python code is available at https://bitbucket.org/bryancquach/defcom. Supplementary information is available at Bioinformatics online.

    Dynamic GATA4 enhancers shape the chromatin landscape central to heart development and disease.

    How stage-specific enhancer dynamics modulate gene expression patterns essential for organ development, homeostasis and disease is not well understood. Here, we addressed this question by mapping chromatin occupancy of GATA4, a master cardiac transcription factor, in heart development and disease. We find that GATA4 binds and participates in establishing active chromatin regions by stimulating H3K27ac deposition, which facilitates GATA4-driven gene expression. GATA4 chromatin occupancy changes markedly between fetal and adult heart, with limited overlap in binding sites. Cardiac stress restored GATA4 occupancy to a subset of fetal sites, but many stress-associated GATA4 binding sites localized to loci not occupied by GATA4 during normal heart development. Collectively, our data show that dynamic, context-specific transcription factor occupancy underlies stage-specific events in development, homeostasis and disease.

    Reproducible inference of transcription factor footprints in ATAC-seq and DNase-seq datasets via protocol-specific bias modeling

    DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBS) in regulatory regions via footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq. Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases, in conjunction with the sequence content of TFBSs, impact the discrimination of footprints from background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints.
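To make the notion of protocol-specific sequence bias concrete, here is a minimal sketch of one generic way to estimate and divide out a k-mer cleavage bias. It is not the method of the study above: the function names, the choice to index each k-mer by its start position, and the simple ratio estimator are all assumptions for illustration. Published approaches typically centre a longer k-mer on the cut site and estimate the bias from genome-wide or naked-DNA data.

```python
from collections import Counter


def kmer_bias(seq, cuts, k=2):
    """Estimate cleavage bias per k-mer.

    bias[kmer] = (share of all cuts at that k-mer) / (share of positions
    carrying that k-mer). A value above 1 means the enzyme cuts there more
    often than sequence composition alone would predict.
    """
    cut_counts = Counter()
    bg_counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        bg_counts[kmer] += 1
        cut_counts[kmer] += cuts[i]
    total_cuts = sum(cut_counts.values())
    total_bg = sum(bg_counts.values())
    if total_cuts == 0:
        raise ValueError("no cuts observed; bias cannot be estimated")
    return {
        kmer: (cut_counts[kmer] / total_cuts) / (bg_counts[kmer] / total_bg)
        for kmer in bg_counts
    }


def bias_corrected(seq, cuts, bias, k=2):
    """Divide each observed cut count by the bias of the k-mer at that position."""
    corrected = []
    for i in range(len(seq) - k + 1):
        b = bias.get(seq[i:i + k], 1.0)  # unseen k-mers are left uncorrected
        corrected.append(cuts[i] / b if b > 0 else 0.0)
    return corrected
```

Estimating the bias separately for DNase-seq and ATAC-seq reads, and applying each table only to its own protocol, mirrors the protocol-specific correction the abstract describes, although the estimator used in practice may differ.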

    CHD7 Targets Active Gene Enhancer Elements to Modulate ES Cell-Specific Gene Expression

    CHD7 is one of nine members of the chromodomain helicase DNA–binding domain family of ATP–dependent chromatin remodeling enzymes found in mammalian cells. De novo mutation of CHD7 is a major cause of CHARGE syndrome, a genetic condition characterized by multiple congenital anomalies. To gain insight into the function of CHD7, we used the technique of chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP–Seq) to map CHD7 sites in mouse ES cells. We identified 10,483 sites on chromatin bound by CHD7 at high confidence. Most of the CHD7 sites show features of gene enhancer elements. Specifically, CHD7 sites are predominantly located distal to transcription start sites, contain high levels of H3K4 mono-methylation, are found within open chromatin that is hypersensitive to DNase I digestion, and correlate with ES cell-specific gene expression. Moreover, CHD7 co-localizes with P300, a known enhancer-binding protein and strong predictor of enhancer activity. Correlations with 18 other factors mapped by ChIP–seq in mouse ES cells indicate that CHD7 also co-localizes with ES cell master regulators OCT4, SOX2, and NANOG. Correlations between CHD7 sites and global gene expression profiles obtained from Chd7+/+, Chd7+/−, and Chd7−/− ES cells indicate that CHD7 functions at enhancers as a transcriptional rheostat to modulate, or fine-tune, the expression levels of ES–specific genes. CHD7 can modulate genes in either the positive or negative direction, although negative regulation appears to be the more direct effect of CHD7 binding. These data indicate that enhancer-binding proteins can limit gene expression and are not necessarily co-activators. Although ES cells are not likely to be affected in CHARGE syndrome, we propose that enhancer-mediated gene dysregulation contributes to disease pathogenesis and that the critical CHD7 target genes may be subject to positive or negative regulation.

    Rapid, solid-phase based automated analysis of chromatin structure and transcription factor occupancy in living eukaryotic cells

    Transcription factors, chromatin components and chromatin modification activities are involved in many diseases including cancer. However, the means by which alterations in these factors influence the epigenotype of specific cell types is poorly understood. One problem that limits progress is that regulatory regions of eukaryotic genes sometimes extend over large regions of DNA. To improve chromatin structure–function analysis over such large regions, we have developed an automated, relatively simple procedure that uses magnetic beads and a capillary sequencer for ligation-mediated-PCR (LM-PCR). We show that the procedure can be used for the rapid examination of chromatin fine-structure, nucleosome positioning as well as changes in transcription factor binding-site occupancy during cellular differentiation