Comprehensive analysis of epigenetic signatures of human transcription control
Advances in sequencing technologies have enabled exploration of epigenetic and transcriptional profiles at a genome-wide level. The epigenetic and transcriptional landscapes are now available in hundreds of mammalian cell and tissue contexts. Many studies have performed multi-omics analyses using these datasets to enhance our understanding of relationships between epigenetic modifications and transcription regulation. Nevertheless, most studies so far have focused on the promoters/enhancers and transcription start sites, and other features of transcription control including exons, introns and transcription termination remain underexplored. We investigated the interplay between epigenetic modifications and diverse transcription features using the data generated by the Roadmap Epigenomics project. A comprehensive analysis of histone modifications, DNA methylation, and RNA-seq data of thirty-three human cell lines and tissue types allowed us to confirm the generality of previously described relationships, as well as to generate new hypotheses about the interplay between epigenetic modifications and transcription features. Importantly, our analysis included previously under-explored features of transcription control, namely, transcription termination sites, exon-intron boundaries, and the exon inclusion ratio. We have made the analyses freely available to the scientific community at joshiapps.cbu.uib.no/perepigenomics_app/ for easy exploration, validation and hypothesis generation.
CpG island erosion, polycomb occupancy and sequence motif enrichment at bivalent promoters in mammalian embryonic stem cells
In embryonic stem (ES) cells, developmental regulators have a characteristic bivalent chromatin signature marked by simultaneous presence of both activation (H3K4me3) and repression (H3K27me3) signals and are thought to be in a 'poised' state for subsequent activation or silencing during differentiation. We collected eleven pairs (H3K4me3 and H3K27me3) of ChIP sequencing datasets in human ES cells and eight pairs in murine ES cells, and predicted high-confidence (HC) bivalent promoters. Over 85% of H3K27me3 marked promoters were bivalent in human and mouse ES cells. We found that (i) HC bivalent promoters were enriched for developmental factors and were highly likely to be differentially expressed upon transcription factor perturbation; (ii) murine HC bivalent promoters were occupied by both polycomb repressive component classes (PRC1 and PRC2) and grouped into four distinct clusters with different biological functions; (iii) HC bivalent and active promoters were CpG rich while H3K27me3-only promoters lacked CpG islands. Binding enrichment of distinct sets of regulators distinguished bivalent from active promoters. Moreover, a 'TCCCC' sequence motif was specifically enriched in bivalent promoters. Finally, this analysis will serve as a resource for future studies to further understand transcriptional regulation during embryonic development
Heat*seq: an interactive web tool for high-throughput sequencing experiment comparison with public data
Better protocols and decreasing costs have made high-throughput sequencing experiments accessible even to small experimental laboratories. However, comparing one or a few experiments generated by an individual lab to the vast amount of relevant data freely available in the public domain might be limited due to a lack of bioinformatics expertise. Though several tools, including genome browsers, allow such comparison at a single gene level, they do not provide a genome-wide view. We developed Heat*seq, a web tool that allows genome-scale comparison of high-throughput experiments (chromatin immuno-precipitation followed by sequencing, RNA-sequencing and Cap Analysis of Gene Expression) provided by a user to the data in the public domain. Heat*seq currently contains over 12 000 experiments across diverse tissues and cell types in human, mouse and drosophila. Heat*seq displays interactive correlation heatmaps, with an ability to dynamically subset datasets to contextualize user experiments. High quality figures and tables are produced and can be downloaded in multiple formats.
Causal Transcription Regulatory Network Inference Using Enhancer Activity as a Causal Anchor
Transcription control plays a crucial role in establishing a unique gene expression signature for each of the hundreds of mammalian cell types. Though gene expression data has been widely used to infer the cellular regulatory networks, the methods mainly infer correlations rather than causality. We propose that a causal inference framework successfully used for eQTL data can be extended to infer causal regulatory networks using enhancers as causal anchors and enhancer RNA expression as a readout of enhancer activity. We developed statistical models and likelihood-ratio tests to infer causal gene regulatory networks using enhancer RNA (eRNA) expression information as a causal anchor and applied the framework to eRNA and transcript expression data from the FANTOM consortium. Predicted causal targets of transcription factors (TFs) in mouse embryonic stem cells, macrophages and erythroblastic leukemia overlapped significantly with experimentally validated targets from ChIP-seq and perturbation data. We further improved the model by taking into account that some TFs might act in a quantitative, dosage-dependent manner, whereas others might act predominantly in a binary on/off fashion. We predicted TF targets from concerted variation of eRNA and TF and target promoter expression levels within a single cell type as well as across multiple cell types. Importantly, TFs with high-confidence predictions were largely different between these two analyses, demonstrating that variability within a cell type is highly relevant for target prediction of cell type specific factors. Finally, we generated a compendium of high-confidence TF targets across diverse human cell and tissue types.
Insights into mammalian transcription control by systematic analysis of ChIP sequencing data
Background: Transcription regulation is a major controller of gene expression dynamics during development and disease, where transcription factors (TFs) modulate expression of genes through direct or indirect DNA interaction. ChIP sequencing has become the most widely used technique to get a genome-wide view of TF occupancy in a cell type of interest, mainly due to established standard protocols and a rapid decrease in the cost of sequencing. The number of available ChIP sequencing datasets in the public domain is therefore ever increasing, including data generated by individual labs together with consortia such as the ENCODE project. Results: A total of 1735 ChIP-sequencing datasets in mouse and human cell types and tissues were used to perform bioinformatic analyses to unravel diverse features of transcription control. (1) We used the Heat*seq web tool to investigate global relations across the ChIP-seq samples. (2) We demonstrated that factors have specific genomic location preferences that are, for most factors, conserved across species. (3) Promoter-proximal binding of factors was more conserved across cell types, while distal binding sites were more cell type specific. (4) We identified combinations of factors preferentially acting together in a cellular context. (5) Finally, by integrating the data with disease-associated gene loci from GWAS studies, we highlight the value of this data to associate novel regulators with disease. Conclusion: In summary, we demonstrate how ChIP sequencing data integration and analysis can provide new insights into mammalian transcription control, and demonstrate the utility of various bioinformatic tools to generate novel testable hypotheses using this public resource.
Variable reproducibility in genome-scale public data: A case study using ENCODE ChIP sequencing resource
Genome-wide data is accumulating at an unprecedented rate in the public domain. Re-mining this data shows great potential to generate novel hypotheses. However, this approach depends on the quality (technical and biological) of the underlying data. Here we performed a systematic analysis of chromatin immunoprecipitation (ChIP) sequencing data of transcription and epigenetic factors from the Encyclopedia of DNA Elements (ENCODE) resource to demonstrate that about one third of conditions with replicates show low concordance between replicate peak lists. This serves as a case study to demonstrate a caveat concerning genome-wide analyses and highlights a need to validate the quality of each sample before performing further associative analyses.
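To make "concordance between replicate peak lists" concrete, here is a minimal sketch under stated assumptions. Peaks are represented as `(chrom, start, end)` tuples with half-open coordinates, and concordance is taken as the smaller of the two directional overlap fractions. This metric and all function names are illustrative choices; the abstract does not specify how concordance was measured in the study.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b."""
    if not peaks_a:
        return 0.0
    hits = sum(any(overlaps(p, q) for q in peaks_b) for p in peaks_a)
    return hits / len(peaks_a)


def replicate_concordance(rep1, rep2):
    """Conservative concordance: the smaller of the two directional fractions."""
    return min(fraction_overlapping(rep1, rep2),
               fraction_overlapping(rep2, rep1))
```

The pairwise scan is quadratic in the number of peaks, which is acceptable for a toy example; real peak lists would normally be compared with a sorted sweep or an interval-tree based tool.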
Repression of PLA2R1 by c-MYC and HIF-2alpha promotes cancer growth
Loss of secreted phospholipase A2 receptor (PLA2R1) has recently been found to render human primary cells more resistant to senescence, whereas increased PLA2R1 expression is able to induce cell cycle arrest, cancer cell death or blockage of cancer cell transformation in vitro, suggesting that PLA2R1 displays tumor suppressive activities. Here we report that PLA2R1 expression strongly decreases in samples of human renal cell carcinoma (RCC). Knockdown of PLA2R1 increases renal cancer cell tumorigenicity, supporting a role for PLA2R1 loss in promoting in vivo RCC growth. Most RCC result from Von Hippel-Lindau (VHL) tumor suppressor loss-of-function and subsequent gain-of-function of the oncogenic HIF-2alpha/c-MYC pathway. Here, by genetically manipulating VHL, HIF-2alpha and c-MYC, we demonstrate that loss of VHL, stabilization of HIF-2alpha and subsequent increased c-MYC activity, binding and transcriptional repression, through induction of PLA2R1 DNA methylation close to the PLA2R1 transcriptional start site, result in decreased PLA2R1 transcription. Our results describe for the first time an oncogenic pathway leading to PLA2R1 transcriptional repression and the importance of this repression for tumor growth.