20 research outputs found

    Dynamics of bivalent chromatin during development in mammals

    Mammalian cell types and tissues have diverse functional roles within an organism but can be derived by differentiation of embryonic stem cells (ESCs). ESCs are pluripotent cells with self-renewal properties. During development, subsets of genes in ESCs are activated or silenced for manifestation of cell-type-specific function. Gene expression changes occur transiently in early developmental stages, through signals received and executed by a variety of transcription factors (TFs), regulatory elements (promoters, enhancers) and epigenetic modifications of chromatin. Post-translational modifications of the histone tails are regulated by chromatin modifiers and transform the chromatin architecture. Polycomb (PcG) and Trithorax (TrxG) group proteins are the most commonly studied histone modifiers. They were first discovered as repressors (H3K27me3) and activators (H3K4me3) respectively of Homeobox (Hox) genes in Drosophila and they are conserved in mammals. Bivalent chromatin is defined as the simultaneous presence of silencing (H3K27me3) and activating (H3K4me3) histone marks and was first discovered as a feature of many developmental gene promoters of ESCs. Bivalent promoters are thought to be in a ‘poised’ state for later activation or repression during differentiation due to the presence of the two counter-acting histone modifications and a pausing variant of RNA polymerase II (RNAPII) accompanied by intermediate-low levels of expression. By integrative analysis of publicly available ChIP sequencing (ChIP-seq) datasets in murine and human ESCs, we predicted 3,659 and 4,979 high-confidence (HC) bivalent promoters in mouse and human ESCs respectively. Using a peak-based method, we acquired a set of bivalent promoters with high enrichment for developmental regulators. Over 85% of Polycomb targets were bivalent and their expression was particularly sensitive to TF perturbation.
Moreover, murine HC bivalent promoters were occupied by both Polycomb repressive component classes (PRC1 and PRC2) and grouped into four distinct clusters with different biological functions. HC bivalent and active promoters were CpG rich while H3K27me3-only promoters lacked CpG islands. Binding enrichment of distinct sets of regulators distinguished bivalent from active promoters and a ‘TCCCC’ sequence motif was specifically enriched in bivalent promoters. Using the recent technology of single cell RNA sequencing (scRNA-seq) we focused on gene expression heterogeneity and how it may affect the output of differentiation. We collected single cell gene expression profiles for 32 human and 39 murine ESCs and studied the correlation between diverse characteristics such as network connectivity and coefficient of variation (CV) across single cells. We further characterized properties unique to genes with high CV. Highly expressed genes tended to have a low CV and were enriched for cell cycle genes. In contrast, High CV genes were co-expressed with other High CV genes, were enriched for bivalent promoters and showed enrichment for response to DNA damage and DNA repair. Bivalent promoters in ESCs grouped in four distinct classes of variable biological functions according to Polycomb occupancy and three RNAPII variants. To study the dynamics of epigenetic and transcription control at promoters during development, we collected ChIP-seq data for two chromatin modifications (H3K4me3 and H3K27me3) and RNAPII (8WG16 antibody) as well as expression data (RNA-seq) across 8 cell types (ESCs and seven committed cell types) in mouse. Hierarchical clustering of 22,179 unique gene promoters across cell types showed that H3K4me3 peaks are in agreement with the expression data while H3K27me3 and RNAPII peaks were not highly consistent with the hierarchical tree of gene expression.
Unsupervised clustering of ChIP-seq and RNA-seq profiles resulted in 31 distinct profiles, which were subsequently narrowed down to nine major profile groups across cell types. TF enrichment at individual clusters using ChIP sequencing data did not fully agree with the classification of 8 major profile groups. Considering all the above results, three major epigenetic profiles (active, bivalent and latent) seem to be conserved across the species and cell types in our study. These states could recapitulate only a fraction of the transcriptional information (adding other chromatin marks could enrich it) since they are seemingly unaffected by their respective expression profiles. The H3K27me3-only state has low CpG density and shows stronger signatures in differentiated cell types. Transcriptional control is tighter in active than bivalent promoters and the different occupancy levels of PcG subunits and RNAPII can be reflected in the expression variance of bivalent genes, where a fraction of them are involved in developmental functions while others are more tissue-specific. Last, there is a striking similarity in the pausing patterns of RNAPII in the progenitor cell types, which suggests that RNAPII pausing is correlated with the developmental potential of the cell type. Finally, this analysis will serve as a resource for future studies to further understand transcriptional regulation during development.

    CpG island erosion, polycomb occupancy and sequence motif enrichment at bivalent promoters in mammalian embryonic stem cells

    In embryonic stem (ES) cells, developmental regulators have a characteristic bivalent chromatin signature marked by simultaneous presence of both activation (H3K4me3) and repression (H3K27me3) signals and are thought to be in a 'poised' state for subsequent activation or silencing during differentiation. We collected eleven pairs (H3K4me3 and H3K27me3) of ChIP sequencing datasets in human ES cells and eight pairs in murine ES cells, and predicted high-confidence (HC) bivalent promoters. Over 85% of H3K27me3 marked promoters were bivalent in human and mouse ES cells. We found that (i) HC bivalent promoters were enriched for developmental factors and were highly likely to be differentially expressed upon transcription factor perturbation; (ii) murine HC bivalent promoters were occupied by both polycomb repressive component classes (PRC1 and PRC2) and grouped into four distinct clusters with different biological functions; (iii) HC bivalent and active promoters were CpG rich while H3K27me3-only promoters lacked CpG islands. Binding enrichment of distinct sets of regulators distinguished bivalent from active promoters. Moreover, a 'TCCCC' sequence motif was specifically enriched in bivalent promoters. Finally, this analysis will serve as a resource for future studies to further understand transcriptional regulation during embryonic development
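
The peak-based definition of bivalency described above can be illustrated with a minimal sketch. It assumes promoters and peaks are given as (chromosome, start, end) intervals, that a promoter carries a mark if any peak overlaps it, and that "high confidence" means bivalent in at least `min_pairs` dataset pairs. The function names, the toy coordinates and the `min_pairs` rule are hypothetical illustrations, not the procedure used in the paper.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_promoters(
    promoters: Dict[str, Interval],
    k4_peaks: List[Interval],
    k27_peaks: List[Interval],
) -> Dict[str, str]:
    """Label each promoter by which of the two marks overlap it."""
    labels = {}
    for gene, region in promoters.items():
        has_k4 = any(overlaps(region, p) for p in k4_peaks)
        has_k27 = any(overlaps(region, p) for p in k27_peaks)
        if has_k4 and has_k27:
            labels[gene] = "bivalent"
        elif has_k4:
            labels[gene] = "active"
        elif has_k27:
            labels[gene] = "H3K27me3-only"
        else:
            labels[gene] = "unmarked"
    return labels


def high_confidence_bivalent(
    per_pair_labels: List[Dict[str, str]], min_pairs: int
) -> Set[str]:
    """Genes called bivalent in at least min_pairs dataset pairs (assumed rule)."""
    counts: Dict[str, int] = {}
    for labels in per_pair_labels:
        for gene, label in labels.items():
            if label == "bivalent":
                counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, n in counts.items() if n >= min_pairs}


# Toy data: three promoters and two (H3K4me3, H3K27me3) dataset pairs.
promoters = {
    "GeneA": ("chr1", 1000, 2000),
    "GeneB": ("chr1", 5000, 6000),
    "GeneC": ("chr2", 100, 1100),
}
labels_pair1 = classify_promoters(
    promoters,
    k4_peaks=[("chr1", 1500, 1800), ("chr1", 5200, 5400)],
    k27_peaks=[("chr1", 900, 1200), ("chr2", 0, 500)],
)
labels_pair2 = classify_promoters(
    promoters,
    k4_peaks=[("chr1", 1100, 1300)],
    k27_peaks=[("chr1", 1900, 2500)],
)
hc_bivalent = high_confidence_bivalent([labels_pair1, labels_pair2], min_pairs=2)
```

In this toy example only GeneA carries both marks in both dataset pairs. Real analyses would typically use an interval library rather than the quadratic scan shown here.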

    Heat*seq: an interactive web tool for high-throughput sequencing experiment comparison with public data

    Better protocols and decreasing costs have made high-throughput sequencing experiments accessible even to small experimental laboratories. However, comparing one or a few experiments generated by an individual lab to the vast amount of relevant data freely available in the public domain can be limited by a lack of bioinformatics expertise. Though several tools, including genome browsers, allow such comparison at a single-gene level, they do not provide a genome-wide view. We developed Heat*seq, a web tool that allows genome-scale comparison of high-throughput experiments (chromatin immunoprecipitation followed by sequencing, RNA sequencing and Cap Analysis of Gene Expression) provided by a user to the data in the public domain. Heat*seq currently contains over 12 000 experiments across diverse tissues and cell types in human, mouse and Drosophila. Heat*seq displays interactive correlation heatmaps, with an ability to dynamically subset datasets to contextualize user experiments. High-quality figures and tables are produced and can be downloaded in multiple formats.

    Gene expression variability in mammalian embryonic stem cells using single cell RNA-seq data

    BACKGROUND: Gene expression heterogeneity contributes to development as well as disease progression. Due to technological limitations, most studies to date have focused on differences in mean expression across experimental conditions, rather than differences in gene expression variance. The advent of single cell RNA sequencing has now made it feasible to study gene expression heterogeneity and to characterise genes based on their coefficient of variation. METHODS: We collected single cell gene expression profiles for 32 human and 39 mouse embryonic stem cells and studied correlation between diverse characteristics such as network connectivity and coefficient of variation (CV) across single cells. We further systematically characterised properties unique to High CV genes. RESULTS: Highly expressed genes tended to have a low CV and were enriched for cell cycle genes. In contrast, High CV genes were co-expressed with other High CV genes, were enriched for bivalent (H3K4me3 and H3K27me3) marked promoters and showed enrichment for response to DNA damage and DNA repair. CONCLUSIONS: Taken together, this analysis demonstrates the divergent characteristics of genes based on their CV. High CV genes tend to form co-expression clusters and they explain bivalency at least in part.
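
As a rough illustration of the quantity this study is built on, the coefficient of variation of a gene is the standard deviation of its expression across single cells divided by its mean. The sketch below assumes a small dictionary of per-cell expression values and uses the population standard deviation; the gene names, the values and the choice of population rather than sample standard deviation are assumptions for illustration, not details taken from the paper.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional


def coefficient_of_variation(values: List[float]) -> Optional[float]:
    """CV = standard deviation / mean; undefined (None) when the mean is zero."""
    m = mean(values)
    if m == 0:
        return None
    return pstdev(values) / m


# Toy expression matrix: gene -> expression in four single cells.
expression: Dict[str, List[float]] = {
    "GeneA": [10.0, 10.0, 10.0, 10.0],  # uniformly expressed
    "GeneB": [0.0, 0.0, 0.0, 8.0],      # expressed in one cell only
    "GeneC": [2.0, 4.0, 4.0, 6.0],      # moderately variable
}

cv_by_gene = {gene: coefficient_of_variation(v) for gene, v in expression.items()}

# Rank genes from most to least variable, skipping undefined CVs.
ranked = sorted(
    (g for g, cv in cv_by_gene.items() if cv is not None),
    key=lambda g: cv_by_gene[g],
    reverse=True,
)
```

Here the gene detected in a single cell ranks as the most variable, while the uniformly expressed gene has a CV of zero.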

    Variable reproducibility in genome-scale public data: A case study using ENCODE ChIP sequencing resource

    Genome-wide data is accumulating in an unprecedented way in the public domain. Re-mining this data shows great potential to generate novel hypotheses. However this approach is dependent on the quality (technical and biological) of the underlying data. Here we performed a systematic analysis of chromatin immunoprecipitation (ChIP) sequencing data of transcription and epigenetic factors from the encyclopaedia of DNA elements (ENCODE) resource to demonstrate that about one third of conditions with replicates show low concordance between replicate peak lists. This serves as a case study to demonstrate a caveat concerning genome-wide analyses and highlights a need to validate the quality of each sample before performing further associative analyses
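
One simple way to quantify concordance between replicate peak lists is the fraction of peaks in each replicate that overlap a peak in the other. The sketch below is only an illustration of that idea: the symmetric minimum used as the score, the function names and the toy peaks are assumptions, and the abstract does not state which concordance measure or cutoff the study applied.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(peaks_a: List[Interval], peaks_b: List[Interval]) -> float:
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b."""
    if not peaks_a:
        return 0.0
    hits = sum(1 for a in peaks_a if any(overlaps(a, b) for b in peaks_b))
    return hits / len(peaks_a)


def replicate_concordance(rep1: List[Interval], rep2: List[Interval]) -> float:
    """Conservative score: the smaller of the two directional overlap fractions."""
    return min(fraction_overlapping(rep1, rep2), fraction_overlapping(rep2, rep1))


# Toy replicates: rep1 has four peaks, rep2 recovers only two of them.
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150), ("chr3", 10, 20)]
rep2 = [("chr1", 150, 250), ("chr2", 100, 120)]

score = replicate_concordance(rep1, rep2)
```

Taking the minimum of the two directions penalises a replicate pair in which one list is much longer than the other, which a one-directional overlap fraction would hide.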

    Dynamics of promoter bivalency and RNAP II pausing in mouse stem and differentiated cells

    Mammalian embryonic stem cells display a unique epigenetic and transcriptional state to facilitate pluripotency by maintaining lineage-specification genes in a poised state. Two epigenetic and transcription processes involved in maintaining poised state are bivalent chromatin, characterized by the simultaneous presence of activating and repressive histone methylation marks, and RNA polymerase II (RNAPII) promoter proximal pausing. However, the dynamics of histone modifications and RNAPII at promoters in diverse cellular contexts remains underexplored. We collected genome wide data for bivalent chromatin marks H3K4me3 and H3K27me3, and RNAPII (8WG16) occupancy together with expression profiling in eight different cell types, including ESCs, in mouse. The epigenetic and transcription profiles at promoters grouped in over thirty clusters with distinct functional identities and transcription control. The clustering analysis identified distinct bivalent clusters where genes in one cluster retained bivalency across cell types while in the other were mostly cell type specific, but neither showed a high RNAPII pausing. We noted that RNAPII pausing is more associated with active genes than bivalent genes in a cell type, and was globally reduced in differentiated cell types compared to multipotent

    Investigating resistance in clinical Mycobacterium tuberculosis complex isolates with genomic and phenotypic antimicrobial susceptibility testing: a multicentre observational study.

    BACKGROUND: Whole-genome sequencing (WGS) of Mycobacterium tuberculosis complex has become an important tool in diagnosis and management of drug-resistant tuberculosis. However, data correlating resistance genotype with quantitative phenotypic antimicrobial susceptibility testing (AST) are scarce. METHODS: In a prospective multicentre observational study, 900 clinical M tuberculosis complex isolates were collected from adults with drug-resistant tuberculosis in five high-endemic tuberculosis settings around the world (Georgia, Moldova, Peru, South Africa, and Viet Nam) between Dec 5, 2014, and Dec 12, 2017. Minimum inhibitory concentrations (MICs) and resulting binary phenotypic AST results for up to nine antituberculosis drugs were determined and correlated with resistance-conferring mutations identified by WGS. FINDINGS: Considering WHO-endorsed critical concentrations as reference, WGS had high accuracy for prediction of resistance to isoniazid (sensitivity 98·8% [95% CI 98·5-99·0]; specificity 96·6% [95% CI 95·2-97·9]), levofloxacin (sensitivity 94·8% [93·3-97·6]; specificity 97·1% [96·7-97·6]), kanamycin (sensitivity 96·1% [95·4-96·8]; specificity 95·0% [94·4-95·7]), amikacin (sensitivity 97·2% [96·4-98·1]; specificity 98·6% [98·3-98·9]), and capreomycin (sensitivity 93·1% [90·0-96·3]; specificity 98·3% [98·0-98·7]). For rifampicin, pyrazinamide, and ethambutol, the specificity of resistance prediction was suboptimal (64·0% [61·0-67·1], 83·8% [81·0-86·5], and 40·1% [37·4-42·9], respectively). Specificity for rifampicin increased to 83·9% when borderline mutations with MICs overlapping with the critical concentration were excluded. Consequently, we highlighted mutations in M tuberculosis complex isolates that are often falsely identified as susceptible by phenotypic AST, and we identified potential novel resistance-conferring mutations. 
INTERPRETATION: The combined analysis of mutations and quantitative phenotypes shows the potential of WGS to produce a refined interpretation of resistance, which is needed for individualised therapy, and eventually could allow differential drug dosing. However, variability of MIC data for some M tuberculosis complex isolates carrying identical mutations also reveals limitations of our understanding of the genotype and phenotype relationships (eg, including epistasis and strain genetic background). FUNDING: Bill & Melinda Gates Foundation, German Centre for Infection Research, German Research Foundation, Excellence Cluster Precision Medicine of Inflammation (EXC 2167), and Leibniz ScienceCampus EvoLUNG
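
The sensitivity and specificity figures reported above compare a WGS-based resistance prediction against the phenotypic AST result taken as reference. A minimal sketch of that calculation follows, assuming one boolean per isolate for each method. The isolate data are invented for illustration, and the confidence intervals quoted in the abstract are not reproduced because the interval method is not stated here.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    predicted_resistant: Sequence[bool], reference_resistant: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a prediction against a reference."""
    if len(predicted_resistant) != len(reference_resistant):
        raise ValueError("both sequences must describe the same isolates")
    tp = fn = tn = fp = 0
    for predicted, reference in zip(predicted_resistant, reference_resistant):
        if reference and predicted:
            tp += 1  # resistant by both methods
        elif reference and not predicted:
            fn += 1  # phenotypically resistant, missed by the prediction
        elif not reference and predicted:
            fp += 1  # predicted resistant, phenotypically susceptible
        else:
            tn += 1  # susceptible by both methods
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one resistant and one susceptible isolate")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical results for eight isolates and one drug.
wgs_prediction = [True, True, True, False, False, False, True, False]
phenotypic_ast = [True, True, True, True, False, False, False, False]

sensitivity, specificity = sensitivity_specificity(wgs_prediction, phenotypic_ast)
```

Under this convention, a resistance mutation in an isolate that tests susceptible counts as a false positive and lowers specificity, which is how borderline mutations can depress the rifampicin specificity described in the findings.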

    Genome-wide positioning of bivalent mononucleosomes

    BACKGROUND: Bivalent chromatin refers to overlapping regions containing activating histone H3 Lys4 trimethylation (H3K4me3) and inactivating H3K27me3 marks. Existence of such bivalent marks on the same nucleosome has only recently been suggested. Previous genome-wide efforts to characterize bivalent chromatin have focused primarily on individual marks to define overlapping zones of bivalency rather than mapping positions of truly bivalent mononucleosomes. RESULTS: Here, we developed an efficacious sequential ChIP technique for examining global positioning of individual bivalent nucleosomes. Using next generation sequencing approaches we show that although individual H3K4me3 and H3K27me3 marks overlap in broad zones, bivalent nucleosomes are focally enriched in the vicinity of the transcription start site (TSS). These seem to occupy the H2A.Z nucleosome positions previously described as salt-labile nucleosomes, and are correlated with low gene expression. Although the enrichment profiles of bivalent nucleosomes show a clear dependency on CpG island content, they demonstrate a stark anti-correlation with methylation status. CONCLUSIONS: We show that regional overlap of H3K4me3 and H3K27me3 chromatin tends to be upstream of the TSS, while bivalent nucleosomes with both marks are mainly promoter proximal near the TSS of CpG island-containing genes with poised/low expression. We discuss the implications of the focal enrichment of bivalent nucleosomes around the TSS on the poised chromatin state of promoters in stem cells. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12920-016-0221-6) contains supplementary material, which is available to authorized users.