13 research outputs found

    Retrotransposons are specified as DNA replication origins in the gene-poor regions of Arabidopsis heterochromatin

    Genomic stability depends on faithful genome replication. This is achieved by the concerted activity of thousands of DNA replication origins (ORIs) scattered throughout the genome. The DNA and chromatin features determining ORI specification are not presently known. We have generated a high-resolution genome-wide map of 3230 ORIs in cultured Arabidopsis thaliana cells. Here, we focused on defining the features associated with ORIs in heterochromatin. In pericentromeric gene-poor domains, ORIs associate almost exclusively with the retrotransposon class of transposable elements (TEs), in particular of the Gypsy family. ORI activity in retrotransposons occurs independently of TE expression and while maintaining high levels of H3K9me2 and H3K27me1, typical marks of repressed heterochromatin. ORI-TEs largely colocalize with chromatin signatures defining GC-rich heterochromatin. Importantly, TEs with active ORIs have a higher local GC content than TEs lacking them. Our results lead us to conclude that ORI colocalization with retrotransposons is determined by their transposition mechanism based on transcription, and a specific chromatin landscape. Our detailed analysis of ORIs responsible for heterochromatin replication has implications for the mechanisms of ORI specification in other multicellular organisms in which retrotransposons are major components of heterochromatin and of the entire genome.

    histoneHMM: Differential analysis of histone modifications with broad genomic footprints

    BACKGROUND: ChIP-seq has become a routine method for interrogating the genome-wide distribution of various histone modifications. An important experimental goal is to compare the ChIP-seq profiles between an experimental sample and a reference sample, and to identify regions that show differential enrichment. However, comparative analysis of samples remains challenging for histone modifications with broad domains, such as heterochromatin-associated H3K27me3, as most ChIP-seq algorithms are designed to detect well-defined peak-like features. RESULTS: To address this limitation we introduce histoneHMM, a powerful bivariate Hidden Markov Model for the differential analysis of histone modifications with broad genomic footprints. histoneHMM aggregates short reads over larger regions and takes the resulting bivariate read counts as inputs for an unsupervised classification procedure, requiring no further tuning parameters. histoneHMM outputs probabilistic classifications of genomic regions as being either modified in both samples, unmodified in both samples or differentially modified between samples. We extensively tested histoneHMM in the context of two broad repressive marks, H3K27me3 and H3K9me3, and evaluated region calls with follow-up qPCR as well as RNA-seq data. Our results show that histoneHMM outperforms competing methods in detecting functionally relevant differentially modified regions. CONCLUSION: histoneHMM is a fast algorithm written in C++ and compiled as an R package. It runs in the popular R computing environment and thus seamlessly integrates with the extensive bioinformatic tool sets available through Bioconductor. This makes histoneHMM an attractive choice for the differential analysis of ChIP-seq data. Software is available from http://histonehmm.molgen.mpg.de
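The aggregation step described in the abstract above (pooling short reads over larger regions to obtain one pair of counts per region) can be illustrated with a minimal Python sketch. This is not histoneHMM's code or API: the function names, the fixed bin width, and the toy read positions are all assumptions made for illustration, and the Hidden Markov Model classification itself is not shown.

```python
def bin_counts(read_starts, bin_size, n_bins):
    """Count reads per fixed-width bin, given 0-based read start positions."""
    counts = [0] * n_bins
    for pos in read_starts:
        idx = pos // bin_size
        if 0 <= idx < n_bins:  # ignore reads outside the region
            counts[idx] += 1
    return counts


def bivariate_counts(reads_a, reads_b, bin_size, region_length):
    """Pair up per-bin counts from two samples: one (a, b) tuple per bin."""
    n_bins = -(-region_length // bin_size)  # ceiling division
    a = bin_counts(reads_a, bin_size, n_bins)
    b = bin_counts(reads_b, bin_size, n_bins)
    return list(zip(a, b))


# Hypothetical read start positions for an experimental and a reference sample.
sample = [10, 950, 1200, 1999, 2500]
reference = [5, 15, 2100, 2900]

# Hypothetical 1 kb bins over a 3 kb region.
pairs = bivariate_counts(sample, reference, bin_size=1000, region_length=3000)
print(pairs)  # [(2, 2), (2, 0), (1, 2)]
```

Each tuple is the kind of bivariate observation a two-sample classifier would consume: similar counts in both samples suggest a shared state, while a pair such as (2, 0) is a candidate for differential modification.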

    CHIP_QC, COMPUTATIONAL PLATFORM FOR MULTIVARIATE EPIGENETIC STUDIES AND ITS APPLICATION IN UNCOVERING ROLE OF POLYCOMB DEPENDENT METHYLATIONS STATES

    ABSTRACT During my PhD, I developed a user-friendly, cross-platform system for analyzing epigenomic data and used it to study the role of the Polycomb Repressive Complex 2 (PRC2) in genome regulation. Current trends in epigenetics research show that high-throughput sequencing is becoming easier and that interest in genome-wide epigenomic studies is growing. As a result, epigenetic data such as chromatin immunoprecipitation followed by sequencing (ChIP-seq) and RNA sequencing (RNA-seq) are accumulating exponentially in the public domain. This creates an opportunity for crowdsourcing and for exploring data beyond the boundaries of specific query-centered studies. Such data must first undergo standard primary analysis, which the scientific community has stabilized through multiple programs. Further downstream, genome-wide comparative, correlative and quantitative studies, among many others, have proven critical in deciphering key biological features. For such studies we lack platforms capable of handling, analyzing and linking multiple interdisciplinary (ChIP-seq/RNA-seq) datasets with efficient analytical methods. With this aim we developed ChIP_QC, a user-friendly standalone computational program that supports numerous datasets of high or moderate sequencing depth for genome-wide analysis. First, using ENCODE project data (ENCODE Project Consortium, 2012), we illustrated a few applications of the program by posing different biological scenarios, showing how easily some known observations can be verified and how the program can help deduce novel ones. Second, we were interested in understanding the function of the products generated through the catalytic activity of PRC2.
    Lysine 27 of histone H3 (H3K27) is known to undergo post-translational modification (PTM), and methylation is one dominant PTM. H3K27 can be mono-, di- or trimethylated. Of these three forms, trimethylation of H3K27 (H3K27me3) is well demonstrated to be PRC2 dependent and its role in gene repression is well characterized, but the functional roles of the other forms of H3K27 methylation remain poorly characterized. To address this, we used mouse embryonic stem cells (mESC) as a model system and provided an extensive characterization of the other forms of methylation, highlighting their differential deposition along the genome, their fundamental role in transcriptional regulation, and their indispensability during the differentiation program. Using ChIP_QC and other computational methods together with experimental evidence, our data demonstrated that monomethylation of Lys27 (H3K27me1) is required for correct transcription of genes and positively correlates with trimethylated Lys36 (H3K36me3); on the other hand, dimethylated Lys27 (H3K27me2), which we identified as the principal activity of PRC2, prevents firing of non-cell-type-specific enhancers.

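    Développement d'une librairie de code et d'outils bio-informatiques facilitant l'analyse de grandes quantités de données génomiques [Development of a code library and bioinformatics tools to facilitate the analysis of large amounts of genomic data]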

    Thesis describing the development of specialized tools that facilitate the analysis of large amounts of data from high-throughput sequencing technologies.