
    DGW: an exploratory data analysis tool for clustering and visualisation of epigenomic marks

    Background: Functional genomic and epigenomic research relies fundamentally on sequencing-based methods such as ChIP-seq for the detection of DNA-protein interactions. These techniques return large, high-dimensional data sets with visually complex structures, such as multi-modal peaks extended over large genomic regions. Current tools for visualisation and data exploration represent and leverage these complex features only to a limited extent.
    Results: We present DGW, an open source software package for simultaneous alignment and clustering of multiple epigenomic marks. DGW uses Dynamic Time Warping to adaptively rescale and align genomic distances, which allows regions of interest with similar shapes to be grouped, thereby capturing the structure of epigenomic marks. We demonstrate the effectiveness of the approach in a simulation study and on a real epigenomic data set from the ENCODE project.
    Conclusions: Our results show that DGW automatically recognises and aligns important genomic features such as transcription start sites and splicing sites from histone marks. DGW is available as an open source Python package.
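
The abstract above names Dynamic Time Warping as the way regions of different widths are rescaled and aligned. As a rough illustration of that idea only, here is a minimal textbook DTW distance in plain Python. It is not DGW's implementation: the function name `dtw_distance`, the absolute-difference local cost, and the toy signal profiles are all assumptions made for this sketch.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D profiles.

    Hypothetical helper for illustration; the local cost (absolute
    difference) is an assumption, not taken from DGW.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: stretch a, stretch b, or advance both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Toy profiles (made up): the same peak shape at two different widths,
# and a profile with a different shape.
narrow = [0, 1, 3, 1, 0]
wide = [0, 1, 1, 3, 3, 1, 1, 0]
different = [3, 1, 0, 0, 0]

same_shape = dtw_distance(narrow, wide)
other_shape = dtw_distance(narrow, different)
```

In this toy case the stretched peak aligns to the narrow one at no cost, while the differently shaped profile does not, which is the property that lets a DTW-based distance group regions by shape rather than by width.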

    Genome-Wide Distribution of RNA-DNA Hybrids Identifies RNase H Targets in tRNA Genes, Retrotransposons and Mitochondria

    During transcription, the nascent RNA can invade the DNA template, forming extended RNA-DNA duplexes (R-loops). Here we employ ChIP-seq in strains expressing or lacking RNase H to map targets of RNase H activity throughout the budding yeast genome. In wild-type strains, R-loops were readily detected over the 35S rDNA region, transcribed by Pol I, and over the 5S rDNA, transcribed by Pol III. In strains lacking RNase H activity, R-loops were elevated over other Pol III genes, notably tRNAs, SCR1 and U6 snRNA, and were also associated with the cDNAs of endogenous TY1 retrotransposons, which showed increased rates of mobility to the 5'-flanking regions of tRNA genes. Unexpectedly, R-loops were also associated with mitochondrial genes in the absence of RNase H1, but not of RNase H2. Finally, R-loops were detected on actively transcribed protein-coding genes in the wild type, particularly over the second exon of spliced ribosomal protein genes.

    Genome-wide identification of zero nucleotide recursive splicing in Drosophila

    Recursive splicing is a process in which large introns are removed in multiple steps by resplicing at ratchet points, which are 5′ splice sites recreated after splicing (1). Recursive splicing was first identified in the Drosophila Ultrabithorax (Ubx) gene (1), and only three additional Drosophila genes have since been experimentally shown to undergo recursive splicing (2,3). Here, we identify 197 zero nucleotide exon ratchet points in 130 introns of 115 Drosophila genes from total RNA sequencing data generated from developmental time points, dissected tissues, and cultured cells. The sequential nature of recursive splicing was confirmed by identification of lariat introns generated by splicing to and from the ratchet points. We also show that recursive splicing is a constitutive process, that depletion of U2AF inhibits recursive splicing, and that the sequence and function of ratchet points are evolutionarily conserved in Drosophila. Finally, we identified four recursively spliced human genes, one of which is also recursively spliced in Drosophila. Together, these results indicate that recursive splicing is commonly used in Drosophila and occurs in humans, and they provide insight into the mechanisms by which some large introns are removed.