9 research outputs found

    Development of computational approaches for the analysis of bisulfite next-generation sequencing data

    The scientific contribution of this thesis consists of three articles: two have been published, in Bioinformatics (Oxford Journals) and Nature Methods, respectively, and the third is under review at Leukemia (Nature Publishing Group). The implications of these articles for the field of computational epigenetics and future perspectives of this research area are discussed. The main challenge within the framework of this thesis was the development of a bioinformatics tool for bisulfite sequencing analysis. The article in Bioinformatics presents the bioinformatics tool B-SOLANA for the analysis of DNA methylation data generated by two-base encoding bisulfite sequencing on the SOLiD platform of Life Technologies. Benchmark analyses revealed that B-SOLANA exhibits significantly higher sensitivity and specificity than other software approaches developed at the same time. The review article in Nature Methods summarizes the challenges of bisulfite sequencing analysis as they appear on different high-throughput sequencing platforms. It focuses in particular on primary analyses, including quality control and mapping of raw sequences. Furthermore, the article discusses the effect of sequencing errors and contamination on inferred DNA methylation levels and recommends the most appropriate way to analyze this type of data. This review is a helpful reference for the analysis of DNA methylation by high-throughput sequencing, a rapidly developing research area. The third article, which has been submitted to Leukemia, comprises the analysis of a DNA methylome of the DAUDI cell line at single base resolution. On the genetic level, this endemic Burkitt lymphoma cell line is characterized by the presence of the hallmark IG-MYC translocation. Recent publications about this cell line suggested a high number of DNA methylation changes. However, until now only array-based studies have been published, which focused on locus-specific DNA methylation patterns. We showed that the mechanisms of DNA methylation associated with transcriptional regulation in lymphomas go far beyond the usually studied promoter methylation. Furthermore, we characterized the DNA methylome of the mitochondria and of the Epstein-Barr virus; upregulation of the latter has already been identified in DAUDI. As the DAUDI cell line has been used for decades in many laboratories throughout the world, the obtained methylome data prove valuable as a "reference epigenome" for future studies.

    B-SOLANA: an approach for the analysis of two-base encoding bisulfite sequencing data

    Summary: Bisulfite sequencing, a combination of bisulfite treatment and high-throughput sequencing, has proved to be a valuable method for measuring DNA methylation at single base resolution. Here, we present B-SOLANA, an approach for the analysis of two-base encoding (colorspace) bisulfite sequencing data on the SOLiD platform of Life Technologies. It includes the alignment of bisulfite sequences and the determination of methylation levels in CpG as well as non-CpG sequence contexts. B-SOLANA enables a fast and accurate analysis of large raw sequence datasets.

    A tissue-specific landscape of sense/antisense transcription in the mouse intestine

    Background: The intestinal mucosa is characterized by complex metabolic and immunological processes driven by highly dynamic gene expression programs. With the advent of next-generation sequencing and its use for the analysis of the RNA sequence space, the level of detail on the global architecture of the transcriptome has reached a new order of magnitude compared to microarrays. Results: We report the ultra-deep characterization of the polyadenylated transcriptome in two closely related, yet distinct regions of the mouse intestinal tract (small intestine and colon). We assessed tissue-specific transcriptome architecture and the presence of novel transcriptionally active regions (nTARs). In the first step, signatures of 20,541 NCBI RefSeq transcripts could be identified in the intestine (74.1% of annotated genes), of which 16,742 are common to both tissues. Although the majority of reads could be linked to annotated genes, 27,543 nTARs not consistent with current gene annotations in RefSeq or ENSEMBL were identified. Using a second, independent strand-specific RNA-Seq protocol, 20,966 of these nTARs were confirmed, most of them in the vicinity of known genes. We further categorized our findings by their relative adjacency to described exonic elements and investigated regional differences of novel transcribed elements in small intestine and colon. Conclusions: The current study demonstrates the complexity of an archetypal mammalian intestinal mRNA transcriptome in high resolution and identifies novel transcriptionally active regions at strand-specific, single base resolution. Our analysis shows, for the first time, a strand-specific comparative picture of nTARs in two tissues and represents a resource for further investigating the transcriptional processes that contribute to tissue identity.