15 research outputs found

    An optimized algorithm for detecting and annotating regional differential methylation

    Background: DNA methylation profiling reveals important differentially methylated regions (DMRs) of the genome that are altered during development or that are perturbed by disease. To date, few programs exist for regional analysis of enriched or whole-genome bisulfite conversion sequencing data, even though such data are increasingly common. Here, we describe an open-source, optimized method for determining empirically based DMRs (eDMR) from high-throughput sequence data that is applicable to enriched whole-genome methylation profiling datasets, as well as other globally enriched epigenetic modification data. Results: Here we show that our bimodal distribution model and weighted cost function for optimized regional methylation analysis provide accurate boundaries of regions harboring significant epigenetic modifications. Our algorithm takes the spatial distribution of CpGs into account for the enrichment assay, allowing for optimization of the definition of empirical regions for differential methylation. Combined with the dependent adjustment for regional p-value combination and DMR annotation, we provide a method that may be applied to a variety of datasets for rapid DMR analysis. Our method classifies both the directionality of DMRs and their genome-wide distribution, and we have observed that it shows clinical relevance through correct stratification of two Acute Myeloid Leukemia (AML) tumor sub-types. Conclusions: Our weighted optimization algorithm eDMR for calling DMRs extends an established DMR R pipeline (methylKit) and provides a needed resource in epigenomics. Our method enables an accurate and scalable way of finding DMRs in high-throughput methylation sequencing experiments. eDMR is available for download at http://code.google.com/p/edmr/.
    Authors: Sheng Li, Francine E Garrett-Bakelman, Altuna Akalin, Paul Zumbo, Ross Levine, Bik L To, Ian D Lewis, Anna L Brown, Richard J D’Andrea, Ari Melnick, Christopher E Maso
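
The abstract above mentions a dependent adjustment for regional p-value combination without spelling it out. As a rough sketch only, and not necessarily eDMR's actual procedure, one common way to combine per-CpG p-values within a region is Stouffer's Z-score method with a correlation-aware variance. The function name, the use of one-sided p-values, and the correlation matrix are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def combine_pvalues_stouffer(pvalues, correlation=None):
    """Combine one-sided p-values into one regional p-value (Stouffer's Z).

    `correlation` is an optional n x n matrix of assumed correlations between
    the per-CpG test statistics. If omitted, the tests are treated as
    independent. This is a hypothetical helper, not part of eDMR or methylKit.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p < 1.0 for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")

    # Convert each p-value to a z-score (small p -> large positive z).
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in pvalues]
    n = len(z_scores)

    # Variance of the sum of z-scores: n if independent, otherwise the sum
    # of every entry of the correlation matrix.
    if correlation is None:
        variance = float(n)
    else:
        variance = sum(correlation[i][j] for i in range(n) for j in range(n))

    z_combined = sum(z_scores) / sqrt(variance)
    return 1.0 - _STD_NORMAL.cdf(z_combined)


# Two neighbouring CpGs, each with p = 0.05 (made-up values).
independent = combine_pvalues_stouffer([0.05, 0.05])
fully_correlated = combine_pvalues_stouffer(
    [0.05, 0.05], correlation=[[1.0, 1.0], [1.0, 1.0]]
)
```

Treating the two sites as independent yields a smaller combined p-value than treating them as perfectly correlated, which is why ignoring dependence between nearby CpGs can overstate regional significance.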

    Next-generation sequencing methylation profiling of subjects with obesity identifies novel gene changes

    Background: Obesity is a metabolic disease caused by environmental and genetic factors. However, the epigenetic mechanisms of obesity are incompletely understood. The aim of our study was to investigate the role of skeletal muscle DNA methylation in combination with transcriptomic changes in obesity. Results: Muscle biopsies were obtained basally from lean (n = 12; BMI = 23.4 ± 0.7 kg/m²) and obese (n = 10; BMI = 32.9 ± 0.7 kg/m²) participants in combination with euglycemic-hyperinsulinemic clamps to assess insulin sensitivity. We performed reduced representation bisulfite sequencing (RRBS) next-generation methylation and microarray analyses on DNA and RNA isolated from vastus lateralis muscle biopsies. There were 13,130 differentially methylated cytosines (DMC; uncorrected P < 0.05) that were altered in the promoter and untranslated (5' and 3' UTR) regions in the obese versus lean analysis. Microarray analysis revealed 99 probes that were significantly (corrected P < 0.05) altered. Of these, 12 genes (encompassing 22 methylation sites) demonstrated a negative relationship between gene expression and DNA methylation. Specifically, sorbin and SH3 domain containing 3 (SORBS3), which codes for the adapter protein vinexin, was significantly decreased in gene expression (fold change −1.9) and had nine DMCs that were significantly increased in methylation in obesity (methylation differences ranged from 5.0 to 24.4%). Moreover, differentially methylated region (DMR) analysis identified a region in the 5' UTR (Chr.8:22,423,530–22,423,569) of SORBS3 that was increased in methylation by 11.2% in the obese group. The negative relationship observed between DNA methylation and gene expression for SORBS3 was validated by a site-specific sequencing approach, pyrosequencing, and qRT-PCR.
Additionally, we performed transcription factor binding analysis and identified a number of transcription factors whose binding to the differentially methylated sites or region may contribute to obesity. Conclusions: These results demonstrate that obesity alters the epigenome through DNA methylation and highlight novel transcriptomic changes in SORBS3 in skeletal muscle. The electronic version of this article is the complete one and can be found online at: https://clinicalepigeneticsjournal.biomedcentral.com/articles/10.1186/s13148-016-0246-
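
To make the reported methylation differences concrete, here is a minimal sketch of how a percent-methylation difference between two groups might be computed from bisulfite read counts. It assumes the differences are percentage-point differences in pooled methylation, which the abstract does not state explicitly. The function names and read counts are hypothetical and are not taken from the study.

```python
def percent_methylation(methylated_reads, total_reads):
    """Percent of reads supporting a methylated cytosine at a site or region."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= methylated_reads <= total_reads:
        raise ValueError("methylated_reads must be between 0 and total_reads")
    return 100.0 * methylated_reads / total_reads


def methylation_difference(case_counts, control_counts):
    """Percentage-point difference (case minus control).

    Each argument is a (methylated_reads, total_reads) tuple pooled over the
    samples in that group. Positive values mean higher methylation in cases.
    """
    return percent_methylation(*case_counts) - percent_methylation(*control_counts)


# Hypothetical pooled counts for one region: obese (case) vs lean (control).
difference = methylation_difference(case_counts=(56, 100), control_counts=(45, 100))
```

A positive result corresponds to increased methylation in the case group. A real analysis would also test significance and correct for multiple comparisons, which this sketch omits.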

    Maximizing ecological and evolutionary insight in bisulfite sequencing data sets

    Genome-scale bisulfite sequencing approaches have opened the door to ecological and evolutionary studies of DNA methylation in many organisms. These approaches can be powerful. However, they introduce new methodological and statistical considerations, some of which are particularly relevant to non-model systems. Here, we highlight how these considerations influence a study’s power to link methylation variation with a predictor variable of interest. Relative to current practice, we argue that sample sizes will need to increase to provide robust insights. We also provide recommendations for overcoming common challenges and an R Shiny app to aid in study design

    Strategies for analyzing bisulfite sequencing data

    DNA methylation is one of the main epigenetic modifications in the eukaryotic genome and has been shown to play a role in cell-type specific regulation of gene expression, and therefore cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation over the genomes of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing. We introduce specialized short-read alignment techniques as well as pre/post-alignment quality check methods to ensure data quality. Furthermore, we discuss subsequent analysis steps after alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of regions returned by segmentation or differential methylation methods. Lastly, we review software packages that implement strategies to efficiently deal with large bisulfite sequencing datasets locally and also discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis

    Strategies for analyzing bisulfite sequencing data

    DNA methylation is one of the main epigenetic modifications in the eukaryotic genome; it has been shown to play a role in cell-type specific regulation of gene expression, and therefore cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation over the genomes of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing. We introduce specialized short-read alignment techniques as well as pre/post-alignment quality check methods to ensure data quality. Furthermore, we discuss subsequent analysis steps after alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of regions returned by segmentation and differential methylation methods. Finally, we review software packages that implement strategies to efficiently deal with large bisulfite sequencing datasets locally and we discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis

    DiffVar: a new method for detecting differential variability with application to methylation in cancer and aging

    Methylation of DNA is known to be essential to development and dramatically altered in cancers. The Illumina HumanMethylation450 BeadChip has been used extensively as a cost-effective way to profile nearly half a million CpG sites across the human genome. Here we present DiffVar, a novel method to test for differential variability between sample groups. DiffVar employs an empirical Bayes model framework that can take into account any experimental design and is robust to outliers. We applied DiffVar to several datasets from The Cancer Genome Atlas, as well as an aging dataset. DiffVar is available in the missMethyl Bioconductor R package
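
The idea of testing for differential variability (rather than a difference in means) can be illustrated with a simplified Levene-style comparison of absolute deviations. This sketch is an assumption-laden illustration: it omits the empirical Bayes moderation that the abstract describes, is not the missMethyl implementation, and uses made-up methylation values for a single CpG site.

```python
from math import sqrt
from statistics import mean, variance


def abs_deviations(values):
    """Absolute deviation of each observation from its group mean."""
    group_mean = mean(values)
    return [abs(v - group_mean) for v in values]


def levene_style_t(group_a, group_b):
    """Welch-type t-statistic comparing the spread of two groups.

    Positive values indicate that group_b is more variable than group_a.
    Hypothetical helper: no empirical Bayes shrinkage is applied here.
    """
    dev_a = abs_deviations(group_a)
    dev_b = abs_deviations(group_b)
    std_error = sqrt(variance(dev_a) / len(dev_a) + variance(dev_b) / len(dev_b))
    return (mean(dev_b) - mean(dev_a)) / std_error


# Made-up methylation proportions at one CpG: similar means, different spread.
normal_samples = [0.50, 0.52, 0.48, 0.51, 0.49]
tumour_samples = [0.20, 0.80, 0.35, 0.65, 0.50]
t_stat = levene_style_t(normal_samples, tumour_samples)
```

Both groups here have a mean near 0.5, so a test on means would find little, while the statistic on absolute deviations is clearly positive.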

    Investigation of DNA Methylation in Obesity and its Underlying Insulin Resistance

    Obesity and its underlying insulin resistance are caused by environmental and genetic factors. DNA methylation provides a mechanism by which environmental factors can regulate transcriptional activity. The overall goal of the work herein was to (1) identify alterations in DNA methylation in human skeletal muscle with obesity and its underlying insulin resistance, (2) determine whether these changes in methylation can be altered through weight loss induced by bariatric surgery, and (3) identify DNA methylation biomarkers in whole blood that can be used as a surrogate for skeletal muscle. Assessment of DNA methylation was performed on human skeletal muscle and blood using reduced representation bisulfite sequencing (RRBS) for high-throughput identification and pyrosequencing for site-specific confirmation. Sorbin and SH3 homology domain 3 (SORBS3) was identified in skeletal muscle to be increased in methylation (+5.0 to +24.4%) in the promoter and 5' untranslated region (UTR) in the obese participants (n = 10) compared to lean (n = 12), and this finding corresponded with a decrease in gene expression (fold change: -1.9, P = 0.0001). Furthermore, SORBS3 was demonstrated in a separate cohort of morbidly obese participants (n = 7) undergoing weight loss induced by surgery to decrease in methylation (-5.6 to -24.2%) and increase in gene expression (fold change: +1.7; P = 0.05) post-surgery. Moreover, SORBS3 promoter methylation was demonstrated in vitro to inhibit transcriptional activity (P = 0.000003). The methylation and transcriptional changes for SORBS3 were significantly (P ≤ 0.05) correlated with obesity measures and fasting insulin levels. SORBS3 was not identified in the blood methylation analysis of lean (n = 10) and obese (n = 10) participants, suggesting that it is a muscle-specific marker.
However, solute carrier family 19 member 1 (SLC19A1) was identified in blood and skeletal muscle to have decreased 5' UTR methylation in obese participants, and this was significantly (P ≤ 0.05) predicted by insulin sensitivity. These findings suggest SLC19A1 as a potential blood-based biomarker for obese, insulin resistant states. The collective findings of SORBS3 DNA methylation and gene expression present an exciting novel target in skeletal muscle for further understanding obesity and its underlying insulin resistance. Moreover, the dynamic changes to SORBS3 in response to metabolic improvements and weight-loss induced by surgery. Dissertation/Thesis. Doctoral Dissertation, Biology 201

    Applied Data Science Methods in Epitranscriptomic Bioinformatics

    Chemical modifications on messenger RNA have recently been shown to function as an essential layer of gene expression regulation. Molecular biologists from different laboratories have conducted more than 200 sets of high-throughput sequencing (HTS) experiments to capture the types and locations of messenger RNA modifications across multiple cell types and species. However, to date, the field still lacks a bioinformatics pipeline that consistently quantifies and analyzes the epitranscriptomic HTS data generated by different laboratories. This thesis aims to provide an overview of the questions and challenges that have arisen in the computational analysis of mRNA modifications. We then present a set of practical computational strategies for data exploration, genomic data mining, modification-level quantification, and technical artifact correction from a data science perspective. The first chapter provides an in-depth exploration and visualization of m5C mRNA modification from bisulfite sequencing data. The second chapter documents the database construction and data consistency exploration for the transcriptomic targets of mRNA-modification-related protein regulators. It also presents a methodological framework for the computational representation of domain knowledge related to the transcriptomic topology of epitranscriptomic modifications. The final section discusses the dominant technical biases in MeRIP-Seq, the most widely applied type of HTS data in epitranscriptomics, and follows with a practical computational pipeline to overcome the technical error