methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles
DNA methylation is a chemical modification of cytosine bases that is pivotal for gene regulation, cellular specification and cancer development. Here, we describe an R package, methylKit, that rapidly analyzes genome-wide cytosine epigenetic profiles from high-throughput methylation and hydroxymethylation sequencing experiments. methylKit includes functions for clustering, sample quality visualization, differential methylation analysis and annotation features, thus automating and simplifying many of the steps for discerning statistically significant bases or regions of DNA methylation. Finally, we demonstrate methylKit on breast cancer data, in which we find statistically significant regions of differential methylation and stratify tumor subtypes. methylKit is available at http://code.google.com/p/methylkit
Aberration in DNA Methylation in B-Cell Lymphomas Has a Complex Origin and Increases with Disease Severity
Despite mounting evidence that epigenetic abnormalities play a key role in cancer biology, their contributions to the malignant phenotype remain poorly understood. Here we studied genome-wide DNA methylation in normal B-cell populations and subtypes of B-cell non-Hodgkin lymphoma: follicular lymphoma and diffuse large B-cell lymphomas. These lymphomas display striking and progressive intra-tumor heterogeneity and also inter-patient heterogeneity in their cytosine methylation patterns. Epigenetic heterogeneity is initiated in normal germinal center B-cells, increases markedly with disease aggressiveness, and is associated with unfavorable clinical outcome. Moreover, patterns of abnormal methylation vary depending upon chromosomal regions, gene density and the status of neighboring genes. DNA methylation abnormalities arise via two distinct processes: i) lymphomagenic transcriptional regulators perturb promoter DNA methylation in a target gene-specific manner, and ii) aberrant epigenetic states tend to spread to neighboring promoters in the absence of CTCF insulator binding sites.
Base-Pair Resolution DNA Methylation Sequencing Reveals Profoundly Divergent Epigenetic Landscapes in Acute Myeloid Leukemia
We have developed an enhanced form of reduced representation bisulfite sequencing with extended genomic coverage, which resulted in greater capture of DNA methylation information from regions lying outside of traditional CpG islands. Applying this method to primary human bone marrow specimens from patients with Acute Myelogenous Leukemia (AML), we demonstrated that genetically distinct AML subtypes display diametrically opposed DNA methylation patterns. As compared to normal controls, we observed widespread hypermethylation in IDH mutant AMLs, preferentially targeting promoter regions and CpG islands neighboring the transcription start sites of genes. In contrast, AMLs harboring translocations affecting the MLL gene displayed extensive loss of methylation of an almost mutually exclusive set of CpGs, which instead affected introns and distal intergenic CpG islands and shores. When analyzed in conjunction with gene expression profiles, it became apparent that these specific patterns of DNA methylation result in differing roles in gene expression regulation. However, despite this subtype-specific DNA methylation patterning, a much smaller set of CpG sites is consistently affected in both AML subtypes. Most CpG sites in this common core of aberrantly methylated CpGs were hypermethylated in both AML subtypes. Therefore, aberrant DNA methylation patterns in AML do not occur in a stereotypical manner but rather are highly specific and associated with specific driving genetic lesions.
Clinically relevant patient clusters identified by machine learning from the clinical development programme of secukinumab in psoriatic arthritis
Objectives: Identify distinct clusters of psoriatic arthritis (PsA) patients based on their baseline articular, entheseal and cutaneous disease manifestations and explore their clinical and therapeutic value.
Methods: Pooled baseline data in PsA patients (n=1894) treated with secukinumab across four phase 3 studies (FUTURE 2–5) were analysed to determine phenotypes based on clusters of clinical indicators. Finite mixture model methodology was applied to generate clinical clusters, and mean longitudinal responses were compared between secukinumab doses (300 vs 150 mg) across identified clusters and clinical indicators through week 52 using machine learning (ML) techniques.
Results: Seven distinct patient clusters were identified. Cluster 1 (very-high (VH) – SWO/TEN (swollen/tender); n=187) was characterised by VH polyarticular burden for both tenderness and swelling of joints, while cluster 2 (H (high) – TEN; n=251) was marked by high polyarticular burden in tender joints and cluster 3 (H – Feet – Dactylitis; n=175) by high burden in joints of feet and dactylitis. For cluster 4 (L (Low) – Nails – Skin; n=209), cluster 5 (L – skin; n=283), cluster 6 (L – Nails; n=294) and cluster 7 (L; n=495), articular burden was low but nail and skin involvement was variable, with cluster 7 marked by mild disease activity across all domains. Greater improvements in the longitudinal responses for enthesitis in cluster 2, enthesitis and Psoriasis Area and Severity Index (PASI) in cluster 4 and PASI in cluster 6 were shown for secukinumab 300 mg compared with 150 mg.
Conclusions: PsA clusters identified by ML follow variable response trajectories, indicating their potential to predict precise impact on patients’ outcomes.
Trial registration numbers: NCT01752634, NCT01989468, NCT02294227, NCT0240435
DNA Methylation Dynamics of Germinal Center B Cells Are Mediated by AID
Changes in DNA methylation are required for the formation of germinal centers (GCs), but the mechanisms of such changes are poorly understood. Activation-induced cytidine deaminase (AID) has been recently implicated in DNA demethylation through its deaminase activity coupled with DNA repair. We investigated the epigenetic function of AID in vivo in germinal center B cells (GCBs) isolated from wild-type (WT) and AID-deficient (Aicda−/−) mice. We determined that the transit of B cells through the GC is associated with marked locus-specific loss of methylation and increased methylation diversity, both of which are lost in Aicda−/− animals. Differentially methylated cytosines (DMCs) between GCBs and naive B cells (NBs) are enriched in genes that are targeted for somatic hypermutation (SHM) by AID, and these genes form networks required for B cell development and proliferation. Finally, we observed significant conservation of AID-dependent epigenetic reprogramming between mouse and human B cells.
Genome-wide patterns of aberrant methylation.
(A) Graphical explanation of how the distributions of M-scores and IQR are transformed into violin distribution plots to enable more efficient visualization and comparison of intra- and inter-sample variability. (B) Distribution of the methylation score (M-score, left) and inter-quartile ranges (IQR, right) at probesets in centromeric, telomeric, and intermediate regions for normal and diseased tissues. Bar width is proportional to the number of data points, and the colors are the same as in Figure 1A (http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003137#pgen-1003137-g001). (C) Distributions of M-score (left) and IQR (right) are shown for gene-poor, gene-rich, and intermediate regions.
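The two quantities summarized in these panels can be illustrated with a minimal Python sketch. It assumes that each probeset has one M-score per sample and that the IQR is taken across samples. The probeset names and values are hypothetical toy numbers, not data from the study, and the quartile convention (`method="inclusive"`) is an assumption, since the caption does not specify one.

```python
import statistics

def interquartile_range(values):
    """Return Q3 - Q1 of a list of numbers (inclusive quartile convention)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical M-scores: one list per probeset, one value per sample.
m_scores = {
    "probeset_a": [-1.0, -0.5, 0.0, 0.5, 1.0],  # varies between samples
    "probeset_b": [0.2, 0.2, 0.2, 0.2, 0.2],    # identical in all samples
}

# Per-probeset summary: (median M-score, IQR across samples).
summary = {
    name: (statistics.median(scores), interquartile_range(scores))
    for name, scores in m_scores.items()
}

for name, (median, iqr) in summary.items():
    print(f"{name}: median M-score = {median}, IQR = {iqr}")
```

A violin plot would then be drawn from the collection of per-probeset values within each genomic category. The plotting step is omitted here to keep the sketch free of third-party dependencies.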
Spreading of aberrant methylation to neighboring probesets in the ABC samples.
(A) A schematic representation of how the genome was divided into blocks of genes to study spreading of altered DNA methylation. (B–C) Analysis of spreading of aberrant methylation within genomic neighborhoods. Loci “i” represent probesets that are significantly hypo- (black) or hyper-methylated (grey) in lymphoma samples compared to normal tissues, and loci “i±j” represent both the (i+j)-th and (i−j)-th neighbors of those probesets. For instance, when we focused on probeset #10 (i.e. i = 10), we analyzed spreading of aberrant methylation at probesets #5, 6, 7, 8, 9, 11, 12, 13, 14 and 15. Panel B displays the change in methylation states while panel C shows the change in IQR (variability between samples).
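The neighbor indexing described in this caption can be sketched in a few lines of Python. The function name and the `n_probesets` bound are hypothetical. The maximum offset of 5 is inferred from the caption's own example (probesets #5 to #15 around i = 10) and is treated here as an adjustable assumption.

```python
def neighbor_probesets(i, max_offset=5, n_probesets=None):
    """Return the indices i-j and i+j for j = 1..max_offset, in ascending order.

    Probesets are assumed to be numbered from 1. Indices that fall outside
    1..n_probesets are dropped when n_probesets is given.
    """
    lower = [i - j for j in range(max_offset, 0, -1)]
    upper = [i + j for j in range(1, max_offset + 1)]
    return [
        k for k in lower + upper
        if k >= 1 and (n_probesets is None or k <= n_probesets)
    ]

print(neighbor_probesets(10))  # [5, 6, 7, 8, 9, 11, 12, 13, 14, 15]
```

For i = 10 this reproduces the list of neighbors given in the caption. The focal probeset itself is excluded.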
The insulator factor CTCF prevents spreading of aberrant methylation.
(A) Methylation heterogeneity depends on the density of CTCF-binding sites. Methylation state (M-score, left) and inter-sample methylation variation (IQR, right) are shown for CTCF-BS-poor, CTCF-BS-rich, and intermediate regions. (B) Spreading of aberrant methylation from genomic position “i” to “i±1” (i.e. two neighboring sites) when at least one CTCF-BS is present (black vertical dotted line) and when no CTCF-BS is present (light grey vertical dotted line) between “i” and “i±1”, for aberrant hypo-methylation (two left panels) and aberrant hyper-methylation (two right panels). The presence of CTCF-BS more efficiently restricts the spreading of aberrant hypo-methylation. (C) A schematic overview showing spreading of abnormal methylation in the absence of CTCF-binding sites in the genomic neighborhood.
Genomic localization of transcriptional regulators and AICDA associates with sites of aberrant DNA methylation.
(A–D) Methylation heterogeneity of promoters of genes that are targets of master regulators. The panels display the distribution of methylation scores (M-scores) for promoters of target genes of (A) BCL6, (B) MYC, (C) EZH2, and (D) AICDA. (E) A schematic overview showing targeted abnormal promoter methylation by master regulators such as MYC, BCL6, EZH2 and AICDA in the lymphoma subtypes.
The extent of DNA methylation aberration is predictive of patient survival.
(A) Phylogenetic tree, as estimated based on the correlation of group-averaged M-scores. Departure from normal methylation patterns is correlated with disease severity of the lymphoma samples. (B–C) Kaplan-Meier curves for risk groups defined according to their methylation distance score (i.e. distance from normal B-cells), which reflects how different a sample's methylation profile is from that of NBC or NGC, for all DLBCL (GCB and ABC) samples. (B) Multivariate analysis with the International Prognostic Index (IPI) and distance to NBC. (C) Only IPI.
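The caption does not state how the methylation distance score is computed. As an illustration only, the sketch below assumes one common choice, a correlation distance (1 minus the Pearson correlation) between a sample's M-score profile and a reference profile such as the average of normal B-cells. The function names are hypothetical, and this is not presented as the authors' formula.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_distance(sample_profile, reference_profile):
    """Assumed distance: 0 for perfectly correlated profiles, 2 for anti-correlated."""
    return 1.0 - pearson(sample_profile, reference_profile)
```

Under this assumption, a sample whose profile tracks the reference closely receives a distance near 0, and larger values indicate a greater departure from the normal pattern.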