17 research outputs found

    Schematic illustration of proportion of discordant reads (PDR).

    No full text
    Schematic illustration of proportion of discordant reads (PDR).

    Association between methylation entropy and cancer stemness.

    No full text
    (A) Genes were ranked by the Pearson’s correlation between their expression and average methylation entropy levels across promoters. Red dots represent 3,680 genes having statistically significant correlations (Benjamini-Hochberg adjusted p-value WNT7A and CTNND2). (E) The association between promoter methylation entropy levels and the activity of Wnt signaling pathway. *two-tailed independent t-test p < 0.05; In D-E, Pearson’s correlation coefficients and associated p-values are shown. In D, p-values were adjusted using Benjamini-Hochberg procedure.

    Performance comparison with WSHPackage using multiple threads for 20M RRBS-simulated reads.

    No full text
    Performance comparison with WSHPackage using multiple threads for 20M RRBS-simulated reads.

    Association between stemness of cancer cells and other DNA methylation heterogeneity measures.

    No full text
    Association between stemness of cancer cells and other DNA methylation heterogeneity measures.

    Schematic illustration of local pairwise methylation discordance (LPMD).

    No full text
    Schematic illustration of local pairwise methylation discordance (LPMD).

    Robustness of LPMD against the choice of genomic distance window.

    No full text
    Robustness of LPMD against the choice of genomic distance window.

    Benchmarking the running time and memory usage of Metheor using simulated pseudo-WGBS dataset.

    No full text
    Benchmarking the running time and memory usage of Metheor using simulated pseudo-WGBS dataset.

    Promoter PDRs of tumor suppressors and oncogenes.

    No full text
    Phased DNA methylation states within bisulfite sequencing reads are a valuable source of information that can be used to estimate epigenetic diversity across cells as well as epigenomic instability in individual cells. Various measures capturing the heterogeneity of DNA methylation states have been proposed over the past decade. However, routine DNA methylation analyses often ignore this heterogeneity by computing average methylation levels at CpG sites, even though the information exists in bisulfite sequencing data in the form of phased methylation states, or methylation patterns. In this study, to facilitate the application of DNA methylation heterogeneity measures in downstream epigenomic analyses, we present Metheor, an extremely fast and lightweight bioinformatics toolkit written in Rust. Because the analysis of DNA methylation heterogeneity requires examining pairs or groups of CpGs throughout the genome, existing software suffers from a high computational burden, which makes large-scale DNA methylation heterogeneity studies nearly intractable for researchers with limited resources. We benchmark the performance of Metheor against existing implementations of DNA methylation heterogeneity measures in three different scenarios of simulated bisulfite sequencing datasets. Metheor reduced execution time by up to 300-fold and memory footprint by up to 60-fold while producing results identical to those of the original implementations, thereby facilitating large-scale studies of DNA methylation heterogeneity profiles. To demonstrate the utility of this low computational burden, we show that the methylation heterogeneity profiles of 928 cancer cell lines can be computed with standard computing resources. With those profiles, we reveal associations between DNA methylation heterogeneity and various omics features.
Source code for Metheor is available at https://github.com/dohlee/metheor under the GPL-3.0 license.
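
The abstract's point that averaging hides read-level heterogeneity can be illustrated with a small sketch. The code below is not taken from Metheor; the type `Read`, the functions `mean_methylation` and `pdr`, and the toy data are hypothetical names introduced here. It assumes a simplified, locus-level form of PDR (the fraction of reads with mixed methylation states among reads covering at least `min_cpgs` CpGs, with 4 used as the threshold), whereas the published measure is reported per CpG.

```rust
/// One bisulfite read, reduced to the methylation calls at the CpGs it covers
/// (true = methylated, false = unmethylated). Hypothetical toy representation.
type Read = Vec<bool>;

/// Average methylation level: methylated calls divided by all calls.
/// Returns None when there are no calls at all.
fn mean_methylation(reads: &[Read]) -> Option<f64> {
    let total: usize = reads.iter().map(|r| r.len()).sum();
    if total == 0 {
        return None;
    }
    let methylated: usize = reads
        .iter()
        .map(|r| r.iter().filter(|&&m| m).count())
        .sum();
    Some(methylated as f64 / total as f64)
}

/// Simplified proportion of discordant reads (assumption: locus-level, not per CpG).
/// A read is discordant if it carries both methylated and unmethylated calls.
/// Only reads covering at least `min_cpgs` CpGs are counted.
fn pdr(reads: &[Read], min_cpgs: usize) -> Option<f64> {
    let eligible: Vec<&Read> = reads.iter().filter(|r| r.len() >= min_cpgs).collect();
    if eligible.is_empty() {
        return None;
    }
    let discordant = eligible
        .iter()
        .filter(|r| r.iter().any(|&m| m) && r.iter().any(|&m| !m))
        .count();
    Some(discordant as f64 / eligible.len() as f64)
}

fn main() {
    // Two toy loci with the same average methylation but different patterns.
    let concordant: Vec<Read> = vec![
        vec![true; 4],
        vec![true; 4],
        vec![false; 4],
        vec![false; 4],
    ];
    let mixed: Vec<Read> = vec![
        vec![true, false, true, false],
        vec![false, true, false, true],
        vec![true, true, false, false],
        vec![false, false, true, true],
    ];

    for (name, reads) in [("concordant", &concordant), ("mixed", &mixed)] {
        println!(
            "{name}: mean = {:?}, PDR = {:?}",
            mean_methylation(reads),
            pdr(reads, 4)
        );
    }
}
```

Both toy loci have half of their calls methylated, so the average alone cannot separate them, while the read-level measure does: no read is discordant in the first locus and every read is discordant in the second.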

    Details of the algorithms used in Metheor implementation and simulated data preparation used for benchmark.

    No full text
    Details of the algorithms used in Metheor implementation and simulated data preparation used for benchmark.

    Performance benchmark and validity of the results.

    No full text
    Benchmarking the running time of Metheor using (A) simulated RRBS dataset and (B) Ewing sarcoma RRBS dataset. Values below the name of each measure denote the speedup (in fold) of Metheor compared to its benchmark counterpart. Benchmarking the memory usage of Metheor using (C) simulated RRBS dataset and (D) Ewing sarcoma RRBS dataset. Values below the name of each measure denote the reduction in memory usage (in fold) of Metheor compared to its benchmark counterpart. All benchmark experiments were repeated three times, except for MHL. Lines denote the average wall time and shades represent the 95% confidence interval. The wall time for MHL computation was measured only once. (E) Validity of the results. CpG-wise (PDR, MHL, FDRP and qFDRP) and CpG quartet-wise (PM and ME) methylation heterogeneity levels were compared between Metheor and the corresponding reference implementations. Pearson’s correlation coefficients and corresponding p-values are shown for FDRP and qFDRP.