4 research outputs found
BoostMe accurately predicts DNA methylation values in whole-genome bisulfite sequencing of multiple human tissues
Abstract
Background
Bisulfite sequencing is widely employed to study the role of DNA methylation in disease; however, the data suffer from biases due to coverage depth variability. Imputation of methylation values at low-coverage sites may mitigate these biases while also identifying important genomic features associated with predictive power.
Results
Here we describe BoostMe, a method for imputing low-quality DNA methylation estimates within whole-genome bisulfite sequencing (WGBS) data. BoostMe uses a gradient boosting algorithm, XGBoost, and leverages information from multiple samples for prediction. We find that BoostMe outperforms existing algorithms in speed and accuracy when applied to WGBS of human tissues. Furthermore, we show that imputation improves concordance between WGBS and the MethylationEPIC array at low WGBS depth, suggesting improved WGBS accuracy after imputation.
Conclusions
Our findings support the use of BoostMe as a preprocessing step for WGBS analysis.https://deepblue.lib.umich.edu/bitstream/2027.42/143848/1/12864_2018_Article_4766.pd
Recommended from our members
Integrated genomic and molecular characterization of cervical cancer.
Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, one of the largest comprehensive genomic studies of cervical cancer to date. We observed notable APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered amplifications in immune targets CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2), and the BCAR4 long non-coding RNA, which has been associated with response to lapatinib. Integration of human papilloma virus (HPV) was observed in all HPV18-related samples and 76% of HPV16-related samples, and was associated with structural aberrations and increased target-gene expression. We identified a unique set of endometrial-like cervical cancers, comprised predominantly of HPV-negative tumours with relatively high frequencies of KRAS, ARID1A and PTEN mutations. Integrative clustering of 178 samples identified keratin-low squamous, keratin-high squamous and adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers
Recommended from our members
Integrated genomic and molecular characterization of cervical cancer.
Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, one of the largest comprehensive genomic studies of cervical cancer to date. We observed notable APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered amplifications in immune targets CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2), and the BCAR4 long non-coding RNA, which has been associated with response to lapatinib. Integration of human papilloma virus (HPV) was observed in all HPV18-related samples and 76% of HPV16-related samples, and was associated with structural aberrations and increased target-gene expression. We identified a unique set of endometrial-like cervical cancers, comprised predominantly of HPV-negative tumours with relatively high frequencies of KRAS, ARID1A and PTEN mutations. Integrative clustering of 178 samples identified keratin-low squamous, keratin-high squamous and adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers