291 research outputs found

    A comparison of feature selection and classification methods in DNA methylation studies using the Illumina Infinium platform

    Background: The 27k Illumina Infinium Methylation Beadchip is a popular high-throughput technology that allows the methylation state of over 27,000 CpGs to be assayed. While feature selection and classification methods have been comprehensively explored in the context of gene expression data, relatively little is known as to how best to perform feature selection or classification in the context of Illumina Infinium methylation data. Given the rising importance of epigenomics in cancer and other complex genetic diseases, and in view of the upcoming epigenome wide association studies, it is critical to identify the statistical methods that offer improved inference in this novel context. Results: Using a total of 7 large Illumina Infinium 27k Methylation data sets, encompassing over 1,000 samples from a wide range of tissues, we here provide an evaluation of popular feature selection, dimensional reduction and classification methods on DNA methylation data. Specifically, we evaluate the effects of variance filtering, supervised principal components (SPCA) and the choice of DNA methylation quantification measure on downstream statistical inference. We show that for relatively large sample sizes feature selection using test statistics is similar for M and β-values, but that in the limit of small sample sizes, M-values allow more reliable identification of true positives. We also show that the effect of variance filtering on feature selection is study-specific and dependent on the phenotype of interest and tissue type profiled. Specifically, we find that variance filtering improves the detection of true positives in studies with large effect sizes, but that it may lead to worse performance in studies with smaller yet significant effect sizes. In contrast, supervised principal components improves the statistical power, especially in studies with small effect sizes. 
We also demonstrate that classification using the Elastic Net and Support Vector Machine (SVM) clearly outperforms competing methods like LASSO and SPCA. Finally, in unsupervised modelling of cancer diagnosis, we find that non-negative matrix factorisation (NMF) clearly outperforms principal components analysis. Conclusions: Our results highlight the importance of tailoring the feature selection and classification methodology to the sample size and biological context of the DNA methylation study. The Elastic Net emerges as a powerful classification algorithm for large-scale DNA methylation studies, while NMF does well in the unsupervised context. The insights presented here will be useful to any study embarking on large-scale DNA methylation profiling using Illumina Infinium beadarrays.
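
The abstract above compares two quantification measures, β-values and M-values, without stating how they relate. As a minimal sketch, assuming the conventional definition in which the M-value is the base-2 logit of the β-value (the function names `beta_to_m` and `m_to_beta` are hypothetical and not taken from the paper):

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a methylation beta-value (0 < beta < 1) to an M-value.

    Assumes the conventional definition M = log2(beta / (1 - beta)).
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Invert the transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


if __name__ == "__main__":
    # Illustrative beta-values only, not data from the study.
    for beta in (0.1, 0.5, 0.8):
        print(f"beta={beta:.2f} -> M={beta_to_m(beta):+.3f}")
```

Under this definition, β-values are bounded in (0, 1) while M-values are unbounded and stretch the extremes of the scale, which may be one reason the two measures can behave differently in test statistics at small sample sizes.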

    Prevention in the age of personal responsibility: Epigenetic risk-predictive screening for female cancers as a case study

    Epigenetic markers could potentially be used for risk assessment in risk-stratified population-based cancer screening programmes. Whereas current screening programmes generally aim to detect existing cancer, epigenetic markers could be used to provide risk estimates for not-yet-existing cancers. Epigenetic risk-predictive tests may thus allow for new opportunities for risk assessment for developing cancer in the future. Since epigenetic changes are presumed to be modifiable, preventive measures, such as lifestyle modification, could be used to reduce the risk of cancer. Moreover, epigenetic markers might be used to monitor the response to risk-reducing interventions. In this article, we address ethical concerns related to personal responsibility raised by epigenetic risk-predictive tests in cancer population screening. Will individuals increasingly be held responsible for their health, that is, will they be held accountable for bad health outcomes? Will they be blamed or subject to moral sanctions? We will illustrate these ethical concerns by means of a Europe-wide research programme that develops an epigenetic risk-predictive test for female cancers. Subsequently, we investigate when we can hold someone responsible for her actions. We argue that the standard conception of personal responsibility does not provide an appropriate framework to address these concerns. A different, prospective account of responsibility meets part of our concerns, that is, concerns about inequality of opportunities, but does not meet all our concerns about personal responsibility. We argue that even if someone is responsible on grounds of a negative and/or prospective account of responsibility, there may be moral and practical reasons to abstain from moral sanctions

    The WID-CIN test identifies women with, and at risk of, cervical intraepithelial neoplasia grade 3 and invasive cervical cancer

    BACKGROUND: Cervical screening is transitioning from primary cytology to primary human papillomavirus (HPV) testing. HPV testing is highly sensitive but there is currently no high-specificity triage method for colposcopy referral to detect cervical intraepithelial neoplasia grade 3 or above (CIN3+) in women positive for high-risk (hr) HPV subtypes. An objective, automatable test that could accurately perform triage, independently of sample heterogeneity and age, is urgently required. METHODS: We analyzed DNA methylation at ~850,000 CpG sites across the genome in a total of 1254 cervical liquid-based cytology (LBC) samples from cases of screen-detected histologically verified CIN1-3+ (98% hrHPV-positive) and population-based control women free from any cervical disease (100% hrHPV-positive). Samples were provided by a state-of-the-art population-based cohort biobank and consisted of (i) a discovery set of 170 CIN3+ cases and 202 hrHPV-positive/cytology-negative controls; (ii) a diagnostic validation set of 87 CIN3+, 90 CIN2, 166 CIN1, and 111 hrHPV-positive/cytology-negative controls; and (iii) a predictive validation set of 428 cytology-negative samples (418 hrHPV-positive) of which 210 were diagnosed with CIN3+ in the upcoming 1-4 years and 218 remained disease-free. RESULTS: We developed the WID-CIN (Women's cancer risk IDentification-Cervical Intraepithelial Neoplasia) test, a DNA methylation signature consisting of 5000 CpG sites. The receiver operating characteristic area under the curve (AUC) in the independent diagnostic validation set was 0.92 (95% CI 0.88-0.96). At 75% specificity (≤CIN1), the overall sensitivity to detect CIN3+ is 89.7% (83.3-96.1) in all and 92.7% (85.9-99.6) and 65.6% (49.2-82.1) in women aged ≥30 and <30. 
In hrHPV-positive/cytology-negative samples in the predictive validation set, the WID-CIN detected 54.8% (48.0-61.5) cases developing 1-4 years after sample donation in all ages or 56.9% (47.6-66.2) and 53.5% (43.7-63.2) in ≥30 and <30-year-old women, at a specificity of 75%. CONCLUSIONS: The WID-CIN test identifies the vast majority of hrHPV-positive women with current CIN3+ lesions. In the absence of cytologic abnormalities, a positive WID-CIN test result is likely to indicate a significantly increased risk of developing CIN3+ in the near future

    HPV-induced host epigenetic reprogramming is lost upon progression to high-grade cervical intraepithelial neoplasia

    The impact of a pathogen on host disease can only be studied in samples covering the entire spectrum of pathogenesis. Persistent oncogenic human papilloma virus (HPV) infection is the most common cause for cervical cancer. Here, we investigate HPV-induced host epigenome-wide changes prior to development of cytological abnormalities. Using cervical sample methylation array data from disease-free women with or without an oncogenic HPV infection, we develop the WID (Women's cancer risk identification)-HPV, a signature reflective of changes in the healthy host epigenome related to high-risk HPV strains (AUC = 0.78, 95% CI: 0.72-0.85, in nondiseased women). Looking at HPV-associated changes across disease development, HPV-infected women with minor cytological alterations (cervical intraepithelial neoplasia grade 1/2, CIN1/2), but surprisingly not those with precancerous changes or invasive cervical cancer (CIN3+), show an increased WID-HPV index, indicating the WID-HPV may reflect a successful viral clearance response absent in progression to cancer. Further investigation revealed the WID-HPV is positively associated with apoptosis (ρ = 0.48; P < .001) and negatively associated with epigenetic replicative age (ρ = −0.43; P < .001). Taken together, our data suggest the WID-HPV captures a clearance response associated with apoptosis of HPV-infected cells. This response may be dampened or lost with increased underlying replicative age of infected cells, resulting in progression to cancer

    DNA methylation at quantitative trait loci (mQTLs) varies with cell type and nonheritable factors and may improve breast cancer risk assessment

    To individualise breast cancer (BC) prevention, markers to follow a person’s changing environment and health extending beyond static genetic risk scores are required. Here, we analysed cervical and breast DNA methylation (n = 1848) and single nucleotide polymorphisms (n = 1442) and demonstrate that a linear combination of methylation levels at 104 BC-associated methylation quantitative trait loci (mQTL) CpGs, termed the WID™-qtBC index, can identify women with breast cancer in hormone-sensitive tissues (AUC = 0.71 [95% CI: 0.65–0.77] in cervical samples). Women in the highest combined risk group (high polygenic risk score and WID™-qtBC) had a 9.6-fold increased risk for BC [95% CI: 4.7–21] compared to the low-risk group and tended to present at more advanced stages. Importantly, the WID™-qtBC is influenced by non-genetic BC risk factors, including age and body mass index, and can be modified by a preventive pharmacological intervention, indicating an interaction between genome and environment recorded at the level of the epigenome. Our findings indicate that methylation levels at mQTLs in relevant surrogate tissues could enable integration of heritable and non-heritable factors for improved disease risk stratification

    Correlation of an epigenetic mitotic clock with cancer risk.

    BACKGROUND: Variation in cancer risk among somatic tissues has been attributed to variations in the underlying rate of stem cell division. For a given tissue type, variable cancer risk between individuals is thought to be influenced by extrinsic factors which modulate this rate of stem cell division. To date, no molecular mitotic clock has been developed to approximate the number of stem cell divisions in a tissue of an individual and which is correlated with cancer risk. RESULTS: Here, we integrate mathematical modeling with prior biological knowledge to construct a DNA methylation-based age-correlative model which approximates a mitotic clock in both normal and cancer tissue. By focusing on promoter CpG sites that localize to Polycomb group target genes that are unmethylated in 11 different fetal tissue types, we show that increases in DNA methylation at these sites defines a tick rate which correlates with the estimated rate of stem cell division in normal tissues. Using matched DNA methylation and RNA-seq data, we further show that it correlates with an expression-based mitotic index in cancer tissue. We demonstrate that this mitotic-like clock is universally accelerated in cancer, including pre-cancerous lesions, and that it is also accelerated in normal epithelial cells exposed to a major carcinogen. CONCLUSIONS: Unlike other epigenetic and mutational clocks or the telomere clock, the epigenetic clock proposed here provides a concrete example of a mitotic-like clock which is universally accelerated in cancer and precancerous lesions

    Epigenotyping in Peripheral Blood Cell DNA and Breast Cancer Risk: A Proof of Principle Study

    Background: Epigenetic changes are emerging as one of the most important events in carcinogenesis. Two alterations in the pattern of DNA methylation in breast cancer (BC) have been previously reported; active estrogen receptor-α (ER-α) is associated with decreased methylation of ER-α target (ERT) genes, and polycomb group target (PCGT) genes are more likely than other genes to have promoter DNA hypermethylation in cancer. However, whether DNA methylation in normal unrelated cells is associated with BC risk, and whether these imprints can be related to factors which can be modified by the environment, is unclear. Methodology/Principal Findings: Using quantitative methylation analysis in a case-control study (n = 1,083) we found that DNA methylation of peripheral blood cell DNA provides good prediction of BC risk. We also report that invasive ductal and invasive lobular BC are characterized by two different sets of genes, the latter in particular by genes involved in the differentiation of the mesenchyme (PITX2, TITF1, GDNF and MYOD1). Finally we demonstrate that only ERT genes predict ER positive BC; lack of peripheral blood cell DNA methylation of ZNF217 predicted BC independent of age and family history (odds ratio 1.49; 95% confidence interval 1.12-1.97; P = 0.006) and was associated with ER-α bioactivity in the corresponding serum. Conclusion/Significance: This first large-scale epigenotyping study demonstrates that DNA methylation may serve as a link between the environment and the genome. Factors that can be modulated by the environment (like estrogens) leave an imprint in the DNA of cells that are unrelated to the target organ and indicate the predisposition to develop a cancer. Further research will need to demonstrate whether DNA methylation profiles will be able to serve as a new tool to predict the risk of developing chronic diseases with sufficient accuracy to guide preventive measures.

    Performance of the WID-qEC test versus sonography to detect uterine cancers in women with abnormal uterine bleeding (EPI-SURE): a prospective, consecutive observational cohort study in the UK

    BACKGROUND: To detect uterine cancer, simpler and more specific index tests are needed to triage women with abnormal uterine bleeding to a reference histology test. We aimed to compare the performance of conventional index imaging tests with the novel WID-qEC DNA methylation test in terms of detecting the presence or absence of uterine cancers in women with abnormal uterine bleeding. METHODS: EPI-SURE was a prospective, observational study that invited all women aged 45 years and older with abnormal uterine bleeding attending a tertiary gynaecological diagnostic referral centre at University College London Hospital (London, UK) to participate. Women meeting these inclusion criteria who consented to participate were included. Pregnant women and those with previous hysterectomy were excluded. A cervicovaginal sample for the WID-qEC test was obtained before standard assessment using index imaging tests (ie, ultrasound) and, where applicable, reference histology (ie, biopsy, hysteroscopy, or both) was performed. Technicians performing the WID-qEC test were masked to the final clinical outcome. The result of the WID-qEC test is defined as the sum of the percentage of fully methylated reference (ΣPMR) of the ZSCAN12 and GYPC regions. Patients were followed until diagnostic resolution or until June 12, 2023. The primary outcome was to assess the real-world performance of the WID-qEC test in comparison with ultrasound with regard to the area under the receiver-operating-characteristic curve (AUC), sensitivity, specificity, and positive and negative predictive values. EPI-SURE is registered with ISRCTN (16815568). FINDINGS: From June 1, 2022, to Nov 24, 2022, 474 women were deemed eligible to participate. 74 did not accept the invitation to participate, and one woman withdrew after providing consent. 399 women were included in the primary analysis cohort. 
Based on 603 index imaging tests, 186 (47%) women were recommended for a reference histology test (ie, biopsy, hysteroscopy, or both). 12 women were diagnosed with cancer, 375 were not diagnosed with cancer, and 12 had inconclusive clinical outcomes and were considered study dropouts. 198 reference histology test procedures detected nine cases of cancer and missed two; one further cancer was directly diagnosed at hysterectomy without a previous reference test. The AUC for detection of uterine cancer based on endometrial thickness in mm was 87·2% (95% CI 71·1-100·0) versus 94·3% (84·7-100·0) based on WID-qEC (p=0·48). Endometrial thickness assessment on ultrasound scan was possible in 379 (95%) of the 399 women and a prespecified cut-off of 4·5 mm or more showed a sensitivity of 90·9% (95% CI 62·3-98·4), a specificity of 79·1% (74·5-82·9), a positive predictive value of 11·8% (6·5-20·3), and a negative predictive value of 99·6% (98·0-99·9). The WID-qEC test was possible in 390 (98%) of the 399 patients with a sensitivity of 90·9% (95% CI 62·3-98·4), a specificity of 92·1% (88·9-94·4), a positive predictive value of 25·6% (14·6-41·1), and a negative predictive value of 99·7% (98·3-99·9), when the prespecified threshold of 0·03 ΣPMR or more was applied. When a higher threshold (≥0·3 ΣPMR) was applied the specificity increased to 97·3% (95% CI 95·1-98·5) without a change in sensitivity. INTERPRETATION: The WID-qEC test delivers fast results and shows improved performance compared with a combination of imaging index tests. Triage of women with abnormal uterine bleeding using the WID-qEC test could reduce the number of women requiring histological assessments for identification of potential malignancy and specifically reduce the false positive rate. FUNDING: The Eve Appeal, Land Tirol, and the European Research Council under the European Union's Horizon 2020 Research and Innovation Programme
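
The abstract above reports sensitivity, specificity, and positive and negative predictive values for both index tests. As a minimal sketch of how these four quantities follow from a 2x2 table of test result against reference outcome (the `ConfusionCounts` type, the `diagnostic_metrics` function, and the example counts are hypothetical illustrations, not the study's data):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Standard diagnostic accuracy measures from a 2x2 table."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical low-prevalence counts chosen for illustration only.
example = ConfusionCounts(tp=8, fp=20, fn=2, tn=170)
metrics = diagnostic_metrics(example)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

With few true cases relative to non-cases, even a modest number of false positives keeps the positive predictive value low while the negative predictive value stays high, which is the general pattern visible in the figures reported for both tests.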

    Risk algorithm using serial biomarker measurements doubles the number of screen-detected cancers compared with a single-threshold rule in the United Kingdom collaborative trial of ovarian cancer screening

    PURPOSE: Cancer screening strategies have commonly adopted single-biomarker thresholds to identify abnormality. We investigated the impact of serial biomarker change interpreted through a risk algorithm on cancer detection rates. PATIENTS AND METHODS: In the United Kingdom Collaborative Trial of Ovarian Cancer Screening, 46,237 women, age 50 years or older underwent incidence screening by using the multimodal strategy (MMS) in which annual serum cancer antigen 125 (CA-125) was interpreted with the risk of ovarian cancer algorithm (ROCA). Women were triaged by the ROCA: normal risk, returned to annual screening; intermediate risk, repeat CA-125; and elevated risk, repeat CA-125 and transvaginal ultrasound. Women with persistently increased risk were clinically evaluated. All participants were followed through national cancer and/or death registries. Performance characteristics of a single-threshold rule and the ROCA were compared by using receiver operating characteristic curves. RESULTS: After 296,911 women-years of annual incidence screening, 640 women underwent surgery. Of those, 133 had primary invasive epithelial ovarian or tubal cancers (iEOCs). In all, 22 interval iEOCs occurred within 1 year of screening, of which one was detected by ROCA but was managed conservatively after clinical assessment. The sensitivity and specificity of MMS for detection of iEOCs were 85.8% (95% CI, 79.3% to 90.9%) and 99.8% (95% CI, 99.8% to 99.8%), respectively, with 4.8 surgeries per iEOC. ROCA alone detected 87.1% (135 of 155) of the iEOCs. Using fixed CA-125 cutoffs at the last annual screen of more than 35, more than 30, and more than 22 U/mL would have identified 41.3% (64 of 155), 48.4% (75 of 155), and 66.5% (103 of 155), respectively. The area under the curve for ROCA (0.915) was significantly (P = .0027) higher than that for a single-threshold rule (0.869). CONCLUSION: Screening by using ROCA doubled the number of screen-detected iEOCs compared with a fixed cutoff. 
In the context of cancer screening, reliance on predefined single-threshold rules may result in biomarkers of value being discarded
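
The detection percentages quoted in the abstract above can be reproduced from the counts it gives (each out of 155 iEOCs). The sketch below does only that arithmetic; the variable and function names are hypothetical, and reading "doubled" as the ratio of ROCA detections to detections at the more than 35 U/mL cutoff is an assumption, since the abstract does not say which fixed cutoff the comparison uses.

```python
TOTAL_IEOC = 155  # total iEOCs used as the denominator in the abstract

# Numbers of iEOCs identified by each rule, as reported in the abstract.
detected = {
    "ROCA": 135,
    "CA-125 > 35 U/mL": 64,
    "CA-125 > 30 U/mL": 75,
    "CA-125 > 22 U/mL": 103,
}


def detection_rate(n_detected: int, total: int = TOTAL_IEOC) -> float:
    """Percentage of cancers identified, rounded to one decimal place."""
    return round(100.0 * n_detected / total, 1)


rates = {rule: detection_rate(n) for rule, n in detected.items()}

# Assumed comparison: ROCA against the > 35 U/mL fixed cutoff.
roca_vs_35 = detected["ROCA"] / detected["CA-125 > 35 U/mL"]

if __name__ == "__main__":
    for rule, pct in rates.items():
        print(f"{rule}: {pct}%")
    print(f"ROCA vs > 35 U/mL cutoff: {roca_vs_35:.2f}x")
```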