    Identifying and correcting epigenetics measurements for systematic sources of variation

    Background: Methylation measures quantified by microarray techniques can be affected by systematic variation due to the technical processing of samples, which may compromise the accuracy of the measurement process and bias the estimate of the association under investigation. The quantification of the contribution of the systematic source of variation is challenging in datasets characterized by hundreds of thousands of features. In this study, we introduce a method previously developed for the analysis of metabolomics data to evaluate the performance of existing normalizing techniques to correct for unwanted variation. Illumina Infinium HumanMethylation450K was used to acquire methylation levels in over 421,000 CpG sites for 902 study participants of a case-control study on breast cancer nested within the EPIC cohort. The principal component partial R-square (PC-PR2) analysis was used to identify and quantify the variability attributable to potential systematic sources of variation. Three correcting techniques, namely ComBat, surrogate variables analysis (SVA) and a linear regression model to compute residuals, were applied. The impact of each correcting method on the association between smoking status and DNA methylation levels was evaluated, and results were compared with findings from a large meta-analysis. Results: A sizeable proportion of systematic variability due to variables expressing ‘batch’ and ‘sample position’ within ‘chip’ was identified, with values of the partial R² statistic equal to 9.5 and 11.4% of total variation, respectively. After application of ComBat or the residuals’ methods, the contribution was 1.3 and 0.2%, respectively. The SVA technique resulted in a reduced variability due to ‘batch’ (1.3%) and ‘sample position’ (0.6%), and in a diminished variability attributable to ‘chip’ within a batch (0.9%).
After ComBat or the residuals’ corrections, a larger number of significant sites (k = 600 and k = 427, respectively) were associated with smoking status than after the SVA correction (k = 96). Conclusions: The three correction methods removed systematic variation in DNA methylation data, as assessed by the PC-PR2, which lent itself as a useful tool to explore variability in large dimension data. SVA produced more conservative findings than ComBat in the association between smoking and DNA methylation.
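
The PC-PR2 idea summarized in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes numeric covariates (a categorical variable such as 'batch' would need dummy coding first), an 80% cumulative-variance cutoff for retaining components (`var_threshold`, a hypothetical default), and a partial R² defined as the relative drop in residual sum of squares when a covariate is added to the model, averaged over components with eigenvalue weights. The name `pc_pr2` and its parameters are illustrative.

```python
import numpy as np

def pc_pr2(X, Z, var_threshold=0.8):
    """Eigenvalue-weighted partial R^2 of each covariate across principal components.

    X: (n_samples, n_features) data matrix, e.g. methylation values.
    Z: (n_samples, n_covariates) numeric covariates.
    Returns one value per column of Z.
    """
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    Xc = X - X.mean(axis=0)

    # PCA via SVD: squared singular values are proportional to the eigenvalues.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s ** 2
    explained = np.cumsum(eigvals) / eigvals.sum()

    # Keep the smallest number of components reaching the variance cutoff.
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, eigvals.size)
    scores = U[:, :k] * s[:k]
    weights = eigvals[:k] / eigvals[:k].sum()

    full = np.hstack([np.ones((X.shape[0], 1)), Z])

    def sse(design, y):
        # Residual sum of squares of an ordinary least-squares fit.
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid)

    partial = np.zeros(Z.shape[1])
    for j in range(Z.shape[1]):
        reduced = np.delete(full, j + 1, axis=1)  # model without covariate j
        for i in range(k):
            y = scores[:, i]
            sse_red = sse(reduced, y)
            sse_full = sse(full, y)
            pr2 = (sse_red - sse_full) / sse_red if sse_red > 0 else 0.0
            partial[j] += weights[i] * pr2
    return partial
```

Running the function before and after a correction step (for example, on residuals from a regression on batch) gives a before/after comparison of the kind the abstract reports.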

    Genetic variation in the ADIPOQ gene, adiponectin concentrations and risk of colorectal cancer: a Mendelian Randomization analysis using data from three large cohort studies

    Higher levels of circulating adiponectin have been related to lower risk of colorectal cancer in several prospective cohort studies, but it remains unclear whether this association may be causal. We aimed to improve causal inference in a Mendelian Randomization meta-analysis using nested case–control studies of the European Prospective Investigation into Cancer and Nutrition (EPIC, 623 cases, 623 matched controls), the Health Professionals Follow-up Study (HPFS, 231 cases, 230 controls) and the Nurses’ Health Study (NHS, 399 cases, 774 controls) with available data on pre-diagnostic adiponectin concentrations and selected single nucleotide polymorphisms in the ADIPOQ gene. We created an ADIPOQ allele score that explained approximately 3% of the interindividual variation in adiponectin concentrations. The ADIPOQ allele score was not associated with risk of colorectal cancer in logistic regression analyses (pooled OR per score unit 0.97, 95% CI 0.91, 1.04). Genetically determined twofold higher adiponectin was not significantly associated with risk of colorectal cancer using the ADIPOQ allele score as instrumental variable (pooled OR 0.73, 95% CI 0.40, 1.34). In a summary instrumental variable analysis (based on previously published data) with higher statistical power, no association between genetically determined twofold higher adiponectin and risk of colorectal cancer was observed (0.99, 95% CI 0.93, 1.06 in women and 0.94, 95% CI 0.88, 1.01 in men). Thus, our study does not support a causal effect of circulating adiponectin on colorectal cancer risk. Due to the limited genetic determination of adiponectin, larger Mendelian Randomization studies are necessary to clarify whether adiponectin is causally related to lower risk of colorectal cancer.
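
The instrumental variable logic in the abstract above can be sketched with the Wald ratio, a common single-instrument estimator in Mendelian randomization. The abstract does not say which estimator the authors used, so this is an assumed, simplified illustration: the function names and the example numbers are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the instrument-exposure association.

```python
import math

def wald_ratio(beta_gx, beta_gy, se_gy):
    """Wald ratio estimate of the causal log odds ratio per unit of exposure.

    beta_gx: association of the genetic instrument with the exposure.
    beta_gy: association of the instrument with the outcome (log odds ratio).
    se_gy:   standard error of beta_gy.
    """
    if beta_gx == 0:
        raise ValueError("instrument must be associated with the exposure")
    estimate = beta_gy / beta_gx
    # First-order approximation: treats beta_gx as known without error.
    se = se_gy / abs(beta_gx)
    return estimate, se

def odds_ratio_ci(log_or, se, z=1.96):
    """Convert a log odds ratio and its SE to an odds ratio with a 95% CI."""
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Hypothetical inputs, not taken from the study.
estimate, se = wald_ratio(beta_gx=0.10, beta_gy=-0.03, se_gy=0.035)
print(odds_ratio_ci(estimate, se))
```

A weak instrument (small `beta_gx`) inflates the standard error, which is one way to read the abstract's closing remark that larger studies are needed when the genetic determination of the exposure is limited.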

    Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes

    Stratification of women according to their risk of breast cancer based on polygenic risk scores (PRSs) could improve screening and prevention strategies. Our aim was to develop PRSs, optimized for prediction of estrogen receptor (ER)-specific disease, from the largest available genome-wide association dataset and to empirically validate the PRSs in prospective studies. The development dataset comprised 94,075 case subjects and 75,017 control subjects of European ancestry from 69 studies, divided into training and validation sets. Samples were genotyped using genome-wide arrays, and single-nucleotide polymorphisms (SNPs) were selected by stepwise regression or lasso penalized regression. The best performing PRSs were validated in an independent test set comprising 11,428 case subjects and 18,323 control subjects from 10 prospective studies and 190,040 women from UK Biobank (3,215 incident breast cancers). For the best PRSs (313 SNPs), the odds ratio for overall disease per 1 standard deviation in ten prospective studies was 1.61 (95%CI: 1.57-1.65) with area under receiver-operator curve (AUC) = 0.630 (95%CI: 0.628-0.651). The lifetime risk of overall breast cancer in the top centile of the PRSs was 32.6%. Compared with women in the middle quintile, those in the highest 1% of risk had 4.37- and 2.78-fold risks, and those in the lowest 1% of risk had 0.16- and 0.27-fold risks, of developing ER-positive and ER-negative disease, respectively. Goodness-of-fit tests indicated that this PRS was well calibrated and predicts disease risk accurately in the tails of the distribution. 
This PRS is a powerful and reliable predictor of breast cancer risk that may improve breast cancer prevention programs. Funders: Novartis; Eli Lilly and Company; AstraZeneca; AbbVie; Pfizer UK; Celgene; Eisai; Genentech; Merck Sharp and Dohme; Roche; Cancer Research UK; Government of Canada; Array BioPharma; Genome Canada; National Institutes of Health; European Commission; Ministère de l’Économie, de l’Innovation et des Exportations du Québec; Seventh Framework Programme; Canadian Institutes of Health Research
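
A polygenic risk score of the kind described above is usually computed as a weighted sum of risk-allele dosages. The sketch below shows that calculation and the standardization behind a "per 1 standard deviation" odds ratio. The function names, the three-SNP toy data, and the weights are hypothetical; the 313-SNP weights from the study are not reproduced here.

```python
import numpy as np

def polygenic_risk_score(dosages, log_odds_weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies) per individual.

    dosages:          (n_individuals, n_snps) array of risk-allele counts.
    log_odds_weights: (n_snps,) per-allele log odds ratios.
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(log_odds_weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_snps)")
    return dosages @ weights

def standardize(scores, reference_scores):
    """Express scores in SD units of a reference group (e.g. controls)."""
    reference_scores = np.asarray(reference_scores, dtype=float)
    return (np.asarray(scores, dtype=float) - reference_scores.mean()) / reference_scores.std(ddof=1)

# Toy example with made-up weights for three SNPs.
toy_dosages = [[0, 1, 2], [2, 0, 1], [1, 1, 1]]
toy_weights = [0.10, -0.20, 0.05]
raw = polygenic_risk_score(toy_dosages, toy_weights)
print(standardize(raw, raw))
```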

    Epigenetic Signatures of Cigarette Smoking

    BACKGROUND: DNA methylation leaves a long-term signature of smoking exposure and is one potential mechanism by which tobacco exposure predisposes to adverse health outcomes, such as cancers, osteoporosis, lung, and cardiovascular disorders. METHODS AND RESULTS: To comprehensively determine the association between cigarette smoking and DNA methylation, we conducted a meta-analysis of genome-wide DNA methylation assessed using the Illumina BeadChip 450K array on 15 907 blood-derived DNA samples from participants in 16 cohorts (including 2433 current, 6518 former, and 6956 never smokers). Comparing current versus never smokers, 2623 cytosine-phosphate-guanine sites (CpGs), annotated to 1405 genes, were statistically significantly differentially methylated at Bonferroni threshold of P<1×10^{-7} (18 760 CpGs at false discovery rate <0.05). Genes annotated to these CpGs were enriched for associations with several smoking-related traits in genome-wide studies including pulmonary function, cancers, inflammatory diseases, and heart disease. Comparing former versus never smokers, 185 of the CpGs that differed between current and never smokers were significant P<1×10^{-7} (2623 CpGs at false discovery rate <0.05), indicating a pattern of persistent altered methylation, with attenuation, after smoking cessation. Transcriptomic integration identified effects on gene expression at many differentially methylated CpGs. CONCLUSIONS: Cigarette smoking has a broad impact on genome-wide methylation that, at many loci, persists many years after smoking cessation. Many of the differentially methylated genes were novel genes with respect to biological effects of smoking and might represent therapeutic targets for prevention or treatment of tobacco-related diseases.
Methylation at these sites could also serve as sensitive and stable biomarkers of lifetime exposure to tobacco smoke. Funders: Biotechnology and Biological Sciences Research Council; British Heart Foundation; Cancer Research UK; Medical Research Council; National Institutes of Health; Royal Society; Wellcome Trust
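
The abstract above reports results at two multiple-testing thresholds, a Bonferroni cutoff and a false discovery rate below 0.05. The sketch below shows how the two criteria differ on a vector of p-values. It assumes the Benjamini-Hochberg step-up procedure for the FDR, which the abstract does not specify, and the function names are illustrative.

```python
import numpy as np

def bonferroni_significant(pvalues, alpha=0.05):
    """Flag tests with p below alpha divided by the number of tests."""
    p = np.asarray(pvalues, dtype=float)
    return p < alpha / p.size

def benjamini_hochberg_significant(pvalues, q=0.05):
    """Flag tests rejected by the Benjamini-Hochberg step-up procedure."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Compare the i-th smallest p-value with q * i / m.
    below = ranked <= q * np.arange(1, m + 1) / m
    significant = np.zeros(m, dtype=bool)
    if below.any():
        cutoff = int(np.max(np.nonzero(below)[0]))
        # Reject every hypothesis up to the largest rank that passes.
        significant[order[:cutoff + 1]] = True
    return significant
```

Bonferroni controls the chance of any false positive and is therefore stricter, which is consistent with the abstract listing far fewer CpGs at the Bonferroni threshold than at FDR <0.05.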

    Associations of obesity and circulating insulin and glucose with breast cancer risk: a Mendelian randomization analysis.

    BACKGROUND: In addition to the established association between general obesity and breast cancer risk, central obesity and circulating fasting insulin and glucose have been linked to the development of this common malignancy. Findings from previous studies, however, have been inconsistent, and the nature of the associations is unclear. METHODS: We conducted Mendelian randomization analyses to evaluate the association of breast cancer risk, using genetic instruments, with fasting insulin, fasting glucose, 2-h glucose, body mass index (BMI) and BMI-adjusted waist-hip-ratio (WHRadj BMI). We first confirmed the association of these instruments with type 2 diabetes risk in a large diabetes genome-wide association study consortium. We then investigated their associations with breast cancer risk using individual-level data obtained from 98 842 cases and 83 464 controls of European descent in the Breast Cancer Association Consortium. RESULTS: All sets of instruments were associated with risk of type 2 diabetes. Associations with breast cancer risk were found for genetically predicted fasting insulin [odds ratio (OR) = 1.71 per standard deviation (SD) increase, 95% confidence interval (CI) = 1.26-2.31, p = 5.09 × 10^{-4}], 2-h glucose (OR = 1.80 per SD increase, 95% CI = 1.30-2.49, p = 4.02 × 10^{-4}), BMI (OR = 0.70 per 5-unit increase, 95% CI = 0.65-0.76, p = 5.05 × 10^{-19}) and WHRadj BMI (OR = 0.85, 95% CI = 0.79-0.91, p = 9.22 × 10^{-6}). Stratified analyses showed that genetically predicted fasting insulin was more closely related to risk of estrogen-receptor [ER]-positive cancer, whereas the associations with instruments of 2-h glucose, BMI and WHRadj BMI were consistent regardless of age, menopausal status, estrogen receptor status and family history of breast cancer.
CONCLUSIONS: We confirmed the previously reported inverse association of genetically predicted BMI with breast cancer risk, and showed a positive association of genetically predicted fasting insulin and 2-h glucose and an inverse association of WHRadj BMI with breast cancer risk. Our study suggests that genetically determined obesity and glucose/insulin-related traits have an important role in the aetiology of breast cancer

    Association analysis identifies 65 new breast cancer risk loci

    Breast cancer risk is influenced by rare coding variants in susceptibility genes, such as BRCA1, and many common, mostly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. Here we report the results of a genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry. We identified 65 new loci that are associated with overall breast cancer risk at P < 5 × 10^{-8}. The majority of credible risk single-nucleotide polymorphisms in these loci fall in distal regulatory elements, and by integrating in silico data to predict target genes in breast cells at each locus, we demonstrate a strong overlap between candidate target genes and somatic driver genes in breast tumours. We also find that heritability of breast cancer due to all single-nucleotide polymorphisms in regulatory features was 2-5-fold enriched relative to the genome-wide average, with strong enrichment for particular transcription factor binding sites. These results provide further insight into genetic susceptibility to breast cancer and will improve the use of genetic risk scores for individualized screening and prevention. We thank all the individuals who took part in these studies and all the researchers, clinicians, technicians and administrative staff who have enabled this work to be carried out.
Genotyping of the OncoArray was principally funded from three sources: the PERSPECTIVE project, funded by the Government of Canada through Genome Canada and the Canadian Institutes of Health Research, the ‘Ministère de l’Économie, de la Science et de l’Innovation du Québec’ through Genome Québec, and the Quebec Breast Cancer Foundation; the NCI Genetic Associations and Mechanisms in Oncology (GAME-ON) initiative and Discovery, Biology and Risk of Inherited Variants in Breast Cancer (DRIVE) project (NIH Grants U19 CA148065 and X01HG007492); and Cancer Research UK (C1287/A10118 and C1287/A16563). BCAC is funded by Cancer Research UK (C1287/A16563), by the European Community’s Seventh Framework Programme under grant agreement 223175 (HEALTH-F2-2009-223175) (COGS) and by the European Union’s Horizon 2020 Research and Innovation Programme under grant agreements 633784 (B-CAST) and 634935 (BRIDGES). Genotyping of the iCOGS array was funded by the European Union (HEALTH-F2-2009-223175), Cancer Research UK (C1287/A10710), the Canadian Institutes of Health Research for the ‘CIHR Team in Familial Risks of Breast Cancer’ program, and the Ministry of Economic Development, Innovation and Export Trade of Quebec, grant PSR-SIIRI-701. Combining of the GWAS data was supported in part by The National Institute of Health (NIH) Cancer Post-Cancer GWAS initiative grant U19 CA 148065 (DRIVE, part of the GAME-ON initiative)